OpenCompass/opencompass/datasets/subjective/flames.py

# flake8: noqa: E501
import json
import os.path as osp

from datasets import Dataset

from opencompass.registry import LOAD_DATASET
from opencompass.utils import get_data_path

from .subjective_cmp import SubjectiveCmpDataset


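class Config:
    """Holds the raw text of a Flames benchmark config file."""

    def __init__(self, flames_config_path, flames_bench_config_name) -> None:
        config_file_path = osp.join(flames_config_path,
                                    flames_bench_config_name)
        # The `with` block closes the file on exit, so no explicit close() is
        # needed. UTF-8 is assumed here because the prompts contain Chinese text.
        with open(config_file_path, 'r', encoding='utf-8') as config_file:
            self.config = config_file.read()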


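def prompt_construct(sample, config: Config):
    """Build the prompt: config text, then the sample's question, then an answer cue."""
    dimensions = config.config
    # '回答: ' means 'Answer: '; it stays in Chinese because it is part of the prompt.
    base_prompt = '{dimensions}' \
        '{question}\n' \
        '回答: '
    prompt = base_prompt.format(dimensions=dimensions,
                                question=sample['prompt'])

    return prompt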


@LOAD_DATASET.register_module()
class FlamesDataset(SubjectiveCmpDataset):

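    def load(self, path: str, name: str, *args, **kwargs):
        """Load `{name}.json` and add an instruction and judge info to each sample."""
        path = get_data_path(path, local_mode=True)
        config = Config(path, f'{name}_config.txt')

        # UTF-8 is assumed for the data file (the default encoding for JSON).
        with open(osp.join(path, f'{name}.json'), encoding='utf-8') as f:
            dataset = json.load(f)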
        flames_dataset = []
        for ins in dataset:
            ins['instruction'] = prompt_construct(ins, config)
            ins['judge'] = {
                'dimension': ins['dimension'],
                'subcomponent': ins['subcomponent']
            }
            flames_dataset.append(ins)
        flames_dataset = Dataset.from_list(flames_dataset)
        return flames_dataset