Peng Bo
07c96ac659
Calm dataset ( #1385 )
...
* Add CALM Dataset
2024-08-01 10:03:21 +08:00
Songyang Zhang
704853e5e7
[Feature] Update pip install ( #1324 )
...
* [Feature] Update pip install
* Update Configuration
* Update
* Update
* Update
* Update Internal Config
* Update collect env
2024-07-29 18:32:50 +08:00
Xingjun.Wang
edab1c07ba
[Feature] Support ModelScope datasets ( #1289 )
...
* add ceval, gsm8k modelscope support
* update race, mmlu, arc, cmmlu, commonsenseqa, humaneval and unittest
* update bbh, flores, obqa, siqa, storycloze, summedits, winogrande, xsum datasets
* format file
* format file
* update dataset format
* support ms_dataset
* update dataset for modelscope support
* merge myl_dev and update test_ms_dataset
* update dataset for modelscope support
* update readme
* update eval_api_zhipu_v2
* remove unused code
* add get_data_path function
* update readme
* remove tydiqa japanese subset
* add ceval, gsm8k modelscope support
* update race, mmlu, arc, cmmlu, commonsenseqa, humaneval and unittest
* update bbh, flores, obqa, siqa, storycloze, summedits, winogrande, xsum datasets
* format file
* format file
* update dataset format
* support ms_dataset
* update dataset for modelscope support
* merge myl_dev and update test_ms_dataset
* update readme
* update dataset for modelscope support
* update eval_api_zhipu_v2
* remove unused code
* add get_data_path function
* remove tydiqa japanese subset
* update util
* remove .DS_Store
* fix md format
* move util into package
* update docs/get_started.md
* restore eval_api_zhipu_v2.py, add environment setting
* Update dataset
* Update
* Update
* Update
* Update
---------
Co-authored-by: Yun lin <yunlin@U-Q9X2K4QV-1904.local>
Co-authored-by: Yunnglin <mao.looper@qq.com>
Co-authored-by: Yun lin <yunlin@laptop.local>
Co-authored-by: Yunnglin <maoyl@smail.nju.edu.cn>
Co-authored-by: zhangsongyang <zhangsongyang@pjlab.org.cn>
2024-07-29 13:48:32 +08:00
bittersweet1999
d3782c1d47
Revert "Calm dataset ( #1287 )" ( #1366 )
...
This reverts commit edd0ffdf70.
2024-07-26 18:27:29 +08:00
Peng Bo
edd0ffdf70
Calm dataset ( #1287 )
...
* add calm dataset
* modify config max_out_len
* update README
* Modify README
* update README
* update README
* update README
* update README
* update README
* add summarizer and modify readme
* delete summarizer config comment
* update summarizer
* modify same response to all questions
* update README
2024-07-26 11:48:16 +08:00
Que Haoran
a244453d9e
[Feature] Support inference ppl datasets ( #1315 )
...
* commit inference ppl datasets
* revised format
* revise
* revise
* revise
* revise
* revise
* revise
2024-07-22 17:59:30 +08:00
Fengzhe Zhou
a32f21a356
[Sync] Sync with internal codes 2024.06.28 ( #1279 )
2024-06-28 14:16:34 +08:00
jxd
608ff5810d
support CHARM ( https://github.com/opendatalab/CHARM ) reasoning tasks ( #1190 )
...
* support CHARM (https://github.com/opendatalab/CHARM) reasoning tasks
* fix lint error
* add dataset card for CHARM
* minor refactor
* add txt
---------
Co-authored-by: wujiang <wujiang@pjlab.org.cn>
Co-authored-by: Leymore <zfz-960727@163.com>
2024-05-27 13:48:22 +08:00
bittersweet1999
826d8307ac
fix links ( #1120 )
2024-05-08 15:13:18 +08:00
JuhaoLiang
d2c40e5648
[Feature] Add AceGPT-MMLUArabic benchmark ( #1099 )
...
* add AceGPT-MMLUArabic benchmark
* update readme and fix lint issue
* remove unused package
* add MMLUArabic zero-shot settings
* rename filename and update readme
2024-05-08 15:00:26 +08:00
Yggdrasill7D6
af10ecc272
add mgsm datasets ( #1081 )
...
* add mgsm datasets
* fix lint
* fix lint
* update mgsm
* update mgsm
* ease code spell
* update
* update
* update
---------
Co-authored-by: Leymore <zfz-960727@163.com>
2024-05-06 15:29:34 +08:00
klein
153c4fc988
[Feature] update drop dataset from openai simple eval ( #1092 )
...
* [Feature] update drop dataset from openai simple eval
* update drop template presentation
* update
---------
Co-authored-by: Leymore <zfz-960727@163.com>
2024-05-06 13:37:08 +08:00
Alexander Lam
35c94d0cde
[Feature] Adding support for LLM Compression Evaluation ( #1108 )
...
* fixed formatting based on pre-commit tests
* fixed typo in comments; reduced the number of models in the eval config
* fixed a bug in LLMCompressionDataset, where setting samples=None would result in passing test[:None] to load_dataset
* removed unnecessary variable in _format_table_pivot; changed lark_reporter message to English
2024-04-30 10:51:01 +08:00
Yggdrasill7D6
58a57a4c45
[Feature] add support for Flames datasets ( #1093 )
...
* add flames datasets
* fix lint
* rm quota
* add judgemodel info and fix os path
* support flames dataset
* support flames dataset
---------
Co-authored-by: bittersweet1999 <1487910649@qq.com>
2024-04-28 18:56:24 +08:00
liuwei130
a00e57296f
[Feature] Add ChemBench ( #1032 )
...
* add ChemBench
* update results
* molbench -> ChemBench
---------
Co-authored-by: Leymore <zfz-960727@163.com>
2024-04-12 08:46:26 +08:00
Fengzhe Zhou
b39f501563
[Sync] update taco ( #1030 )
2024-04-09 17:50:23 +08:00
Jingming
89a8a8917b
[Feature] Add the implementation of QuALITY datasets ( #976 )
...
#976
2024-03-15 21:22:38 +08:00
yuantao2108
bbec7d8733
[Feature] add lveval benchmark ( #914 )
...
* add lveval benchmark
* add LVEval readme file
* update LVEval readme file
* Update configs/eval_bluelm_32k_lveval.py
* Update configs/eval_llama2_7b_lveval.py
---------
Co-authored-by: yuantao <yuantao@infini-ai.com>
Co-authored-by: Mo Li <82895469+DseidLi@users.noreply.github.com>
2024-03-04 11:22:03 +08:00
Skyfall-xzz
4c45a71bbc
[Feature] Support OpenFinData ( #896 )
...
* [Feature] Support OpenFinData
* add README for OpenFinData
* update README
2024-02-29 12:55:07 +08:00
bittersweet1999
45c606bcd0
[Fix] Fix IFEval ( #906 )
...
* fix ifeval
* fix ifeval
* fix ifeval
* fix ifeval
2024-02-22 16:51:34 +08:00
Fengzhe Zhou
d34ba11106
[Sync] Merge branch 'dev' into zfz/update-keyset-demo ( #876 )
2024-02-05 23:29:10 +08:00
Skyfall-xzz
7ad1168062
Support NPHardEval ( #835 )
...
* support NPHardEval
* add .md file and fix minor bugs
* refactor and minor fix
---------
Co-authored-by: Leymore <zfz-960727@163.com>
2024-02-05 15:52:28 +08:00
Fengzhe Zhou
0991dd33a0
[Sync] Update dataset cfg for internMath ( #837 )
...
Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>
2024-01-24 16:30:32 +08:00
Jingming
e059a5c2bf
[Feature] Add IFEval ( #813 )
...
* [Feature] Add IFEval
* [Doc] add introduction of IFEval
2024-01-23 20:07:49 +08:00
bittersweet1999
814b3f73bd
reorganize subject files ( #801 )
2024-01-16 18:03:11 +08:00
Fengzhe Zhou
32f40a8f83
[Sync] Sync with internal codes 2023.01.08 ( #777 )
2024-01-08 14:07:24 +00:00
bittersweet1999
2163f9398f
[Feature] add subject ir dataset ( #755 )
...
* add subject ir
* Add ir dataset
* Add ir dataset
2024-01-05 12:00:57 +00:00
bittersweet1999
be369c3e06
[Feature] Add multi_round dataset evaluation ( #766 )
...
* multi_round dataset
* add multi_round evaluation
2024-01-04 10:37:52 +00:00
Francis-llgg
b69fe2343b
[Feature] Add GPQA Dataset ( #729 )
...
* check
* message
* add
* change prompt
* change a para name
* modify name of the file
* delete a useless file
2024-01-01 15:54:40 +08:00
Francis-llgg
ef3ae63539
[Feature] Add new dataset mastermath2024v1 ( #744 )
...
* add new dataset mastermath2024v1
* change it to simplified chinese prompt
* change file name
2024-01-01 15:53:24 +08:00
bittersweet1999
fe0b717033
add creationbench ( #753 )
2023-12-29 10:03:44 +00:00
philipwangOvO
34561ececb
[Feature] Add InfiniteBench ( #739 )
...
* add InfiniteBench
* add InfiniteBench
---------
Co-authored-by: wangchonghua <wangchonghua@pjlab.org.cn>
2023-12-26 15:36:27 +08:00
Fengzhe Zhou
3a68083ecc
[Sync] update configs ( #734 )
2023-12-25 21:59:16 +08:00
Skyfall-xzz
b35d991786
[Feature] Add ReasonBench(Internal) dataset ( #577 )
...
* [Feature] Add reasonbench dataset
* add configs for supporting generative inference & merge datasets in the same category
* modify config filename to prompt version
* fix codes to meet pre-commit requirements
* lint the code to meet pre-commit requirements
* Align Load_data Sourcecode Briefly
* fix bugs
* reduce code redundancy
2023-12-20 17:57:42 +08:00
bittersweet1999
1fe152b3e8
[Feature] Support AlignmentBench infer and judge ( #697 )
...
* alignmentbench infer and judge
* alignmentbench
* alignmentbench done
* alignment all done
* alignment all done
2023-12-13 19:59:30 +08:00
bittersweet1999
465308e430
[Feature] Add Subjective Evaluation ( #680 )
...
* new version of subject
* fixed draw
* fixed draw
* fixed draw
* done
* done
* done
* done
* fixed lint
2023-12-11 22:22:11 +08:00
Xiaoming Shi
1bf85949ef
[Feature] Add medbench ( #678 )
...
* update medbench
* medbench update
* format medbench
* format
---------
Co-authored-by: 施晓明 <PJLAB\shixiaoming@pjnl104220118l.pjlab.org>
Co-authored-by: Leymore <zfz-960727@163.com>
2023-12-09 16:05:46 +08:00
bittersweet1999
1c95790fdd
New subjective judgement ( #660 )
...
* TabMWP
* TabMWP
* fixed
* fixed
* fixed
* done
* done
* done
* add new subjective judgement
* add new subjective judgement
* add new subjective judgement
* add new subjective judgement
* add new subjective judgement
* modified to a more general way
* modified to a more general way
* final
* final
* add summarizer
* add new summarize
* fixed
* fixed
* fixed
---------
Co-authored-by: caomaosong <caomaosong@pjlab.org.cn>
2023-12-06 13:28:33 +08:00
liushz
a331c9abfd
[Feature] Add wikibench dataset ( #655 )
...
* Add WikiBench
* Add WikiBench
* format
---------
Co-authored-by: Leymore <zfz-960727@163.com>
2023-12-01 14:56:54 +08:00
liushz
e019c831fe
[Feature] Add Chinese version: commonsenseqa, crowspairs and nq ( #144 )
...
* add Chinese version: csqa crowspairs nq
* Update cn_data
* Update cn_data
* update format
---------
Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>
Co-authored-by: Leymore <zfz-960727@163.com>
2023-11-30 15:33:02 +08:00
liushz
6d0d78986c
[Feature] Add GSM_Hard dataset ( #619 )
...
* Add SVAMP dataset
* Add SVAMP dataset
* Add SVAMP dataset
* Add gsm_hard dataset
* Add gsm_hard dataset
* format
---------
Co-authored-by: Leymore <zfz-960727@163.com>
2023-11-27 17:40:34 +08:00
Fengzhe Zhou
d949e3c003
[Feature] Add circular eval ( #610 )
...
* refactor default, add circular summarizer
* add circular
* update impl
* update doc
* minor update
* no more to be added
2023-11-23 16:45:47 +08:00
liushz
048775192b
[Feature] Add SVAMP dataset ( #604 )
...
* Add SVAMP dataset
* Add SVAMP dataset
* Add SVAMP dataset
2023-11-22 14:54:39 +08:00
Raymond Zhang
c0acd06b05
[Feature] Add FinanceIQ dataset ( #596 )
2023-11-16 17:47:57 +08:00
Wei Jueqi
14e6fe6f13
Fix bugs in subjective evaluation ( #589 )
...
* rename
* fix sub bugs and update docs
* update
* update
2023-11-14 16:11:55 +08:00
jingmingzhuo
b3cbef3226
[Feature] Add py150 and maxmin ( #562 )
...
* [feat] add clozeTest_maxmin dataset
* [feat] add py150 datasets
* [feat] change __init__.py in opencompass/datasets
* [fix] pre-commit check
* [fix] rename py150 and maxmin datasets in configs
* [feat] add gen.py of py150 and maxmin in configs/datasets
2023-11-09 22:05:25 +08:00
Hubert
bb2ecf416e
[Feat] Support cibench ( #538 )
...
* [Feat] support cidataset
* [Feat] support cidataset
* [Feat] support cidataset
* [Feat] support cidataset
* minor fix
* minor fix
* minor fix
* minor fix
* minor fix
* minor fix
* rename cibench
* rename cibench
* rename cibench
* rename cibench
* minor fix
* minor fix
* minor fix
2023-11-07 19:11:44 +08:00
bittersweet1999
f25a980043
[Feat] Add an opensource dataset Tabmwp ( #505 )
...
* TabMWP
* TabMWP
* fixed
* fixed
* fixed
* done
* done
* done
---------
Co-authored-by: caomaosong <caomaosong@pjlab.org.cn>
2023-11-03 11:15:46 +08:00
liushz
2737249f31
[Feature] Add mathbench dataset and circular evaluator ( #408 )
...
* add_mathbench
* update mathbench
* support non circular eval dataset
---------
Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>
Co-authored-by: yingfhu <yingfhu@gmail.com>
2023-10-18 04:08:31 -05:00
Leymore
861942ab1b
[Feature] Add lawbench ( #460 )
...
* add lawbench
* update requirements
* update
2023-10-13 06:51:36 -05:00