OpenCompass

mirror of https://github.com/open-compass/opencompass.git synced 2025-05-30 16:03:24 +08:00

Author	SHA1	Message	Date
Peng Bo	07c96ac659	Calm dataset (#1385 ) * Add CALM Dataset	2024-08-01 10:03:21 +08:00
Songyang Zhang	704853e5e7	[Feature] Update pip install (#1324 ) * [Feature] Update pip install * Update Configuration * Update * Update * Update * Update Internal Config * Update collect env	2024-07-29 18:32:50 +08:00
Xingjun.Wang	edab1c07ba	[Feature] Support ModelScope datasets (#1289 ) * add ceval, gsm8k modelscope support * update race, mmlu, arc, cmmlu, commonsenseqa, humaneval and unittest * update bbh, flores, obqa, siqa, storycloze, summedits, winogrande, xsum datasets * format file * format file * update dataset format * support ms_dataset * update dataset for modelscope support * merge myl_dev and update test_ms_dataset * update dataset for modelscope support * update readme * update eval_api_zhipu_v2 * remove unused code * add get_data_path function * update readme * remove tydiqa japanese subset * update util * remove .DS_Store * fix md format * move util into package * update docs/get_started.md * restore eval_api_zhipu_v2.py, add environment setting * Update dataset * Update * Update * Update * Update --------- Co-authored-by: Yun lin <yunlin@U-Q9X2K4QV-1904.local> Co-authored-by: Yunnglin <mao.looper@qq.com> Co-authored-by: Yun lin <yunlin@laptop.local> Co-authored-by: Yunnglin <maoyl@smail.nju.edu.cn> Co-authored-by: zhangsongyang <zhangsongyang@pjlab.org.cn>	2024-07-29 13:48:32 +08:00
bittersweet1999	d3782c1d47	Revert "Calm dataset (#1287 )" (#1366 ) This reverts commit `edd0ffdf70`.	2024-07-26 18:27:29 +08:00
Peng Bo	edd0ffdf70	Calm dataset (#1287 ) * add calm dataset * modify config max_out_len * update README * Modify README * update README * update README * update README * update README * update README * add summarizer and modify readme * delete summarizer config comment * update summarizer * modify same response to all questions * update README	2024-07-26 11:48:16 +08:00
Que Haoran	a244453d9e	[Feature] Support inference ppl datasets (#1315 ) * commit inference ppl datasets * revised format * revise * revise * revise * revise * revise * revise	2024-07-22 17:59:30 +08:00
Fengzhe Zhou	a32f21a356	[Sync] Sync with internal codes 2024.06.28 (#1279 )	2024-06-28 14:16:34 +08:00
jxd	608ff5810d	support CHARM (https://github.com/opendatalab/CHARM ) reasoning tasks (#1190 ) * support CHARM (https://github.com/opendatalab/CHARM) reasoning tasks * fix lint error * add dataset card for CHARM * minor refactor * add txt --------- Co-authored-by: wujiang <wujiang@pjlab.org.cn> Co-authored-by: Leymore <zfz-960727@163.com>	2024-05-27 13:48:22 +08:00
bittersweet1999	826d8307ac	fix links (#1120 )	2024-05-08 15:13:18 +08:00
JuhaoLiang	d2c40e5648	[Feature] Add AceGPT-MMLUArabic benchmark (#1099 ) * add AceGPT-MMLUArabic benchmark * update readme and fix lint issue * remove unused package * add MMLUArabic zero-shot settings * rename filename and update readme	2024-05-08 15:00:26 +08:00
Yggdrasill7D6	af10ecc272	add mgsm datasets (#1081 ) * add mgsm datasets * fix lint * fix lint * update mgsm * update mgsm * ease code spell * update * update * update --------- Co-authored-by: Leymore <zfz-960727@163.com>	2024-05-06 15:29:34 +08:00
klein	153c4fc988	[Feature] update drop dataset from openai simple eval (#1092 ) * [Feature] update drop dataset from openai simple eval * update drop template presentation * update --------- Co-authored-by: Leymore <zfz-960727@163.com>	2024-05-06 13:37:08 +08:00
Alexander Lam	35c94d0cde	[Feature] Adding support for LLM Compression Evaluation (#1108 ) * fixed formatting based on pre-commit tests * fixed typo in comments; reduced the number of models in the eval config * fixed a bug in LLMCompressionDataset, where setting samples=None would result in passing test[:None] to load_dataset * removed unnecessary variable in _format_table_pivot; changed lark_reporter message to English	2024-04-30 10:51:01 +08:00
Yggdrasill7D6	58a57a4c45	[Feature] add support for Flames datasets (#1093 ) * add flames datasets * fix lint * rm quota * add judgemodel info and fix os path * support flames dataset * support flames dataset --------- Co-authored-by: bittersweet1999 <1487910649@qq.com>	2024-04-28 18:56:24 +08:00
liuwei130	a00e57296f	[Feature] Add ChemBench (#1032 ) * add ChemBench * update results * molbench -> ChemBench --------- Co-authored-by: Leymore <zfz-960727@163.com>	2024-04-12 08:46:26 +08:00
Fengzhe Zhou	b39f501563	[Sync] update taco (#1030 )	2024-04-09 17:50:23 +08:00
Jingming	89a8a8917b	[Feature] Add the implementation of QuALITY datasets (#976 )	2024-03-15 21:22:38 +08:00
yuantao2108	bbec7d8733	[Feature] add lveval benchmark (#914 ) * add lveval benchmark * add LVEval readme file * update LVEval readme file * Update configs/eval_bluelm_32k_lveval.py * Update configs/eval_llama2_7b_lveval.py --------- Co-authored-by: yuantao <yuantao@infini-ai.com> Co-authored-by: Mo Li <82895469+DseidLi@users.noreply.github.com>	2024-03-04 11:22:03 +08:00
Skyfall-xzz	4c45a71bbc	[Feature] Support OpenFinData (#896 ) * [Feature] Support OpenFinData * add README for OpenFinData * update README	2024-02-29 12:55:07 +08:00
bittersweet1999	45c606bcd0	[Fix] Fix IFEval (#906 ) * fix ifeval * fix ifeval * fix ifeval * fix ifeval	2024-02-22 16:51:34 +08:00
Fengzhe Zhou	d34ba11106	[Sync] Merge branch 'dev' into zfz/update-keyset-demo (#876 )	2024-02-05 23:29:10 +08:00
Skyfall-xzz	7ad1168062	Support NPHardEval (#835 ) * support NPHardEval * add .md file and fix minor bugs * refactor and minor fix --------- Co-authored-by: Leymore <zfz-960727@163.com>	2024-02-05 15:52:28 +08:00
Fengzhe Zhou	0991dd33a0	[Sync] Update dataset cfg for internMath (#837 ) Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>	2024-01-24 16:30:32 +08:00
Jingming	e059a5c2bf	[Feature] Add IFEval (#813 ) * [Feature] Add IFEval * [Doc] add introduction of IFEval	2024-01-23 20:07:49 +08:00
bittersweet1999	814b3f73bd	reorganize subject files (#801 )	2024-01-16 18:03:11 +08:00
Fengzhe Zhou	32f40a8f83	[Sync] Sync with internal codes 2023.01.08 (#777 )	2024-01-08 14:07:24 +00:00
bittersweet1999	2163f9398f	[Feature] add subject ir dataset (#755 ) * add subject ir * Add ir dataset * Add ir dataset	2024-01-05 12:00:57 +00:00
bittersweet1999	be369c3e06	[Feature] Add multi_round dataset evaluation (#766 ) * multi_round dataset * add multi_round evaluation	2024-01-04 10:37:52 +00:00
Francis-llgg	b69fe2343b	[Feature] Add GPQA Dataset (#729 ) * check * message * add * change prompt * change a para name * modify name of the file * delete a useless file	2024-01-01 15:54:40 +08:00
Francis-llgg	ef3ae63539	[Feature] Add new dataset mastermath2024v1 (#744 ) * add new dataset mastermath2024v1 * change it to simplified chinese prompt * change file name	2024-01-01 15:53:24 +08:00
bittersweet1999	fe0b717033	add creationbench (#753 )	2023-12-29 10:03:44 +00:00
philipwangOvO	34561ececb	[Feature] Add InfiniteBench (#739 ) * add InfiniteBench * add InfiniteBench --------- Co-authored-by: wangchonghua <wangchonghua@pjlab.org.cn>	2023-12-26 15:36:27 +08:00
Fengzhe Zhou	3a68083ecc	[Sync] update configs (#734 )	2023-12-25 21:59:16 +08:00
Skyfall-xzz	b35d991786	[Feature] Add ReasonBench(Internal) dataset (#577 ) * [Feature] Add reasonbench dataset * add configs for supporting generative inference & merge datasets in the same category * modify config filename to prompt version * fix codes to meet pre-commit requirements * lint the code to meet pre-commit requirements * Align Load_data Sourcecode Briefly * fix bugs * reduce code redundancy	2023-12-20 17:57:42 +08:00
bittersweet1999	1fe152b3e8	[Feature] Support AlignmentBench infer and judge (#697 ) * alignmentbench infer and judge * alignmentbench * alignmentbench done * alignment all done * alignment all done	2023-12-13 19:59:30 +08:00
bittersweet1999	465308e430	[Feature] Add Subjective Evaluation (#680 ) * new version of subject * fixed draw * fixed draw * fixed draw * done * done * done * done * fixed lint	2023-12-11 22:22:11 +08:00
Xiaoming Shi	1bf85949ef	[Feature] Add medbench (#678 ) * update medbench * medbench update * format medbench * format --------- Co-authored-by: 施晓明 <PJLAB\shixiaoming@pjnl104220118l.pjlab.org> Co-authored-by: Leymore <zfz-960727@163.com>	2023-12-09 16:05:46 +08:00
bittersweet1999	1c95790fdd	New subjective judgement (#660 ) * TabMWP * TabMWP * fixed * fixed * fixed * done * done * done * add new subjective judgement * add new subjective judgement * add new subjective judgement * add new subjective judgement * add new subjective judgement * modified to a more general way * modified to a more general way * final * final * add summarizer * add new summarize * fixed * fixed * fixed --------- Co-authored-by: caomaosong <caomaosong@pjlab.org.cn>	2023-12-06 13:28:33 +08:00
liushz	a331c9abfd	[Feature] Add wikibench dataset (#655 ) * Add WikiBench * Add WikiBench * format --------- Co-authored-by: Leymore <zfz-960727@163.com>	2023-12-01 14:56:54 +08:00
liushz	e019c831fe	[Feature] Add Chinese version: commonsenseqa, crowspairs and nq (#144 ) * add Chinese version: csqa crowspairs nq * Update cn_data * Update cn_data * update format --------- Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn> Co-authored-by: Leymore <zfz-960727@163.com>	2023-11-30 15:33:02 +08:00
liushz	6d0d78986c	[Feature] Add GSM_Hard dataset (#619 ) * Add SVAMP dataset * Add SVAMP dataset * Add SVAMP dataset * Add gsm_hard dataset * Add gsm_hard dataset * format --------- Co-authored-by: Leymore <zfz-960727@163.com>	2023-11-27 17:40:34 +08:00
Fengzhe Zhou	d949e3c003	[Feature] Add circular eval (#610 ) * refactor default, add circular summarizer * add circular * update impl * update doc * minor update * no more to be added	2023-11-23 16:45:47 +08:00
liushz	048775192b	[Feature] Add SVAMP dataset (#604 ) * Add SVAMP dataset * Add SVAMP dataset * Add SVAMP dataset	2023-11-22 14:54:39 +08:00
Raymond Zhang	c0acd06b05	[Feature] Add FinanceIQ dataset (#596 )	2023-11-16 17:47:57 +08:00
Wei Jueqi	14e6fe6f13	Fix bugs in subjective evaluation (#589 ) * rename * fix sub bugs and update docs * update * update	2023-11-14 16:11:55 +08:00
jingmingzhuo	b3cbef3226	[Feature] Add py150 and maxmin (#562 ) * [feat] add clozeTest_maxmin dataset * [feat] add py150 datasets * [feat] change __init__.py in opencompass/datasets * [fix] pre-commit check * [fix] rename py150 and maxmin datasets in configs * [feat] add gen.py of py150 and maxmin in configs/datasets	2023-11-09 22:05:25 +08:00
Hubert	bb2ecf416e	[Feat] Support cibench (#538 ) * [Feat] support cidataset * [Feat] support cidataset * [Feat] support cidataset * [Feat] support cidataset * minor fix * minor fix * minor fix * minor fix * minor fix * minor fix * rename cibench * rename cibench * rename cibench * rename cibench * minor fix * minor fix * minor fix	2023-11-07 19:11:44 +08:00
bittersweet1999	f25a980043	[Feat] Add an opensource dataset Tabmwp (#505 ) * TabMWP * TabMWP * fixed * fixed * fixed * done * done * done --------- Co-authored-by: caomaosong <caomaosong@pjlab.org.cn>	2023-11-03 11:15:46 +08:00
liushz	2737249f31	[Feature] Add mathbench dataset and circular evaluator (#408 ) * add_mathbench * update mathbench * support non circular eval dataset --------- Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn> Co-authored-by: yingfhu <yingfhu@gmail.com>	2023-10-18 04:08:31 -05:00
Leymore	861942ab1b	[Feature] Add lawbench (#460 ) * add lawbench * update requirements * update	2023-10-13 06:51:36 -05:00

67 Commits