OpenCompass/opencompass/configs/datasets
Linchen Xiao b2da1c08a8
[Dataset] Add SmolInstruct, Update Chembench (#2025)
* Add dataset metadata

* update

2025-04-18 17:21:29 +08:00
adv_glue [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
agieval [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
aime2024 [Bug] Aime2024 config fix (#1974) 2025-03-25 17:57:11 +08:00
aime2025 [Update] Add configurations for llmjudge dataset (#1940) 2025-03-13 17:30:04 +08:00
anli [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
anthropics_evals [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
apps [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
ARC_c [Feature] Dataset prompts update for ARC, BoolQ, Race (#1527) 2024-09-13 10:30:43 +08:00
ARC_e [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
ARC_Prize_Public_Evaluation [Update] Add dataset configurations of no max_out_len (#1967) 2025-03-24 14:24:12 +08:00
babilong [Feature] BABILong Dataset added (#1684) 2024-11-14 15:32:43 +08:00
bbeh [Update] Add configurations for llmjudge dataset (#1940) 2025-03-13 17:30:04 +08:00
bbh [Feature] Add recommendation configs for datasets (#1937) 2025-03-25 14:54:13 +08:00
bigcodebench [Feature] Add recommendation configs for datasets (#1937) 2025-03-25 14:54:13 +08:00
calm [Feature] Support OpenAI ChatCompletion (#1389) 2024-08-01 19:10:13 +08:00
ceval [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
CHARM [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
ChemBench [Dataset] Add SmolInstruct, Update Chembench (#2025) 2025-04-18 17:21:29 +08:00
chinese_simpleqa [Refactor] Code refactoarization (#1831) 2025-01-20 19:17:38 +08:00
CIBench [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
civilcomments [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
ClimaQA [Feature] Add Datasets: ClimateQA,Physics (#2017) 2025-04-14 20:18:47 +08:00
clozeTest_maxmin [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
CLUE_afqmc [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
CLUE_C3 [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
CLUE_cmnli [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
CLUE_CMRC [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
CLUE_DRCD [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
CLUE_ocnli [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
cmb [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
cmmlu [Feature] Add recommendation configs for datasets (#1937) 2025-03-25 14:54:13 +08:00
cmo_fib [Update] Add dataset configurations of no max_out_len (#1967) 2025-03-24 14:24:12 +08:00
collections [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
commonsenseqa [Bug] Commonsenseqa dataset fix (#1425) 2024-08-16 15:54:07 +08:00
commonsenseqa_cn [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
compassbench_20_v1_1 [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
compassbench_20_v1_1_public [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
compassbench_v1_3 [Update] Compassbench v1.3 (#1396) 2024-08-12 19:09:19 +08:00
contamination [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
crowspairs [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
crowspairs_cn [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
cvalues [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
demo [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
dingo [Feature] Add dingo test (#1529) 2024-09-29 19:24:58 +08:00
drop [Feature] Add recommendation configs for datasets (#1937) 2025-03-25 14:54:13 +08:00
ds1000 [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
FewCLUE_bustm [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
FewCLUE_chid [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
FewCLUE_cluewsc [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
FewCLUE_csl [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
FewCLUE_eprstmt [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
FewCLUE_ocnli_fc [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
FewCLUE_tnews [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
FinanceIQ [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
flores [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
game24 [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
gaokao_math [Feature] Add GaoKaoMath Dataset for Evaluation & MATH Model Eval Config (#1589) 2024-10-12 19:13:06 +08:00
GaokaoBench [Update] Add dataset configurations of no max_out_len (#1967) 2025-03-24 14:24:12 +08:00
GLUE_CoLA [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
GLUE_MRPC [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
GLUE_QQP [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
govrepcrs [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
gpqa [Feature] Add recommendation configs for datasets (#1937) 2025-03-25 14:54:13 +08:00
gsm8k [Update] Add dataset configurations of no max_out_len (#1967) 2025-03-24 14:24:12 +08:00
gsm8k_contamination [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
gsm_hard [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
hellaswag [Feature] Add recommendation configs for datasets (#1937) 2025-03-25 14:54:13 +08:00
HLE [Feature] Add HLE (Humanity's Last Exam) dataset (#1902) 2025-03-04 16:42:37 +08:00
humaneval [Feature] Add recommendation configs for datasets (#1937) 2025-03-25 14:54:13 +08:00
humaneval_cn [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
humaneval_multi [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
humaneval_plus [Update] Update Fullbench (#1712) 2024-11-26 14:26:55 +08:00
humanevalx [Update] Update dataset configuration with no max_out_len (#1754) 2024-12-11 18:20:29 +08:00
hungarian_exam [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
IFEval [Feature] Add recommendation configs for datasets (#1937) 2025-03-25 14:54:13 +08:00
inference_ppl [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
infinitebench [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
iwslt2017 [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
jigsawmultilingual [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
kaoshi [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
korbench [Feature] Add recommendation configs for datasets (#1937) 2025-03-25 14:54:13 +08:00
lambada [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
lawbench [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
LCBench [Update] Update Skywork/Qwen-QwQ (#1728) 2024-12-05 19:30:43 +08:00
lcsts [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
leval [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
livecodebench [Feature] Add recommendation configs for datasets (#1937) 2025-03-25 14:54:13 +08:00
livemathbench [Dataset] Add SmolInstruct, Update Chembench (#2025) 2025-04-18 17:21:29 +08:00
livereasonbench [Refactor] Code refactoarization (#1831) 2025-01-20 19:17:38 +08:00
livestembench [Refactor] Code refactoarization (#1831) 2025-01-20 19:17:38 +08:00
llm_compression [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
longbench [Feature] Longbench dataset update 2024-09-06 15:50:12 +08:00
longbenchv2 [Feature] Add Longbenchv2 support (#1801) 2025-01-03 12:04:29 +08:00
lveval [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
mastermath2024v1 [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
math [Fix] OpenICL Math Evaluator Config (#2007) 2025-04-08 14:38:35 +08:00
math401 [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
MathBench [Update] Add dataset configurations of no max_out_len (#1967) 2025-03-24 14:24:12 +08:00
mbpp [Update] Update Fullbench (#1712) 2024-11-26 14:26:55 +08:00
mbpp_cn [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
mbpp_plus [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
MedBench [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
MedXpertQA [Dataset] Add MedXpertQA (#2002) 2025-04-08 10:44:48 +08:00
mgsm [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
mmlu [Feature] Add recommendation configs for datasets (#1937) 2025-03-25 14:54:13 +08:00
mmlu_cf [Feature] Support MMLU-CF Benchmark (#1775) 2025-01-09 14:11:20 +08:00
mmlu_pro [Feature] Add recommendation configs for datasets (#1937) 2025-03-25 14:54:13 +08:00
MMLUArabic [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
mmmlu [Refactor] Code refactoarization (#1831) 2025-01-20 19:17:38 +08:00
mmmlu_lite [Update] Update mmmlu_lite dataload (#1658) 2024-11-01 17:32:29 +08:00
multipl_e [Feature] Add MultiPL-E & Code Evaluator (#1963) 2025-03-21 20:09:25 +08:00
musr [Feature] Add recommendation configs for datasets (#1937) 2025-03-25 14:54:13 +08:00
narrativeqa [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
needlebench [Feature] Add long context evaluation for base models (#1666) 2024-11-08 10:53:29 +08:00
NPHardEval [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
nq [Update] Add dataset configurations of no max_out_len (#1967) 2025-03-24 14:24:12 +08:00
nq_cn [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
obqa [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
OlymMATH [Feature] Add olymmath dataset (#1982) 2025-04-02 17:34:07 +08:00
OlympiadBench [Update] Support OlympiadBench-Math/OmniMath/LiveMathBench-Hard (#1899) 2025-03-03 18:56:11 +08:00
omni_math [Update] Support OlympiadBench-Math/OmniMath/LiveMathBench-Hard (#1899) 2025-03-03 18:56:11 +08:00
OpenFinData [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
PHYSICS [Feature] Add Datasets: ClimateQA,Physics (#2017) 2025-04-14 20:18:47 +08:00
piqa [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
PJExam [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
PMMEval [Refactor] Code refactoarization (#1831) 2025-01-20 19:17:38 +08:00
promptbench [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
py150 [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
qabench [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
qasper [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
qaspercut [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
QuALITY [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
race [Feature] Dataset prompts update for ARC, BoolQ, Race (#1527) 2024-09-13 10:30:43 +08:00
realtoxicprompts [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
rolebench [Feature] Add abbr for rolebench dataset (#1431) 2024-08-20 11:22:48 +08:00
ruler [Update] Customizable tokenizer for RULER (#1731) 2024-12-19 18:02:11 +08:00
s3eval [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
safety [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
scibench [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
scicode [Update] Add dataset configurations of no max_out_len (#1967) 2025-03-24 14:24:12 +08:00
SimpleQA [Feature] Add Openai Simpleqa dataset (#1720) 2024-11-28 19:16:07 +08:00
siqa [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
SmolInstruct [Dataset] Add SmolInstruct, Update Chembench (#2025) 2025-04-18 17:21:29 +08:00
squad20 [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
storycloze [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
strategyqa [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
subjective [Feature] Support subjective evaluation for reasoning model (#1868) 2025-02-20 12:19:46 +08:00
summedits [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
summscreen [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
SuperGLUE_AX_b [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
SuperGLUE_AX_g [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
SuperGLUE_BoolQ [Feature] Dataset prompts update for ARC, BoolQ, Race (#1527) 2024-09-13 10:30:43 +08:00
SuperGLUE_CB [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
SuperGLUE_COPA [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
SuperGLUE_MultiRC [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
SuperGLUE_ReCoRD [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
SuperGLUE_RTE [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
SuperGLUE_WiC [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
SuperGLUE_WSC [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
supergpqa [Update] Add SuperGPQA subset metrics (#1966) 2025-03-24 14:25:12 +08:00
SVAMP [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
TabMWP [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
taco [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
teval [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
TheoremQA [Update] Add 0shot CoT config for TheoremQA (#1783) 2024-12-27 16:17:27 +08:00
triviaqa [Update] Add dataset configurations of no max_out_len (#1967) 2025-03-24 14:24:12 +08:00
triviaqarc [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
truthfulqa [Fix] Update SciCode and Gemma model (#1449) 2024-08-23 10:42:27 +08:00
tydiqa [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
wikibench [Fix] the automatically download for several datasets (#1652) 2024-11-01 15:57:18 +08:00
wikitext [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
winograd [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
winogrande [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
XCOPA [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
xiezhi [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
XLSum [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
Xsum [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00