agieval
|
[Sync] update (#517)
|
2023-10-27 20:31:22 +08:00 |
calm
|
Calm dataset (#1287)
|
2024-07-26 11:48:16 +08:00 |
IFEval
|
Fix the issue that opencompass.cli.main cannot be imported after installing the release version (#1221)
|
2024-05-31 13:23:33 +08:00 |
infinitebench
|
[Feature] Add InfiniteBench (#739)
|
2023-12-26 15:36:27 +08:00 |
lawbench
|
Fix the issue that opencompass.cli.main cannot be imported after installing the release version (#1221)
|
2024-05-31 13:23:33 +08:00 |
leval
|
[Sync] Update LongEval (#443)
|
2023-09-27 16:32:40 +08:00 |
longbench
|
[Sync] Fix cmnli, fix vicuna meta template, fix longbench postprocess and other minor fixes (#625)
|
2023-11-23 14:05:59 +08:00 |
lveval
|
[Feature] add lveval benchmark (#914)
|
2024-03-04 11:22:03 +08:00 |
medbench
|
[Fix] Update MedBench (#845)
|
2024-01-26 17:56:13 +08:00 |
needlebench
|
[Feature] Make NeedleBench available on HF (#1364)
|
2024-07-25 19:01:56 +08:00 |
NPHardEval
|
[Sync] update taco (#1030)
|
2024-04-09 17:50:23 +08:00 |
reasonbench
|
[Sync] Sync with internal codes 2023.01.08 (#777)
|
2024-01-08 14:07:24 +00:00 |
subjective
|
[Fix] minor update wildbench (#1335)
|
2024-07-26 11:19:04 +08:00 |
teval
|
[Sync] Merge branch 'dev' into zfz/update-keyset-demo (#876)
|
2024-02-05 23:29:10 +08:00 |
TheoremQA
|
[Feature] Add TheoremQA with 5-shot (#1048)
|
2024-04-22 15:22:04 +08:00 |
__init__.py
|
Calm dataset (#1287)
|
2024-07-26 11:48:16 +08:00 |
advglue.py
|
[Feat] support adv_glue dataset for adversarial robustness (#205)
|
2023-08-16 18:42:06 +08:00 |
afqmcd.py
|
[Sync] update (#517)
|
2023-10-27 20:31:22 +08:00 |
anli.py
|
[Feature] Add Xiezhi SQuAD2.0 ANLI (#101)
|
2023-08-10 14:04:18 +08:00 |
anthropics_evals.py
|
[Feat] support anthropics evals dataset (#422)
|
2023-09-20 18:36:44 +08:00 |
apps.py
|
[Sync] deprecate old mbpps (#1064)
|
2024-04-19 20:49:46 +08:00 |
arc.py
|
[Feature] Contamination analysis for MMLU, Hellaswag, and ARC_c (#699)
|
2024-01-08 15:51:48 +08:00 |
ax.py
|
Add release contribution
|
2023-07-05 03:15:31 +00:00 |
base.py
|
Add release contribution
|
2023-07-05 03:15:31 +00:00 |
bbh.py
|
[Sync] update (#517)
|
2023-10-27 20:31:22 +08:00 |
benbench.py
|
[Sync] Sync with internal codes 2024.06.28 (#1279)
|
2024-06-28 14:16:34 +08:00 |
boolq.py
|
[Feature] add llama-oriented dataset configs (#82)
|
2023-08-11 12:48:05 +08:00 |
bustum.py
|
[Sync] update (#517)
|
2023-10-27 20:31:22 +08:00 |
c3.py
|
[Sync] update (#517)
|
2023-10-27 20:31:22 +08:00 |
cb.py
|
[Feat] support opencompass
|
2023-07-04 22:11:33 +08:00 |
ceval.py
|
[Feature] Add Data Contamination Analysis (#639)
|
2023-12-08 10:00:11 +08:00 |
charm.py
|
[Sync] Sync with internal codes 2024.06.28 (#1279)
|
2024-06-28 14:16:34 +08:00 |
chembench.py
|
[Feature] Add ChemBench (#1032)
|
2024-04-12 08:46:26 +08:00 |
chid.py
|
[Sync] update (#517)
|
2023-10-27 20:31:22 +08:00 |
cibench.py
|
Update CIBench (#1089)
|
2024-04-26 18:46:02 +08:00 |
circular.py
|
[Sync] Add InternLM2 Keyset Evaluation Demo (#807)
|
2024-01-17 13:48:12 +08:00 |
civilcomments.py
|
[Feat] support opencompass
|
2023-07-04 22:11:33 +08:00 |
clozeTest_maxmin.py
|
[Feature] Add py150 and maxmin (#562)
|
2023-11-09 22:05:25 +08:00 |
cluewsc.py
|
[Sync] update (#517)
|
2023-10-27 20:31:22 +08:00 |
cmb.py
|
[Bug] Fix CMB dataset (#1106)
|
2024-04-30 00:33:43 +08:00 |
cmmlu.py
|
[Feature] Add CMMLU dataset (#91)
|
2023-07-25 10:14:27 +08:00 |
cmnli.py
|
[Sync] minor test (#683)
|
2023-12-11 17:42:53 +08:00 |
cmrc.py
|
[Sync] update (#517)
|
2023-10-27 20:31:22 +08:00 |
commonsenseqa_cn.py
|
[Feature] Add Chinese version: commonsenseqa, crowspairs and nq (#144)
|
2023-11-30 15:33:02 +08:00 |
commonsenseqa.py
|
[Feature] Use dataset in local path (#570)
|
2023-11-13 13:00:37 +08:00 |
compassbench_obj.py
|
[Feature] Add compassbench knowledge&math part (#1342)
|
2024-07-19 22:54:46 +08:00 |
copa.py
|
[Sync] update (#517)
|
2023-10-27 20:31:22 +08:00 |
crowspairs_cn.py
|
[Feature] Add Chinese version: commonsenseqa, crowspairs and nq (#144)
|
2023-11-30 15:33:02 +08:00 |
crowspairs.py
|
update (#251)
|
2023-08-23 16:25:23 +08:00 |
csl.py
|
[Sync] update (#517)
|
2023-10-27 20:31:22 +08:00 |
custom.py
|
[Sync] update taco (#1030)
|
2024-04-09 17:50:23 +08:00 |
cvalues.py
|
[Feat] Support CValues Responsibility dataset (#78)
|
2023-07-18 18:45:15 +08:00 |
drcd.py
|
[Sync] update (#517)
|
2023-10-27 20:31:22 +08:00 |
drop_simple_eval.py
|
[Feature] update drop dataset from openai simple eval (#1092)
|
2024-05-06 13:37:08 +08:00 |
drop.py
|
[Feature] Use dataset in local path (#570)
|
2023-11-13 13:00:37 +08:00 |
ds1000_interpreter.py
|
[Feat] Support cibench (#538)
|
2023-11-07 19:11:44 +08:00 |
ds1000.py
|
[Sync] Add InternLM2 Keyset Evaluation Demo (#807)
|
2024-01-17 13:48:12 +08:00 |
eprstmt.py
|
[Sync] update (#517)
|
2023-10-27 20:31:22 +08:00 |
FinanceIQ.py
|
[Feature] Add FinanceIQ dataset (#596)
|
2023-11-16 17:47:57 +08:00 |
flames.py
|
[Feature] add support for Flames datasets (#1093)
|
2024-04-28 18:56:24 +08:00 |
flores.py
|
[Feature] Use dataset in local path (#570)
|
2023-11-13 13:00:37 +08:00 |
game24.py
|
[Fix] use sympy only when necessary (#255)
|
2023-08-24 10:15:20 +08:00 |
GaokaoBench.py
|
[Sync] update evaluator (#1175)
|
2024-05-21 14:22:46 +08:00 |
govrepcrs.py
|
Add release contribution
|
2023-07-05 03:15:31 +00:00 |
gpqa.py
|
[Feature] Add gpqa prompt from simple_evals, openai (#1080)
|
2024-04-26 20:13:00 +08:00 |
gsm8k.py
|
fix bug of gsm8k_postprocess (#863)
|
2024-02-06 23:52:47 +08:00 |
gsm_hard.py
|
[Feature] Add GSM_Hard dataset (#619)
|
2023-11-27 17:40:34 +08:00 |
hellaswag.py
|
[Sync] Sync Internal (#941)
|
2024-03-04 14:42:36 +08:00 |
huggingface.py
|
[Feat] support opencompass
|
2023-07-04 22:11:33 +08:00 |
humaneval_multi.py
|
[feat] support multipl-e (#846)
|
2024-02-06 23:30:28 +08:00 |
humaneval.py
|
[Sync] Sync with internal codes 2024.06.28 (#1279)
|
2024-06-28 14:16:34 +08:00 |
humanevalx.py
|
[Sync] update taco (#1030)
|
2024-04-09 17:50:23 +08:00 |
hungarian_math.py
|
[Sync] Sync with internal codes 2023.01.08 (#777)
|
2024-01-08 14:07:24 +00:00 |
inference_ppl.py
|
[Feature] Support inference ppl datasets (#1315)
|
2024-07-22 17:59:30 +08:00 |
iwslt2017.py
|
Add release contribution
|
2023-07-05 03:15:31 +00:00 |
jigsawmultilingual.py
|
initial commit
|
2023-07-04 21:34:55 +08:00 |
jsonl.py
|
[Sync] Sync with internal codes 2023.01.08 (#777)
|
2024-01-08 14:07:24 +00:00 |
kaoshi.py
|
[Feature] Add kaoshi dataset (#392)
|
2023-09-22 18:46:33 +08:00 |
lambada.py
|
[Feature] Use dataset in local path (#570)
|
2023-11-13 13:00:37 +08:00 |
lcsts.py
|
Support a batch of datasets.
|
2023-07-05 01:30:27 +00:00 |
llm_compression.py
|
[Feature] Adding support for LLM Compression Evaluation (#1108)
|
2024-04-30 10:51:01 +08:00 |
lmeval.py
|
[Sync] update github token (#475)
|
2023-10-13 06:50:54 -05:00 |
mastermath2024v1.py
|
[Feature] Add new dataset mastermath2024v1 (#744)
|
2024-01-01 15:53:24 +08:00 |
math401.py
|
[Sync] Sync with internal codes 2023.01.08 (#777)
|
2024-01-08 14:07:24 +00:00 |
math_intern.py
|
[Sync] Update dataset cfg for internMath (#837)
|
2024-01-24 16:30:32 +08:00 |
math.py
|
[Feature] Support Math evaluation via judgemodel (#1094)
|
2024-04-26 14:56:23 +08:00 |
mathbench.py
|
[Sync] Sync with internal codes 2024.06.28 (#1279)
|
2024-06-28 14:16:34 +08:00 |
mbpp.py
|
[Sync] bump version (#1204)
|
2024-05-28 23:09:59 +08:00 |
mgsm.py
|
add mgsm datasets (#1081)
|
2024-05-06 15:29:34 +08:00 |
mmlu_pro.py
|
[Sync] Sync with internal codes 2024.06.28 (#1279)
|
2024-06-28 14:16:34 +08:00 |
mmlu.py
|
[Feature] Contamination analysis for MMLU, Hellaswag, and ARC_c (#699)
|
2024-01-08 15:51:48 +08:00 |
MMLUArabic.py
|
[Feature] Add AceGPT-MMLUArabic benchmark (#1099)
|
2024-05-08 15:00:26 +08:00 |
multirc.py
|
initial commit
|
2023-07-04 21:34:55 +08:00 |
narrativeqa.py
|
Add release contribution
|
2023-07-05 03:15:31 +00:00 |
natural_question_cn.py
|
[Feature] Add Chinese version: commonsenseqa, crowspairs and nq (#144)
|
2023-11-30 15:33:02 +08:00 |
natural_question.py
|
[Sync] Sync Internal (#941)
|
2024-03-04 14:42:36 +08:00 |
obqa.py
|
[Feature] Use dataset in local path (#570)
|
2023-11-13 13:00:37 +08:00 |
OpenFinData.py
|
[Feature] Support OpenFinData (#896)
|
2024-02-29 12:55:07 +08:00 |
piqa.py
|
[Feature] Use dataset in local path (#570)
|
2023-11-13 13:00:37 +08:00 |
py150.py
|
[Feature] Add py150 and maxmin (#562)
|
2023-11-09 22:05:25 +08:00 |
qasper.py
|
Add Release Contribution
|
2023-07-05 02:22:40 +00:00 |
qaspercut.py
|
Add Release Contribution
|
2023-07-05 02:22:40 +00:00 |
QuALITY.py
|
[Feature] Add the implement of QuALITY datasets (#976)
|
2024-03-15 21:22:38 +08:00 |
race.py
|
[Feature] Use dataset in local path (#570)
|
2023-11-13 13:00:37 +08:00 |
realtoxicprompts.py
|
Update configs (#9)
|
2023-07-06 12:27:41 +08:00 |
record.py
|
[Feature] Add qwen & qwen-chat support (#286)
|
2023-08-31 11:29:05 +08:00 |
rolebench.py
|
added rolebench dataset. (#633)
|
2023-12-01 22:54:42 +08:00 |
s3eval.py
|
[Feature] Add S3Eval Dataset (#916)
|
2024-05-06 19:41:52 +08:00 |
safety.py
|
Add Release Contribution
|
2023-07-05 02:22:40 +00:00 |
scibench.py
|
add evaluation of scibench (#393)
|
2023-09-22 17:42:08 +08:00 |
siqa.py
|
[Sync] Add InternLM2 Keyset Evaluation Demo (#807)
|
2024-01-17 13:48:12 +08:00 |
squad20.py
|
[Feature] Add Xiezhi SQuAD2.0 ANLI (#101)
|
2023-08-10 14:04:18 +08:00 |
storycloze.py
|
[Feature] Use dataset in local path (#570)
|
2023-11-13 13:00:37 +08:00 |
strategyqa.py
|
[Feature] Use dataset in local path (#570)
|
2023-11-13 13:00:37 +08:00 |
summedits.py
|
[Enhancement] Test linting in CI and fix existing linting errors (#69)
|
2023-07-17 15:59:10 +08:00 |
summscreen.py
|
Support a batch of datasets.
|
2023-07-05 01:30:27 +00:00 |
svamp.py
|
[Feature] Add SVAMP dataset (#604)
|
2023-11-22 14:54:39 +08:00 |
tabmwp.py
|
[Feat] Add an open-source dataset Tabmwp (#505)
|
2023-11-03 11:15:46 +08:00 |
taco.py
|
[Sync] Sync with internal codes 2024.06.28 (#1279)
|
2024-06-28 14:16:34 +08:00 |
tnews.py
|
[Sync] update (#517)
|
2023-10-27 20:31:22 +08:00 |
triviaqa.py
|
[Sync] Merge branch 'dev' into zfz/update-keyset-demo (#876)
|
2024-02-05 23:29:10 +08:00 |
triviaqarc.py
|
Add Release Contribution
|
2023-07-05 02:22:40 +00:00 |
truthfulqa.py
|
[Feat] refine docs and codes for more user guides (#409)
|
2023-09-18 16:12:13 +08:00 |
tydiqa.py
|
[Feature] Use dataset in local path (#570)
|
2023-11-13 13:00:37 +08:00 |
wic.py
|
Support a batch of datasets.
|
2023-07-05 01:30:27 +00:00 |
wikibench.py
|
[Sync] minor test (#683)
|
2023-12-11 17:42:53 +08:00 |
winograd.py
|
initial commit
|
2023-07-04 21:34:55 +08:00 |
winogrande.py
|
[Feature] Add huggingface apply_chat_template (#1098)
|
2024-05-14 14:50:16 +08:00 |
wnli.py
|
[Feat] implementation for support promptbench (#239)
|
2023-09-15 15:06:53 +08:00 |
wsc.py
|
Update configs (#9)
|
2023-07-06 12:27:41 +08:00 |
xcopa.py
|
Add Release Contribution
|
2023-07-05 02:22:40 +00:00 |
xiezhi.py
|
[Feature] Add Xiezhi SQuAD2.0 ANLI (#101)
|
2023-08-10 14:04:18 +08:00 |
xlsum.py
|
update datasets
|
2023-07-05 01:45:26 +00:00 |
xsum.py
|
update datasets
|
2023-07-05 01:45:26 +00:00 |