adv_glue | [Feat] support adv_glue dataset for adversarial robustness (#205) | 2023-08-16 18:42:06 +08:00
agieval | [Sync] update (#517) | 2023-10-27 20:31:22 +08:00
anli | [Feature] Add Xiezhi SQuAD2.0 ANLI (#101) | 2023-08-10 14:04:18 +08:00
anthropics_evals | [Feat] support antropics evals dataset (#422) | 2023-09-20 18:36:44 +08:00
apps | Update configs (#9) | 2023-07-06 12:27:41 +08:00
ARC_c | [Feat] update postprocessor to get first option more accurately (#193) | 2023-08-11 17:33:00 +08:00
ARC_e | [Feat] update postprocessor to get first option more accurately (#193) | 2023-08-11 17:33:00 +08:00
bbh | [Sync] update (#517) | 2023-10-27 20:31:22 +08:00
cdme | [Update] Change NeedleInAHaystackDataset to dynamic dataset loading (#754) | 2024-01-02 17:22:56 +08:00
ceval | [Feature] Add Data Contamination Analysis (#639) | 2023-12-08 10:00:11 +08:00
CIBench | [Sync] minor test (#683) | 2023-12-11 17:42:53 +08:00
civilcomments | [Feat] add safety to collections (#185) | 2023-08-11 11:19:26 +08:00
clozeTest_maxmin | [Feature] Add py150 and maxmin (#562) | 2023-11-09 22:05:25 +08:00
CLUE_afqmc | Update configs (#9) | 2023-07-06 12:27:41 +08:00
CLUE_C3 | Update configs (#9) | 2023-07-06 12:27:41 +08:00
CLUE_cmnli | [Sync] Fix cmnli, fix vicuna meta template, fix longbench postprocess and other minor fixes (#625) | 2023-11-23 14:05:59 +08:00
CLUE_CMRC | Update configs (#9) | 2023-07-06 12:27:41 +08:00
CLUE_DRCD | Update configs (#9) | 2023-07-06 12:27:41 +08:00
CLUE_ocnli | Update configs (#9) | 2023-07-06 12:27:41 +08:00
cmb | [Feature] Update cmb (#571) | 2023-11-13 00:09:05 +08:00
cmmlu | [Refactor] Move fix_id_list to Retriever (#442) | 2023-10-07 12:53:41 +08:00
collections | [Fix] fix typos in drop prompt (#773) | 2024-01-08 14:22:35 +08:00
commonsenseqa | [Sync] some renaming (#641) | 2023-11-27 16:06:49 +08:00
commonsenseqa_cn | [Feature] Add Chinese version: commonsenseqa, crowspairs and nq (#144) | 2023-11-30 15:33:02 +08:00
crowspairs | [Feature] Add LEval datasets | 2023-08-11 17:38:31 +08:00
crowspairs_cn | [Feature] Add Chinese version: commonsenseqa, crowspairs and nq (#144) | 2023-11-30 15:33:02 +08:00
cvalues | [Feature] Add LEval datasets | 2023-08-11 17:38:31 +08:00
drop | [Fix] fix typos in drop prompt (#773) | 2024-01-08 14:22:35 +08:00
ds1000 | [Sync] minor test (#683) | 2023-12-11 17:42:53 +08:00
FewCLUE_bustm | Update configs (#9) | 2023-07-06 12:27:41 +08:00
FewCLUE_chid | [Feature] Add logger info and remove dataset bugs (#61) | 2023-07-17 14:26:30 +08:00
FewCLUE_cluewsc | Update configs (#9) | 2023-07-06 12:27:41 +08:00
FewCLUE_csl | Update configs (#9) | 2023-07-06 12:27:41 +08:00
FewCLUE_eprstmt | Update configs (#9) | 2023-07-06 12:27:41 +08:00
FewCLUE_ocnli_fc | Update configs (#9) | 2023-07-06 12:27:41 +08:00
FewCLUE_tnews | Update configs (#9) | 2023-07-06 12:27:41 +08:00
FinanceIQ | [Feature] Add FinanceIQ dataset (#596) | 2023-11-16 17:47:57 +08:00
flores | [Feature] Use dataset in local path (#570) | 2023-11-13 13:00:37 +08:00
game24 | [Feature] Add and apply update suffix tool (#280) | 2023-08-28 17:35:04 +08:00
GaokaoBench | Align prompt files with their hash (#1) | 2023-07-05 18:28:58 +08:00
GLUE_CoLA | [Docs] fix dataset name error (#533) | 2023-11-10 18:54:20 +08:00
GLUE_MRPC | [Sync] update model configs (#574) | 2023-11-13 15:15:34 +08:00
GLUE_QQP | [Refactor] Move fix_id_list to Retriever (#442) | 2023-10-07 12:53:41 +08:00
govrepcrs | Update configs (#9) | 2023-07-06 12:27:41 +08:00
gpqa | [Feature] Add GPQA Dataset (#729) | 2024-01-01 15:54:40 +08:00
gsm8k | [Feat] Update math/agent (#716) | 2023-12-19 21:20:42 +08:00
gsm8k_contamination | [Sync] minor test (#683) | 2023-12-11 17:42:53 +08:00
gsm_hard | [Feature] Add GSM_Hard dataset (#619) | 2023-11-27 17:40:34 +08:00
hellaswag | fix hellaswag_ppl_47bff9 (#648) | 2023-11-29 16:51:44 +08:00
humaneval | [Feat] update code config (#749) | 2023-12-29 18:46:34 +08:00
humaneval_cn | [Feat] update code config (#749) | 2023-12-29 18:46:34 +08:00
humaneval_plus | [Feat] update code config (#749) | 2023-12-29 18:46:34 +08:00
humanevalx | [Docs] add humanevalx dataset link in config (#559) | 2023-11-10 18:18:58 +08:00
infinitebench | [Feature] Add InfiniteBench (#739) | 2023-12-26 15:36:27 +08:00
iwslt2017 | Update configs (#9) | 2023-07-06 12:27:41 +08:00
jigsawmultilingual | [Feat] add safety to collections (#185) | 2023-08-11 11:19:26 +08:00
kaoshi | [Feature] Add kaoshi dataset (#392) | 2023-09-22 18:46:33 +08:00
lambada | [Feature] Use dataset in local path (#570) | 2023-11-13 13:00:37 +08:00
lawbench | [Feature] Add lawbench (#460) | 2023-10-13 06:51:36 -05:00
lcsts | [Fix] Use jieba rouge in lcsts (#459) | 2023-10-09 10:10:33 +08:00
leval | [Sync] Update LongEval (#443) | 2023-09-27 16:32:40 +08:00
longbench | [Sync] Fix cmnli, fix vicuna meta template, fix longbench postprocess and other minor fixes (#625) | 2023-11-23 14:05:59 +08:00
mastermath2024v1 | [Feature] Add new dataset mastermath2024v1 (#744) | 2024-01-01 15:53:24 +08:00
math | [Feat] Update math/agent (#716) | 2023-12-19 21:20:42 +08:00
MathBench | [Sync] minor test (#683) | 2023-12-11 17:42:53 +08:00
mbpp | [Feat] update code config (#749) | 2023-12-29 18:46:34 +08:00
mbpp_cn | [Feat] update code config (#749) | 2023-12-29 18:46:34 +08:00
mbpp_plus | Support Mbpp_plus dataset (#770) | 2024-01-05 22:01:57 +08:00
MedBench | [Feature] Add medbench (#678) | 2023-12-09 16:05:46 +08:00
mmlu | Update LightllmApi and Fix mmlu bug (#738) | 2023-12-27 13:49:08 +08:00
narrativeqa | Align prompt files with their hash (#1) | 2023-07-05 18:28:58 +08:00
nq | [Refactor] Move fix_id_list to Retriever (#442) | 2023-10-07 12:53:41 +08:00
nq_cn | [Feature] Add Chinese version: commonsenseqa, crowspairs and nq (#144) | 2023-11-30 15:33:02 +08:00
obqa | [Feature] Use dataset in local path (#570) | 2023-11-13 13:00:37 +08:00
piqa | [Feature] Use dataset in local path (#570) | 2023-11-13 13:00:37 +08:00
PJExam | Align prompt files with their hash (#1) | 2023-07-05 18:28:58 +08:00
promptbench | [Feat] implementation for support promptbench (#239) | 2023-09-15 15:06:53 +08:00
py150 | [Feature] Add py150 and maxmin (#562) | 2023-11-09 22:05:25 +08:00
qabench | Align prompt files with their hash (#1) | 2023-07-05 18:28:58 +08:00
qasper | Align prompt files with their hash (#1) | 2023-07-05 18:28:58 +08:00
qaspercut | Align prompt files with their hash (#1) | 2023-07-05 18:28:58 +08:00
race | [Feature] Use dataset in local path (#570) | 2023-11-13 13:00:37 +08:00
realtoxicprompts | [Feat] add safety to collections (#185) | 2023-08-11 11:19:26 +08:00
ReasonBench | [Feature] Add ReasonBench(Internal) dataset (#577) | 2023-12-20 17:57:42 +08:00
rolebench | added rolebench dataset. (#633) | 2023-12-01 22:54:42 +08:00
safety | Align prompt files with their hash (#1) | 2023-07-05 18:28:58 +08:00
scibench | add evaluation of scibench (#393) | 2023-09-22 17:42:08 +08:00
siqa | [Feature] Use dataset in local path (#570) | 2023-11-13 13:00:37 +08:00
squad20 | [Feature] Add Xiezhi SQuAD2.0 ANLI (#101) | 2023-08-10 14:04:18 +08:00
storycloze | [Feature] Use dataset in local path (#570) | 2023-11-13 13:00:37 +08:00
strategyqa | [Feature] Use dataset in local path (#570) | 2023-11-13 13:00:37 +08:00
subjective_alignbench | fix erro in configs (#750) | 2023-12-28 11:53:07 +00:00
subjective_cmp | [Feature] Add other judgelm prompts for Alignbench (#731) | 2023-12-27 17:54:53 +08:00
subjective_creationbench | add creationbench (#753) | 2023-12-29 10:03:44 +00:00
subjective_ir | [Feature] add subject ir dataset (#755) | 2024-01-05 12:00:57 +00:00
subjective_multiround | [Feature] Add multi_round dataset evaluation (#766) | 2024-01-04 10:37:52 +00:00
summedits | Update configs (#9) | 2023-07-06 12:27:41 +08:00
summscreen | Update configs (#9) | 2023-07-06 12:27:41 +08:00
SuperGLUE_AX_b | [Feat] update postprocessor to get first option more accurately (#193) | 2023-08-11 17:33:00 +08:00
SuperGLUE_AX_g | [Feat] update postprocessor to get first option more accurately (#193) | 2023-08-11 17:33:00 +08:00
SuperGLUE_BoolQ | [Feature] add llama-oriented dataset configs (#82) | 2023-08-11 12:48:05 +08:00
SuperGLUE_CB | [Feat] update postprocessor to get first option more accurately (#193) | 2023-08-11 17:33:00 +08:00
SuperGLUE_COPA | [Feat] update postprocessor to get first option more accurately (#193) | 2023-08-11 17:33:00 +08:00
SuperGLUE_MultiRC | [Feat] update postprocessor to get first option more accurately (#193) | 2023-08-11 17:33:00 +08:00
SuperGLUE_ReCoRD | [Feature] Add qwen & qwen-chat support (#286) | 2023-08-31 11:29:05 +08:00
SuperGLUE_RTE | [Feat] update postprocessor to get first option more accurately (#193) | 2023-08-11 17:33:00 +08:00
SuperGLUE_WiC | Update configs (#9) | 2023-07-06 12:27:41 +08:00
SuperGLUE_WSC | [Fix] Fix typo in WSC prompt (#520) | 2023-10-30 12:16:26 +08:00
SVAMP | [Feature] Add SVAMP dataset (#604) | 2023-11-22 14:54:39 +08:00
TabMWP | [fFeat] Add an opensource dataset Tabmwp (#505) | 2023-11-03 11:15:46 +08:00
TheoremQA | Update configs (#9) | 2023-07-06 12:27:41 +08:00
triviaqa | [Refactor] Move fix_id_list to Retriever (#442) | 2023-10-07 12:53:41 +08:00
triviaqarc | Align prompt files with their hash (#1) | 2023-07-05 18:28:58 +08:00
truthfulqa | [Feat] refine docs and codes for more user guides (#409) | 2023-09-18 16:12:13 +08:00
tydiqa | update word spell (#594) | 2023-11-15 15:23:58 +08:00
wikibench | [Feature] Add wikibench dataset (#655) | 2023-12-01 14:56:54 +08:00
wikitext | [SIG] add WikiText-2&103 (#397) | 2023-09-26 14:31:15 +08:00
winograd | Align prompt files with their hash (#1) | 2023-07-05 18:28:58 +08:00
winogrande | [Sync] minor test (#683) | 2023-12-11 17:42:53 +08:00
XCOPA | Align prompt files with their hash (#1) | 2023-07-05 18:28:58 +08:00
xiezhi | [Feature] Add Xiezhi SQuAD2.0 ANLI (#101) | 2023-08-10 14:04:18 +08:00
XLSum | Update configs (#9) | 2023-07-06 12:27:41 +08:00
Xsum | Update configs (#9) | 2023-07-06 12:27:41 +08:00
z_bench | [Feature] Add and apply update suffix tool (#280) | 2023-08-28 17:35:04 +08:00