148 Commits

Author SHA1 Message Date
Fengzhe Zhou
dbb20b8270
[Sync] update (#517) 2023-10-27 20:31:22 +08:00
Wei Jueqi
b62842335d
[Doc] Update Subjective docs (#510)
* rename

* add en subdoc

* fix name

* fix writing

* update

---------

Co-authored-by: Leymore <zfz-960727@163.com>
2023-10-27 16:27:24 +08:00
Leymore
4dd9a3fc10
[Sync] sync with internal codes 20231019 (#488) 2023-10-18 23:37:35 -05:00
liushz
2737249f31
[Feature] Add mathbench dataset and circular evaluator (#408)
* add_mathbench

* update mathbench

* support non circular eval dataset

---------

Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>
Co-authored-by: yingfhu <yingfhu@gmail.com>
2023-10-18 04:08:31 -05:00
Leymore
861942ab1b
[Feature] Add lawbench (#460)
* add lawbench

* update requirements

* update
2023-10-13 06:51:36 -05:00
Leymore
fbf5089c40
[Sync] update github token (#475) 2023-10-13 06:50:54 -05:00
Leymore
9db5652638
[Feature] re-implement ceval load dataset (#446) 2023-09-27 21:18:48 +08:00
philipwangOvO
3bb3d330eb
[Sync] Update LongEval (#443) 2023-09-27 16:32:40 +08:00
liushz
c5224c2a91
[Feature] Add kaoshi dataset (#392)
* Add ToT method

* Update ToT

* Update ToT

* Update ToT

* Update ToT

* Update ToT

* Add Koashi

* Update Kaoshi

* Update Kaoshi

* Update kaoshi

* Update kaoshi

* Update Kaoshi

* Update Kaoshi

* Update Kaoshi

* Update Kaoshi

* update Kaoshi

* update

* update

* fix

---------

Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2023-09-22 18:46:33 +08:00
TTTTTiam
2a62bea1a4
add evaluation of scibench (#393)
* add evaluation of scibench

* add evaluation of scibench

* update scibench

* remove scibench evaluator

---------

Co-authored-by: Leymore <zfz-960727@163.com>
2023-09-22 17:42:08 +08:00
Tong Gao
a1ea3c094a
[Sync] Initial support of subjective evaluation (#421)
Co-authored-by: Leymore <zfz-960727@163.com>
2023-09-22 15:42:31 +08:00
Hubert
8803f7f7a6
[Feat] support antropics evals dataset (#422)
* [Feat] support anthropics ai risk dataset

* [Feat] support anthropics evals dataset

* [Feat] support anthropics evals dataset
2023-09-20 18:36:44 +08:00
Hubert
2c15a0c01d
[Feat] refine docs and codes for more user guides (#409) 2023-09-18 16:12:13 +08:00
Hubert
a11cb45c83
[Feat] implementation for support promptbench (#239)
* [Feat] support adv_glue dataset for adversarial robustness

* reorg files

* minor fix

* minor fix

* support prompt bench demo

* minor fix

* minor fix

* minor fix

* minor fix

* minor fix

* minor fix

* minor fix

* minor fix
2023-09-15 15:06:53 +08:00
Hubert
de8a154795
[Feat] support ds1000 dataset (#395)
* [Feat] support ds1000 datase
2023-09-15 12:50:27 +08:00
Xidong Wang
47a752cd56
[Dataset] Add CMB (#376)
* Add CMB

* modify CMB

---------

Co-authored-by: wangxidong <xidongw@163.com>
2023-09-12 19:16:41 +08:00
Hubert
ddb8197212
[Feat] support wizardcoder series (#344)
* [Feat] support wizardcoder series

* minor fix
2023-09-06 17:52:35 +08:00
Leymore
a1782f9a08
[Fix] triviaqa & nq postprocess (#350) 2023-09-04 15:24:52 +08:00
Leymore
7ca6ba625e
[Feature] Add qwen & qwen-chat support (#286)
* add and apply update suffix tool

* add tool doc

* add qwen configs

* add cmmlu

* rename bbh

* update datasets

* delete

* update hf_qwen_7b.py
2023-08-31 11:29:05 +08:00
Hubert
fd389e2d78
[Feat] support codellama and preds collection tools (#335) 2023-08-31 11:14:42 +08:00
philipwangOvO
3f37c40aa3
[Dataset] Refactor LEval 2023-08-25 11:46:23 +08:00
Tong Gao
bd47a00f27
[Fix] use sympy only when necessary (#255) 2023-08-24 10:15:20 +08:00
Tong Gao
01372a4806
update (#251) 2023-08-23 16:25:23 +08:00
liushz
02ce139bc6
[Feature] Add Tree-of-Thought method (#173)
* Add ToT method

* Update ToT

* Update ToT

* Update ToT

* Update ToT

* Update ToT

* Update ToT

* Update ToT

* Update chain_of_thought.md

* Update icl_tot_inferencer.py

---------

Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>
2023-08-23 12:23:05 +08:00
philipwangOvO
655a807f4b
[Dataset] LongBench (#236)
Co-authored-by: wangchonghua <wangchonghua@pjlab.org.cn>
2023-08-21 14:15:20 +08:00
Ezra-Yu
17ccaa5980
[Feat] Add codegeex2 and Humanevalx (#210)
* add codegeex2

* add humanevalx dataset

* add evaluator

* update evaluator

* update configs

* update clean code

* update configs

* fix lint

* remove sleep

* fix lint

* update docs

* fix lint
2023-08-17 11:03:16 +08:00
Hubert
0fe2366a72
[Feat] support adv_glue dataset for adversarial robustness (#205)
* [Feat] support adv_glue dataset for adversarial robustness

* reorg files

* minor fix

* minor fix
2023-08-16 18:42:06 +08:00
Tong Gao
bf79ff1c6d
[Feature] Add LEval datasets
Co-authored-by: kennymckormick <dhd@pku.edu.cn>
2023-08-11 17:38:31 +08:00
Leymore
14332e08fd
[Feature] add llama-oriented dataset configs (#82)
* add llama-oriented dataset configs

* update

* revert cvalues & update llama_example
2023-08-11 12:48:05 +08:00
Tong Gao
2931f3dcb8
[Enhancement] Add humaneval postprocessor for GPT models & eval config for GPT4, enhance the original humaneval postprocessor (#129)
* [Enhancement] Enhance humaneval postprocessor

* add human-eval testcase

* update

* update

---------

Co-authored-by: Leymore <zfz-960727@163.com>
2023-08-10 16:31:12 +08:00
Leymore
e7fc54baf1
[Feature] Add Xiezhi SQuAD2.0 ANLI (#101)
* add Xiezhi SQuAD2.0 ANLI; update WSC

* update

* update

* update doc string
2023-08-10 14:04:18 +08:00
Leymore
876ade71a5
[Fix] Fix AGIEval multiple choice (#137)
* update agieval data

* rename variables
2023-08-10 11:38:24 +08:00
Tong Gao
c00179d46b
[Feature] Evaluating acc based on minimum edit distance, update SIQA (#130)
* [Feature] Support evaluating acc based on minimum edit distance, update SIQA

* update
2023-08-01 14:24:27 +08:00
Hubert
b7184e9db5
[Refactor] Update crows-pairs evaluation (#98)
* [Refactor] Update crows-pairs evaluation

* [Refactor] Update crows-pairs evaluation

* minor
2023-07-26 11:21:32 +08:00
Haonan Li
e9cdb24ddd
[Feature] Add CMMLU dataset (#91)
* add CMMLU

* debug cmmlu

* add slurm args `qos`

* fix format: space before comment

* remove unused variable

* change the location of `answer is`

---------

Co-authored-by: 李浩楠 <lihaonan@lihaonandeMacBook-Air.local>
Co-authored-by: 李浩楠 <haonan.li>
Co-authored-by: Leymore <zfz-960727@163.com>
2023-07-25 10:14:27 +08:00
Hubert
f83e125e5a
[Feat] Support CValues Responsibility dataset (#78)
* [Feat] support CValues

* minor fix
2023-07-18 18:45:15 +08:00
liushz
f36c0496f3
[Feature] Add tydiqa-goldp (#75)
Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>
2023-07-18 14:54:35 +08:00
Tong Gao
1e44541730
[Enhancement] Test linting in CI and fix existing linting errors (#69)
* [Enhancement] Test linting in CI

* fix linting
2023-07-17 15:59:10 +08:00
Leymore
1326aff77e
[Feature] Add logger info and remove dataset bugs (#61)
* Add logger info and remove dataset bugs

* fix typo
2023-07-17 14:26:30 +08:00
Leymore
86d5ec3d0f
Update configs (#9)
* Update implements

* Update
2023-07-06 12:27:41 +08:00
mzr1996
04dd01a235
Update configs and code 2023-07-05 11:45:08 +08:00
Leymore
c94cc94348
Add release contribution 2023-07-05 03:15:31 +00:00
Ezra-Yu
cbe9fe2cdb
Add Release Contraibution 2023-07-05 02:22:40 +00:00
cky
36f111100f
update datasets 2023-07-05 01:45:26 +00:00
mzr1996
3cfe73de3f
Support a batch of datasets. 2023-07-05 01:30:27 +00:00
kennymckormick
78478e961e
[Code] Update opencompass/datasets/agieval/__init__.py 2023-07-05 00:28:07 +00:00
yingfhu
fb11108723
[Feat] support opencompass 2023-07-04 22:11:33 +08:00
gaotongxiao
7d346000bb
initial commit 2023-07-04 21:34:55 +08:00