9 Commits

Author SHA1 Message Date
Junnan Liu
20660ab507
[Fix] Fix compare error when k is list in base_evaluator (#2010)
* fix gpass compare error of list k

* fix compare error in 177
2025-04-10 19:47:21 +08:00
Linchen Xiao
12213207b6
[Refactor] Refactorize openicl eval task (#1990)
* [Refactor] Refactorize openicl eval task

* update
2025-04-09 15:52:23 +08:00
Linchen Xiao
db96161a4e
[Update] Add SuperGPQA subset metrics (#1966)
2025-03-24 14:25:12 +08:00
Linchen Xiao
854c6bf025
[Update] Update requirement and base evaluator
2025-03-13 20:52:50 +08:00
Kangreen
59e49aedf1
[Feature] Support SuperGPQA (#1924)
* support supergpqa

* remove unnecessary code

* remove unnecessary code

* Add Readme

* Add Readme

* fix lint

* fix lint

* update

* update

---------

Co-authored-by: mkj3085003 <mkj3085003@gmail.com>
Co-authored-by: MaiziXiao <xxllcc1993@gmail.com>
2025-03-11 19:32:08 +08:00
Junnan Liu
73c80953c6
[Feature] Support Dataset Repeat and G-Pass Compute for Each Evaluator (#1886)
* support dataset repeat and g-pass compute for each evaluator

* fix pre-commit errors

* delete print

* delete gpassk_evaluator and fix potential errors

* change `repeat` to `n`

* fix `repeat` to `n` in openicl_eval

* update doc for multi-run and g-pass

* update latex equation in doc

* update eng doc for multi-run and g-pass

* update datasets.md

* update datasets.md

* fix multi-line equation

* fix multi-line equation

* fix multi-line equation

* fix multi-line equation

* fix multi-line equation

* fix multi-line equation

* fix multi-line equation in zh_cn user_guides

* mmodify pre-commit-zh-cn

* recover pre-commit and edit math expr in doc

* del [TIP]

* del cite tag in doc

* del extract_model param in livemathbench config
2025-02-26 19:43:12 +08:00
Songyang Zhang
a4d5a6c81b
[Feature] Support LiveCodeBench (#1617)
* Update

* Update LCB

* Update

* Update

* Update

* Update

* Update
2024-10-21 20:50:39 +08:00
Tong Gao
1e44541730
[Enhancement] Test linting in CI and fix existing linting errors (#69)
* [Enhancement] Test linting in CI

* fix linting
2023-07-17 15:59:10 +08:00
cky
36f111100f
update datasets
2023-07-05 01:45:26 +00:00