Junnan Liu
|
20660ab507
|
[Fix] Fix compare error when k is list in base_evaluator (#2010)
* fix gpass compare error of list k
* fix compare error in 177
|
2025-04-10 19:47:21 +08:00 |
|
Linchen Xiao
|
12213207b6
|
[Refactor] Refactorize openicl eval task (#1990)
* [Refactor] Refactorize openicl eval task
* update
|
2025-04-09 15:52:23 +08:00 |
|
Linchen Xiao
|
db96161a4e
|
[Update] Add SuperGPQA subset metrics (#1966)
|
2025-03-24 14:25:12 +08:00 |
|
Linchen Xiao
|
854c6bf025
|
[Update] Update requirement and base evaluator
|
2025-03-13 20:52:50 +08:00 |
|
Kangreen
|
59e49aedf1
|
[Feature] Support SuperGPQA (#1924)
* support supergpqa
* remove unnecessary code
* remove unnecessary code
* Add Readme
* Add Readme
* fix lint
* fix lint
* update
* update
---------
Co-authored-by: mkj3085003 <mkj3085003@gmail.com>
Co-authored-by: MaiziXiao <xxllcc1993@gmail.com>
|
2025-03-11 19:32:08 +08:00 |
|
Junnan Liu
|
73c80953c6
|
[Feature] Support Dataset Repeat and G-Pass Compute for Each Evaluator (#1886)
* support dataset repeat and g-pass compute for each evaluator
* fix pre-commit errors
* delete print
* delete gpassk_evaluator and fix potential errors
* change `repeat` to `n`
* fix `repeat` to `n` in openicl_eval
* update doc for multi-run and g-pass
* update latex equation in doc
* update eng doc for multi-run and g-pass
* update datasets.md
* update datasets.md
* fix multi-line equation
* fix multi-line equation
* fix multi-line equation
* fix multi-line equation
* fix multi-line equation
* fix multi-line equation
* fix multi-line equation in zh_cn user_guides
* modify pre-commit-zh-cn
* recover pre-commit and edit math expr in doc
* del [TIP]
* del cite tag in doc
* del extract_model param in livemathbench config
|
2025-02-26 19:43:12 +08:00 |
|
Songyang Zhang
|
a4d5a6c81b
|
[Feature] Support LiveCodeBench (#1617)
* Update
* Update LCB
* Update
* Update
* Update
* Update
* Update
|
2024-10-21 20:50:39 +08:00 |
|
Tong Gao
|
1e44541730
|
[Enhancement] Test linting in CI and fix existing linting errors (#69)
* [Enhancement] Test linting in CI
* fix linting
|
2023-07-17 15:59:10 +08:00 |
|
cky
|
36f111100f
|
update datasets
|
2023-07-05 01:45:26 +00:00 |
|