cmmlu_0shot_cot_gen_305931.py | [Feature] Support import configs/models/summarizers from whl (#1376) | 2024-08-01 00:42:48 +08:00
cmmlu_0shot_nocot_llmjudge_gen_e1cd9a.py | [Update] Update Skywork/Qwen-QwQ (#1728) | 2024-12-05 19:30:43 +08:00
cmmlu_gen_c13365.py | [Feature] Support import configs/models/summarizers from whl (#1376) | 2024-08-01 00:42:48 +08:00
cmmlu_gen.py | [Feature] Support import configs/models/summarizers from whl (#1376) | 2024-08-01 00:42:48 +08:00
cmmlu_llmjudge_gen_e1cd9a.py | [Update] Add configurations for llmjudge dataset (#1940) | 2025-03-13 17:30:04 +08:00
cmmlu_ppl_8b9c76.py | [Feature] Support import configs/models/summarizers from whl (#1376) | 2024-08-01 00:42:48 +08:00
cmmlu_ppl_041cbf.py | [Feature] Support import configs/models/summarizers from whl (#1376) | 2024-08-01 00:42:48 +08:00
cmmlu_ppl.py | [Feature] Support import configs/models/summarizers from whl (#1376) | 2024-08-01 00:42:48 +08:00
cmmlu_stem_0shot_nocot_gen_3653db.py | [Feature] Update o1 evaluation with JudgeLLM (#1795) | 2024-12-30 17:31:00 +08:00
cmmlu_stem_0shot_nocot_llmjudge_gen_3653db.py | [Update] Update O1-style Benchmark and Prompts (#1742) | 2024-12-09 13:48:56 +08:00
cmmlu_stem_0shot_nocot_xml_gen_3653db.py | [Refactor] Code refactoarization (#1831) | 2025-01-20 19:17:38 +08:00