deprecated_math_agent_evaluatorv2_gen_861b4f.py
[Feature] Support import configs/models/summarizers from whl ( #1376 )
2024-08-01 00:42:48 +08:00
deprecated_math_evaluatorv2_gen_265cce.py
[Feature] Support import configs/models/summarizers from whl ( #1376 )
2024-08-01 00:42:48 +08:00
math_0shot_gen_11c4b5.py
[Update] Update dataset configuration with no max_out_len ( #1754 )
2024-12-11 18:20:29 +08:00
math_0shot_gen_393424.py
[Feature] Support import configs/models/summarizers from whl ( #1376 )
2024-08-01 00:42:48 +08:00
math_0shot_llm_judge_gen_393424.py
[Feature] Add GaoKaoMath Dataset for Evaluation & MATH Model Eval Config ( #1589 )
2024-10-12 19:13:06 +08:00
math_0shot_llm_judge_v2_gen_31d777.py
[Update] Update MATH dataset with model judge ( #1711 )
2024-11-25 15:14:55 +08:00
math_4shot_base_gen_43d5b6.py
[Feature] Update MathBench & Math base model config ( #1550 )
2024-09-23 14:03:59 +08:00
math_4shot_base_gen_db136b.py
[Feature] Support import configs/models/summarizers from whl ( #1376 )
2024-08-01 00:42:48 +08:00
math_4shot_example_from_google_research.py
[Feature] Support import configs/models/summarizers from whl ( #1376 )
2024-08-01 00:42:48 +08:00
math_500_gen.py
[Feature] Math Verify with model post_processor ( #1881 )
2025-02-20 19:32:12 +08:00
math_agent_evaluatorv2_gen_0c1b4e.py
[Feature] Support import configs/models/summarizers from whl ( #1376 )
2024-08-01 00:42:48 +08:00
math_agent_gen_0c1b4e.py
[Feature] Support import configs/models/summarizers from whl ( #1376 )
2024-08-01 00:42:48 +08:00
math_agent_gen_861b4f.py
[Feature] Support import configs/models/summarizers from whl ( #1376 )
2024-08-01 00:42:48 +08:00
math_agent_gen_af2293.py
[Feature] Support import configs/models/summarizers from whl ( #1376 )
2024-08-01 00:42:48 +08:00
math_evaluatorv2_gen_2f4a71.py
[Feature] Support import configs/models/summarizers from whl ( #1376 )
2024-08-01 00:42:48 +08:00
math_evaluatorv2_gen_cecb31.py
[Feature] Support import configs/models/summarizers from whl ( #1376 )
2024-08-01 00:42:48 +08:00
math_gen_1ed9c2.py
[Feature] Support import configs/models/summarizers from whl ( #1376 )
2024-08-01 00:42:48 +08:00
math_gen_5e8458.py
[Feature] Support import configs/models/summarizers from whl ( #1376 )
2024-08-01 00:42:48 +08:00
math_gen_78ced2.py
[Feature] Support import configs/models/summarizers from whl ( #1376 )
2024-08-01 00:42:48 +08:00
math_gen_265cce.py
[Feature] Support import configs/models/summarizers from whl ( #1376 )
2024-08-01 00:42:48 +08:00
math_gen_943d32.py
[Feature] Support import configs/models/summarizers from whl ( #1376 )
2024-08-01 00:42:48 +08:00
math_gen_0957ff.py
[Feature] Support import configs/models/summarizers from whl ( #1376 )
2024-08-01 00:42:48 +08:00
math_gen_559593.py
[Feature] Support import configs/models/summarizers from whl ( #1376 )
2024-08-01 00:42:48 +08:00
math_gen_736506.py
[Feature] Support import configs/models/summarizers from whl ( #1376 )
2024-08-01 00:42:48 +08:00
math_gen.py
[Feature] Support import configs/models/summarizers from whl ( #1376 )
2024-08-01 00:42:48 +08:00
math_intern_evaluator_gen_265cce.py
[Feature] Support import configs/models/summarizers from whl ( #1376 )
2024-08-01 00:42:48 +08:00
math_llm_judge.py
[Feature] Support import configs/models/summarizers from whl ( #1376 )
2024-08-01 00:42:48 +08:00
math_prm800k_500_0shot_cot_academic_gen.py
[Update] Academic bench llm judge update ( #1876 )
2025-02-24 15:45:24 +08:00
math_prm800k_500_0shot_cot_gen.py
[Update] Update OC academic 202412 ( #1771 )
2024-12-19 18:07:34 +08:00
math_prm800k_500_0shot_nocot_gen_b27274.py
[Update] Update Fullbench ( #1712 )
2024-11-26 14:26:55 +08:00
math_prm800k_500_0shot_nocot_genericllmeval_gen_63a000.py
[Feature] Update o1 evaluation with JudgeLLM ( #1795 )
2024-12-30 17:31:00 +08:00
math_prm800k_500_0shot_nocot_genericllmeval_xml_gen_63a000.py
[Refactor] Code refactoarization ( #1831 )
2025-01-20 19:17:38 +08:00
math_prm800k_500_0shot_nocot_llmjudge_gen_63a000.py
[Update] Update Skywork/Qwen-QwQ ( #1728 )
2024-12-05 19:30:43 +08:00
math_prm800k_500_gen.py
[Update] Add math prm 800k ( #1708 )
2024-11-21 21:29:43 +08:00
math_prm800k_500_llmverify_gen_6ff468.py
[Update] Support AIME-24 Evaluation for DeepSeek-R1 series ( #1888 )
2025-02-25 20:34:41 +08:00
math_prm800k_500_llmverify_repeat4_gen_97b203.py
[Update] Support OlympiadBench-Math/OmniMath/LiveMathBench-Hard ( #1899 )
2025-03-03 18:56:11 +08:00
README.md
[Feature] Support import configs/models/summarizers from whl ( #1376 )
2024-08-01 00:42:48 +08:00