Description
A math dataset composed of problems from AIME 2024 (the 2024 American Invitational Mathematics Examination).
Performance
Qwen2.5-Math-72B-Instruct | Qwen2.5-Math-7B-Instruct | Qwen2-Math-7B-Instruct | Qwen2-Math-1.5B-Instruct | internlm2-math-7b |
---|---|---|---|---|
20.00 | 16.67 | 16.67 | 13.33 | 3.33 |
Qwen2.5-72B-Instruct | Qwen2.5-7B-Instruct | internlm2_5-7b-chat |
---|---|---|
31.25 | 26.44 | 9.13 |
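
As a rough illustration of how a score like those in the first table can arise, the sketch below computes exact-match accuracy as a percentage. It relies on two assumptions that this README does not state: that the metric is the percentage of problems answered correctly, and that answers are compared as integers (AIME answers are integers from 0 to 999). The first table is consistent with a 30-problem set under that reading, for example 6 of 30 correct gives 20.00. The helper names `normalize_aime_answer` and `accuracy_percent` are hypothetical and are not part of OpenCompass.

```python
from typing import Optional, Sequence


def normalize_aime_answer(text: str) -> Optional[int]:
    """Parse a predicted answer string into an integer in [0, 999].

    Returns None when the text is not a plain non-negative integer
    or falls outside the valid AIME answer range.
    """
    cleaned = text.strip().replace(",", "")
    if not cleaned.isdigit():
        return None
    value = int(cleaned)
    return value if 0 <= value <= 999 else None


def accuracy_percent(predictions: Sequence[str], references: Sequence[int]) -> float:
    """Exact-match accuracy in percent, rounded to two decimals (assumed metric)."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("at least one problem is required")
    correct = 0
    for prediction, reference in zip(predictions, references):
        parsed = normalize_aime_answer(prediction)
        # An unparseable prediction counts as wrong.
        if parsed is not None and parsed == reference:
            correct += 1
    return round(100.0 * correct / len(references), 2)


if __name__ == "__main__":
    # Toy data, not real model outputs: two of three answers match.
    print(accuracy_percent(["073", "5", "no answer"], [73, 5, 9]))
```

Leading zeros are ignored on purpose, so a prediction of `073` matches a reference of 73. An evaluator that extracts answers from free-form model output would need an extraction step before this comparison.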