OpenCompass/opencompass/datasets/livereasonbench
Songyang Zhang 8fdb72f567
[Update] Update o1 eval prompt (#1806)
* Update XML prediction post-process

* Update LiveMathBench

* Update New O1 Evaluation
2025-01-07 00:14:32 +08:00
__init__.py [Update] Update Skywork/Qwen-QwQ (#1728) 2024-12-05 19:30:43 +08:00
livereasonbench.py [Update] Update o1 eval prompt (#1806) 2025-01-07 00:14:32 +08:00