OpenCompass/opencompass/configs/datasets/livereasonbench
Songyang Zhang 98435dd98e
[Feature] Update o1 evaluation with JudgeLLM (#1795)
* Update Generic LLM Evaluator

* Update o1 style evaluator
2024-12-30 17:31:00 +08:00
livereasonbench_gen_f990de.py [Update] Update Skywork/Qwen-QwQ (#1728) 2024-12-05 19:30:43 +08:00
livereasonbench_gen.py [Update] Update Skywork/Qwen-QwQ (#1728) 2024-12-05 19:30:43 +08:00
livereasonbench_genericllmeval_gen_f990de.py [Feature] Update o1 evaluation with JudgeLLM (#1795) 2024-12-30 17:31:00 +08:00