OpenCompass

mirror of https://github.com/open-compass/opencompass.git synced 2025-05-30 16:03:24 +08:00

Junnan Liu 22a33d8759 [Update] Update LiveMathBench Hard Configs (#1826)	2025-02-25 17:24:36 +08:00
* support G-Pass@k and livemathbench
* fix bugs
* fix comments of GPassKEvaluator
* update saved details of GPassKEvaluator
* update saved details of GPassKEvaluator
* fix eval api configs & update openai_api for ease of debugging
* update huggingface path
* fix method name of G-Pass@k
* fix default value of eval_model_name
* refactor G-Pass@k evaluator
* log generation params for each backend
* fix evaluation resume
* add notimplementerror
* update livemathbench-hard configs
* remove max_out_len from livemathbench_hard_greedy_gen_9befbf.py
* remove max_out_len from livemathbench_hard_gen_9befbf.py
* rename livemathbench_hard_gen_9befbf.py to livemathbench_hard_gen_353ae7.py
* rename livemathbench_hard_greedy_gen_9befbf.py to livemathbench_hard_greedy_gen_353ae7.py
* update livemathbench_gen_9befbf.py
* remove whitespace
* upload livemathbench hard configs
dataset_collections	[Doc] Update Readme (#1439)	2024-08-22 14:48:45 +08:00
datasets	[Update] Update LiveMathBench Hard Configs (#1826)	2025-02-25 17:24:36 +08:00
models	[Update] Academic bench llm judge update (#1876)	2025-02-24 15:45:24 +08:00
summarizers	[Feature] Support OlympiadBench Benchmark (#1841)	2025-01-24 10:00:01 +08:00