[Update] Fix Hard Configs With General GPassK (#1906)

* support dataset repeat and g-pass compute for each evaluator
* fix pre-commit errors
* delete print
* delete gpassk_evaluator and fix potential errors
* change `repeat` to `n`
* fix `repeat` to `n` in openicl_eval
* update doc for multi-run and g-pass
* update latex equation in doc
* update eng doc for multi-run and g-pass
* update datasets.md
* update datasets.md
* fix multi-line equation
* fix multi-line equation
* fix multi-line equation
* fix multi-line equation
* fix multi-line equation
* fix multi-line equation
* fix multi-line equation in zh_cn user_guides
* modify pre-commit-zh-cn
* recover pre-commit and edit math expr in doc
* del [TIP]
* del cite tag in doc
* del extract_model param in livemathbench config
* fix livemathbench hard configs
2025-05-30 16:03:24 +08:00 · 2025-03-03 18:17:15 +08:00
commit f0809fe6f6
parent 6a573f671b
2 changed files with 4 additions and 16 deletions
--- a/opencompass/configs/datasets/livemathbench/livemathbench_hard_gen_353ae7.py
+++ b/opencompass/configs/datasets/livemathbench/livemathbench_hard_gen_353ae7.py
@@ -9,7 +9,7 @@ livemathbench_dataset = dict(
    type=LiveMathBenchDataset,
    path='',
    k=16,
-    replication=3,
+    n=48,
    dataset_splits=['hard'],
    dataset_languages=['cn', 'en'],
    cot=True,
@@ -37,13 +37,7 @@ livemathbench_dataset = dict(
        evaluator=dict(
            type=LiveMathBenchEvaluator,
            model_name='',
-            url=[],
-            use_extract_model=False,
-            extract_url=[],
-            extract_model_name='',
-            k=[4, 8, 16],
-            replication=3,
-            thresholds=[0.0, 0.25, 0.5, 0.75, 1.0]
+            url=[]
        )
    )
 )
--- a/opencompass/configs/datasets/livemathbench/livemathbench_hard_greedy_gen_353ae7.py
+++ b/opencompass/configs/datasets/livemathbench/livemathbench_hard_greedy_gen_353ae7.py
@@ -9,7 +9,7 @@ livemathbench_dataset = dict(
    type=LiveMathBenchDataset,
    path='',
    k=1,
-    replication=1,
+    n=1,
    dataset_splits=['hard'],
    dataset_languages=['cn', 'en'],
    cot=True,
@@ -37,13 +37,7 @@ livemathbench_dataset = dict(
        evaluator=dict(
            type=LiveMathBenchEvaluator,
            model_name='',
-            url=[],
-            use_extract_model=False,
-            extract_url=[],
-            extract_model_name='',
-            k=[1],
-            replication=1,
-            thresholds=[0.0]
+            url=[]
        )
    )
 )
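
Note on the `replication` to `n` change: in both files the new `n` equals the old `k` multiplied by the old `replication` (16 × 3 = 48 and 1 × 1 = 1). The commit does not state this rule, so the sketch below is only an assumption inferred from the two configs; `total_runs` and the dictionary keys are hypothetical names used for illustration and are not part of OpenCompass.

```python
# Hypothetical helper: assumes the new `n` is the total number of generations
# per question, i.e. the old `k` multiplied by the old `replication`.
def total_runs(k: int, replication: int) -> int:
    return k * replication


# Old (k, replication) pairs taken from the two configs in this diff.
old_configs = {
    "hard": dict(k=16, replication=3),
    "hard_greedy": dict(k=1, replication=1),
}

# Values that would replace `replication` under the assumed rule.
new_n = {name: total_runs(**cfg) for name, cfg in old_configs.items()}
print(new_n)  # {'hard': 48, 'hard_greedy': 1}
```

Under that assumption, a config that previously used a different `k`/`replication` pair can be migrated the same way, by setting `n` to their product.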