[Update] Fix Hard Configs With General GPassK (#1906)

* support dataset repeat and g-pass compute for each evaluator

* fix pre-commit errors

* delete print

* delete gpassk_evaluator and fix potential errors

* change `repeat` to `n`

* fix `repeat` to `n` in openicl_eval

* update doc for multi-run and g-pass

* update latex equation in doc

* update eng doc for multi-run and g-pass

* update datasets.md

* update datasets.md

* fix multi-line equation

* fix multi-line equation

* fix multi-line equation

* fix multi-line equation

* fix multi-line equation

* fix multi-line equation

* fix multi-line equation in zh_cn user_guides

* modify pre-commit-zh-cn

* recover pre-commit and edit math expr in doc

* del [TIP]

* del cite tag in doc

* del extract_model param in livemathbench config

* fix livemathbench hard configs
Junnan Liu 2025-03-03 18:17:15 +08:00 committed by GitHub
parent 6a573f671b
commit f0809fe6f6
2 changed files with 4 additions and 16 deletions


@@ -9,7 +9,7 @@ livemathbench_dataset = dict(
     type=LiveMathBenchDataset,
     path='',
     k=16,
-    replication=3,
+    n=48,
     dataset_splits=['hard'],
     dataset_languages=['cn', 'en'],
     cot=True,
@@ -37,13 +37,7 @@ livemathbench_dataset = dict(
         evaluator=dict(
             type=LiveMathBenchEvaluator,
             model_name='',
-            url=[],
-            use_extract_model=False,
-            extract_url=[],
-            extract_model_name='',
-            k=[4, 8, 16],
-            replication=3,
-            thresholds=[0.0, 0.25, 0.5, 0.75, 1.0]
+            url=[]
         )
     )
 )
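
The hunk above replaces `replication=3` with `n=48` while keeping `k=16`, which is consistent with `n` being the total number of runs per problem, i.e. `k * replication`. The commit does not state this rule, so it is an inference from the values in the diff. A minimal sketch of that mapping, where the helper name `total_runs` is hypothetical and not part of the repository:

```python
def total_runs(k: int, replication: int) -> int:
    """Translate the removed (k, replication) pair into a single run count n.

    Assumption: n = k * replication, inferred from the diff (16 * 3 = 48).
    """
    if k < 1 or replication < 1:
        raise ValueError("k and replication must both be >= 1")
    return k * replication


# Values taken from the hunk above: k=16, replication=3 becomes n=48.
n = total_runs(16, 3)
```

Under the same assumption, a config with `k=1` and `replication=1` maps to `n=1`.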


@@ -9,7 +9,7 @@ livemathbench_dataset = dict(
     type=LiveMathBenchDataset,
     path='',
     k=1,
-    replication=1,
+    n=1,
     dataset_splits=['hard'],
     dataset_languages=['cn', 'en'],
     cot=True,
@@ -37,13 +37,7 @@ livemathbench_dataset = dict(
         evaluator=dict(
             type=LiveMathBenchEvaluator,
             model_name='',
-            url=[],
-            use_extract_model=False,
-            extract_url=[],
-            extract_model_name='',
-            k=[1],
-            replication=1,
-            thresholds=[0.0]
+            url=[]
         )
     )
 )
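
For context on what `n`, `k` and the removed `thresholds` feed into, here is a rough sketch of a G-Pass@k computation. It assumes the hypergeometric definition: the probability that at least `ceil(tau * k)` of `k` runs, drawn without replacement from `n` runs of which `c` are correct, are correct. The function name, the signature, and the choice to treat `tau = 0` as "at least one correct" are assumptions for illustration and are not taken from this commit.

```python
from math import ceil, comb


def g_pass_at_k(n: int, c: int, k: int, tau: float) -> float:
    """Sketch of G-Pass@k for one problem (assumed hypergeometric form).

    n: total runs, c: correct runs, k: runs drawn, tau: threshold in [0, 1].
    """
    if not (0 <= c <= n and 1 <= k <= n and 0.0 <= tau <= 1.0):
        raise ValueError("require 0 <= c <= n, 1 <= k <= n, 0 <= tau <= 1")
    # Assumption: tau = 0 still requires one correct run (reduces to pass@k).
    need = max(1, ceil(tau * k))
    # Count the draws of size k that contain at least `need` correct runs.
    favourable = sum(
        comb(c, j) * comb(n - c, k - j) for j in range(need, min(c, k) + 1)
    )
    return favourable / comb(n, k)


# With the values from the first config (n=48, k=16), a problem solved in
# every run scores 1.0 at the strictest threshold.
score = g_pass_at_k(n=48, c=48, k=16, tau=1.0)
```

Averaging this quantity over all problems in the split would give the dataset-level metric, again under the stated assumptions.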