mirror of https://github.com/open-compass/opencompass.git, synced 2025-05-30 16:03:24 +08:00
[Update] Fix Hard Configs With General GPassK (#1906)
* support dataset repeat and g-pass compute for each evaluator
* fix pre-commit errors
* delete print
* delete gpassk_evaluator and fix potential errors
* change `repeat` to `n`
* fix `repeat` to `n` in openicl_eval
* update doc for multi-run and g-pass
* update latex equation in doc
* update eng doc for multi-run and g-pass
* update datasets.md
* update datasets.md
* fix multi-line equation
* fix multi-line equation
* fix multi-line equation
* fix multi-line equation
* fix multi-line equation
* fix multi-line equation
* fix multi-line equation in zh_cn user_guides
* modify pre-commit-zh-cn
* recover pre-commit and edit math expr in doc
* del [TIP]
* del cite tag in doc
* del extract_model param in livemathbench config
* fix livemathbench hard configs
parent 6a573f671b
commit f0809fe6f6
@@ -9,7 +9,7 @@ livemathbench_dataset = dict(
     type=LiveMathBenchDataset,
     path='',
     k=16,
-    replication=3,
+    n=48,
     dataset_splits=['hard'],
     dataset_languages=['cn', 'en'],
     cot=True,
@@ -37,13 +37,7 @@ livemathbench_dataset = dict(
         evaluator=dict(
             type=LiveMathBenchEvaluator,
             model_name='',
-            url=[],
-            use_extract_model=False,
-            extract_url=[],
-            extract_model_name='',
-            k=[4, 8, 16],
-            replication=3,
-            thresholds=[0.0, 0.25, 0.5, 0.75, 1.0]
+            url=[]
         )
     )
 )
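The two hunks above can be read as a small config migration: the dataset-level `replication` field gives way to `n`, and the evaluator loses its extraction-model and G-Pass fields. The sketch below mirrors that change on plain dictionaries. The helper name `migrate_livemathbench_cfg` is hypothetical, the class names are replaced by strings so the snippet runs without OpenCompass installed, and the rule `n = k * replication` is an assumption inferred from the values in this commit (16 * 3 = 48), not something the diff states.

```python
from copy import deepcopy

# Evaluator keys deleted by the second hunk of the diff.
REMOVED_EVALUATOR_KEYS = (
    'use_extract_model', 'extract_url', 'extract_model_name',
    'k', 'replication', 'thresholds',
)


def migrate_livemathbench_cfg(dataset_cfg: dict, evaluator_cfg: dict):
    """Hypothetical helper: return new-style copies of both config dicts."""
    new_dataset = deepcopy(dataset_cfg)
    new_evaluator = deepcopy(evaluator_cfg)

    # Assumption: n is the total number of generations per problem,
    # i.e. k * replication (consistent with 16 * 3 = 48 in this commit).
    replication = new_dataset.pop('replication', 1)
    new_dataset['n'] = new_dataset['k'] * replication

    # Drop the evaluator fields that the commit removes.
    for key in REMOVED_EVALUATOR_KEYS:
        new_evaluator.pop(key, None)
    return new_dataset, new_evaluator


old_dataset = dict(
    type='LiveMathBenchDataset',  # string stand-in for the real class
    path='',
    k=16,
    replication=3,
    dataset_splits=['hard'],
    dataset_languages=['cn', 'en'],
    cot=True,
)
old_evaluator = dict(
    type='LiveMathBenchEvaluator',  # string stand-in for the real class
    model_name='',
    url=[],
    use_extract_model=False,
    extract_url=[],
    extract_model_name='',
    k=[4, 8, 16],
    replication=3,
    thresholds=[0.0, 0.25, 0.5, 0.75, 1.0],
)

new_dataset, new_evaluator = migrate_livemathbench_cfg(old_dataset, old_evaluator)
print(new_dataset['n'])       # total generations per problem
print(sorted(new_evaluator))  # remaining evaluator keys
```

The helper works on copies, so the old-style dictionaries stay available for comparison while configs are being ported.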
@@ -9,7 +9,7 @@ livemathbench_dataset = dict(
     type=LiveMathBenchDataset,
     path='',
     k=1,
-    replication=1,
+    n=1,
     dataset_splits=['hard'],
     dataset_languages=['cn', 'en'],
     cot=True,
@@ -37,13 +37,7 @@ livemathbench_dataset = dict(
         evaluator=dict(
             type=LiveMathBenchEvaluator,
             model_name='',
-            url=[],
-            use_extract_model=False,
-            extract_url=[],
-            extract_model_name='',
-            k=[1],
-            replication=1,
-            thresholds=[0.0]
+            url=[]
         )
     )
 )
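The commit message says G-Pass is now computed for each evaluator from the dataset-level `k` and `n`, which is why `k` and `thresholds` disappear from the evaluator config. As a rough illustration of what such a metric needs from `n` and `k`, the sketch below computes G-Pass@k for a single problem using the hypergeometric form commonly given for it: the probability that at least ceil(tau * k) of k samples drawn without replacement from n generations are correct, given c correct generations. The function name `g_pass_at_k`, the clamping of the minimum count to 1 (so that tau = 0 reduces to Pass@k), and the formula itself are assumptions about the metric, not code taken from this commit.

```python
from math import ceil, comb


def g_pass_at_k(n: int, c: int, k: int, tau: float) -> float:
    """Sketch of G-Pass@k for one problem (assumed definition).

    n: total generations, c: correct generations,
    k: sample size, tau: required fraction of correct samples.
    """
    if not 0 <= c <= n:
        raise ValueError('c must satisfy 0 <= c <= n')
    if not 1 <= k <= n:
        raise ValueError('k must satisfy 1 <= k <= n')

    # Minimum number of correct samples required; clamped to 1 (assumption)
    # so that tau = 0 behaves like ordinary Pass@k.
    m = max(ceil(tau * k), 1)

    # Hypergeometric tail: P(at least m correct among k drawn from n).
    favourable = sum(
        comb(c, j) * comb(n - c, k - j)
        for j in range(m, min(c, k) + 1)
    )
    return favourable / comb(n, k)


# With the k=1, n=1 config, the score is simply whether the one answer is right.
print(g_pass_at_k(n=1, c=1, k=1, tau=0.0))
print(g_pass_at_k(n=1, c=0, k=1, tau=0.0))
# With the k=16, n=48 config, thresholds select progressively stricter tails.
for tau in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(tau, g_pass_at_k(n=48, c=24, k=16, tau=tau))
```

Under this definition `n` must be at least `k`, which is consistent with both configs in the diff (48 >= 16 and 1 >= 1).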