# LiveMathBench
## v202412
### Details of Datasets
| dataset | language | #single-choice | #multiple-choice | #fill-in-the-blank | #problem-solving |
| -- | -- | -- | -- | -- | -- |
| AMC | cn | 0 | 0 | 0 | 46 |
| AMC | en | 0 | 0 | 0 | 46 |
| CCEE | cn | 0 | 0 | 13 | 31 |
| CCEE | en | 0 | 0 | 13 | 31 |
| CNMO | cn | 0 | 0 | 0 | 18 |
| CNMO | en | 0 | 0 | 0 | 18 |
| WLPMC | cn | 0 | 0 | 0 | 11 |
| WLPMC | en | 0 | 0 | 0 | 11 |
### How to use
#### G-Pass@k
```python
from mmengine.config import read_base

with read_base():
    from opencompass.datasets.livemathbench_gen import livemathbench_datasets

livemathbench_datasets[0]['eval_cfg']['evaluator'].update(
    {
        'model_name': 'Qwen/Qwen2.5-72B-Instruct',
        'url': [
            'http://0.0.0.0:23333/v1',
            '...'
        ]  # set the URLs of the evaluation (judge) models
    }
)
livemathbench_datasets[0]['infer_cfg']['inferencer'].update(dict(
    max_out_len=32768  # for o1-like models, increase max_out_len
))
```
#### Greedy
```python
from mmengine.config import read_base

with read_base():
    from opencompass.datasets.livemathbench_greedy_gen import livemathbench_datasets

livemathbench_datasets[0]['eval_cfg']['evaluator'].update(
    {
        'model_name': 'Qwen/Qwen2.5-72B-Instruct',
        'url': [
            'http://0.0.0.0:23333/v1',
            '...'
        ]  # set the URLs of the evaluation (judge) models
    }
)
livemathbench_datasets[0]['infer_cfg']['inferencer'].update(dict(
    max_out_len=32768  # for o1-like models, increase max_out_len
))
```
### Output Samples
| dataset | version | metric | mode | Qwen2.5-72B-Instruct |
|----- | ----- | ----- | ----- | -----|
| LiveMathBench | 9befbf | G-Pass@16_0.0 | gen | xx.xx |
| LiveMathBench | caed8f | G-Pass@16_0.25 | gen | xx.xx |
| LiveMathBench | caed8f | G-Pass@16_0.5 | gen | xx.xx |
| LiveMathBench | caed8f | G-Pass@16_0.75 | gen | xx.xx |
| LiveMathBench | caed8f | G-Pass@16_1.0 | gen | xx.xx |
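The `G-Pass@16_τ` metrics above report, for each problem, the probability that at least `⌈τ·k⌉` of `k` samples drawn from `n` generations are correct, averaged over problems. The sketch below is a minimal, hedged illustration of that hypergeometric formulation for a single problem; the function name, signature, and the edge-case handling for `τ = 0.0` (which is conventionally read as plain Pass@k) are assumptions, not part of OpenCompass's API.

```python
from math import ceil, comb


def g_pass_at_k(n: int, c: int, k: int, tau: float) -> float:
    """Single-problem G-Pass@k_tau.

    n:   total generations sampled for the problem
    c:   number of correct generations among the n
    k:   subset size drawn without replacement
    tau: required fraction of correct samples in the subset
    """
    # tau = 0.0 is treated as "at least one correct", i.e. Pass@k.
    m = max(1, ceil(tau * k))
    # Hypergeometric tail: P(at least m correct in a draw of k from n).
    return sum(
        comb(c, j) * comb(n - c, k - j) / comb(n, k)
        for j in range(m, min(c, k) + 1)
    )
```

For example, with all 16 of 16 generations correct the metric is 1.0 at every threshold, while a problem with no correct generations contributes 0.0.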