# Omni-Math

[Omni-Math](https://huggingface.co/datasets/KbsdJames/Omni-MATH) contains 4428 competition-level math problems. The problems are categorized into 33 (and potentially more) sub-domains and span 10 difficulty levels, so model performance can be analyzed by mathematical discipline and by level of complexity.

* Project Page: https://omni-math.github.io/
* GitHub Repo: https://github.com/KbsdJames/Omni-MATH
* Omni-Judge (open-source evaluator for this dataset): https://huggingface.co/KbsdJames/Omni-Judge
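
To explore the difficulty split described above, here is a minimal sketch that tallies problems per difficulty level. It assumes the Hugging Face `datasets` package is installed, that the dataset has a `test` split, and that each record carries a `difficulty` field; these names are assumptions and should be checked against the dataset card. `load_omni_math` and `count_by_difficulty` are hypothetical helpers, not part of OpenCompass.

```python
from collections import Counter


def load_omni_math(split="test"):
    """Load Omni-MATH from the Hugging Face Hub (needs network access).

    The split name is an assumption; check the dataset card.
    """
    from datasets import load_dataset  # pip install datasets

    return load_dataset("KbsdJames/Omni-MATH", split=split)


def count_by_difficulty(records, key="difficulty"):
    """Return {difficulty: number of problems}, ordered by difficulty.

    `records` is any iterable of dict-like rows; `key` is an assumed
    field name.
    """
    counts = Counter(record[key] for record in records)
    return dict(sorted(counts.items()))
```

Calling `count_by_difficulty(load_omni_math())` would then give one count per difficulty level, provided the assumed split and field names match the published dataset.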

## Omni-Judge

> Omni-Judge is an open-source mathematical evaluation model designed to assess whether a solution generated by a model is correct given a problem and a standard answer. 

You should first deploy the Omni-Judge server, for example:
```bash
set -x

lmdeploy serve api_server KbsdJames/Omni-Judge --server-port 8000 \
    --tp 1 \
    --cache-max-entry-count 0.9 \
    --log-level INFO
```

Then set the server URL in the OpenCompass config file:

```python
from mmengine.config import read_base

with read_base():
    from opencompass.configs.datasets.omni_math.omni_math_gen import omni_math_datasets


omni_math_dataset = omni_math_datasets[0]
# Point the evaluator at your own Omni-Judge server(s); the addresses below are examples.
omni_math_dataset['eval_cfg']['evaluator'].update(
    url=['http://172.30.8.45:8000',
         'http://172.30.16.113:8000'],
)
```
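
When several judge servers listen on the same port, the URL list can be built from host names and checked before an evaluation is launched. The sketch below is illustrative only: `build_judge_urls` and `is_reachable` are hypothetical helpers, not OpenCompass APIs, and the `/v1/models` path assumes the OpenAI-compatible route served by LMDeploy's `api_server`.

```python
import urllib.error
import urllib.request


def build_judge_urls(hosts, port=8000, scheme="http"):
    """Build base URLs such as 'http://host:8000' from a list of hosts."""
    return [f"{scheme}://{host}:{port}" for host in hosts]


def is_reachable(base_url, timeout=5.0):
    """Return True if GET <base_url>/v1/models answers with HTTP 200.

    The /v1/models route is an assumption about the judge server.
    """
    endpoint = base_url.rstrip("/") + "/v1/models"
    try:
        with urllib.request.urlopen(endpoint, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, DNS failure, or malformed URL.
        return False


# Example hosts taken from the config above; replace with your own.
judge_urls = build_judge_urls(["172.30.8.45", "172.30.16.113"])
```

The resulting list could be passed as `url=judge_urls` in the `update(...)` call. Filtering it with `is_reachable` first is one way to catch a server that failed to start.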

## Performance

| llama-3_1-8b-instruct | qwen-2_5-7b-instruct | InternLM3-8b-Instruct |
| -- | -- | -- |
| 15.18 | 29.97 | 32.75 |