# Omni-Math

[Omni-Math](https://huggingface.co/datasets/KbsdJames/Omni-MATH) contains 4428 competition-level problems. These problems are meticulously categorized into 33 (and potentially more) sub-domains and span 10 distinct difficulty levels, enabling a nuanced analysis of model performance across various mathematical disciplines and levels of complexity.

* Project Page: https://omni-math.github.io/
* Github Repo: https://github.com/KbsdJames/Omni-MATH
* Omni-Judge (open-source evaluator for this dataset): https://huggingface.co/KbsdJames/Omni-Judge

## Omni-Judge

> Omni-Judge is an open-source mathematical evaluation model designed to assess whether a solution generated by a model is correct, given a problem and a standard answer.

First, deploy the Omni-Judge server, for example:

```bash
set -x
lmdeploy serve api_server KbsdJames/Omni-Judge --server-port 8000 \
    --tp 1 \
    --cache-max-entry-count 0.9 \
    --log-level INFO
```

Then set the server URL(s) in your OpenCompass config file:

```python
from mmengine.config import read_base

with read_base():
    from opencompass.configs.datasets.omni_math.omni_math_gen import omni_math_datasets

omni_math_dataset = omni_math_datasets[0]
omni_math_dataset['eval_cfg']['evaluator'].update(
    url=['http://172.30.8.45:8000',
         'http://172.30.16.113:8000'],
)
```

## Performance

| llama-3_1-8b-instruct | qwen-2_5-7b-instruct | InternLM3-8b-Instruct |
| -- | -- | -- |
| 15.18 | 29.97 | 32.75 |
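As a rough illustration of what the evaluator sends to the deployed judge, the sketch below builds an OpenAI-style chat request for an Omni-Judge server. This is an assumption-laden example, not OpenCompass internals: the `/v1/chat/completions` endpoint assumes lmdeploy's OpenAI-compatible API, and the prompt layout is illustrative only (Omni-Judge's actual prompt template may differ).

```python
import json

def build_judge_request(problem, reference_answer, candidate_solution,
                        base_url="http://127.0.0.1:8000"):
    """Hypothetical helper: assemble an OpenAI-style chat request asking
    Omni-Judge whether a candidate solution matches the reference answer.
    The prompt wording below is illustrative, not the official template."""
    endpoint = f"{base_url}/v1/chat/completions"
    prompt = (
        f"Problem: {problem}\n"
        f"Reference answer: {reference_answer}\n"
        f"Candidate solution: {candidate_solution}\n"
        "Is the candidate solution correct? Answer yes or no with a brief reason."
    )
    payload = {
        "model": "KbsdJames/Omni-Judge",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.0,  # deterministic judging
    }
    return endpoint, json.dumps(payload)

# Build (but do not send) a request body for one problem.
endpoint, body = build_judge_request("1 + 1 = ?", "2", "The answer is 2.")
```

In a real run you would POST `body` to `endpoint` (e.g. with `requests.post`) for each problem/solution pair and parse the judge's verdict from the response.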