diff --git a/.github/workflows/lint.yml b/.github/workflows/lint.yml
index bc6d36a7..106c2453 100644
--- a/.github/workflows/lint.yml
+++ b/.github/workflows/lint.yml
@@ -17,7 +17,7 @@ jobs:
           python-version: '3.10'
       - name: Install pre-commit hook
         run: |
-          pip install pre-commit==3.8.0 mmengine
+          pip install pre-commit==3.8.0 mmengine==0.10.5
           pre-commit install
       - name: Linting
         run: pre-commit run --all-files
diff --git a/README.md b/README.md
index 6d8cabe5..0cedafa2 100644
--- a/README.md
+++ b/README.md
@@ -57,6 +57,7 @@ Just like a compass guides us on our journey, OpenCompass will guide you through
 
 ## 🚀 What's New
 
+- **\[2025.01.16\]** We now support the [InternLM3-8B-Instruct](https://huggingface.co/internlm/internlm3-8b-instruct) model, which has enhanced performance on reasoning and knowledge-intensive tasks.
 - **\[2024.12.17\]** We have provided the evaluation script for the December [CompassAcademic](configs/eval_academic_leaderboard_202412.py), which allows users to easily reproduce the official evaluation results by configuring it.
 - **\[2024.11.14\]** OpenCompass now offers support for a sophisticated benchmark designed to evaluate complex reasoning skills — [MuSR](https://arxiv.org/pdf/2310.16049). Check out the [demo](configs/eval_musr.py) and give it a spin! 🔥🔥🔥
 - **\[2024.11.14\]** OpenCompass now supports the brand new long-context language model evaluation benchmark — [BABILong](https://arxiv.org/pdf/2406.10149). Have a look at the [demo](configs/eval_babilong.py) and give it a try! 🔥🔥🔥
diff --git a/README_zh-CN.md b/README_zh-CN.md
index 21c0d666..a01caee7 100644
--- a/README_zh-CN.md
+++ b/README_zh-CN.md
@@ -57,6 +57,7 @@
 
 ## 🚀 最新进展
 
+- **\[2025.01.16\]** 我们现已支持 [InternLM3-8B-Instruct](https://huggingface.co/internlm/internlm3-8b-instruct) 模型,该模型在推理、知识类任务上取得同量级最优性能,欢迎尝试。
 - **\[2024.12.17\]** 我们提供了12月CompassAcademic学术榜单评估脚本 [CompassAcademic](configs/eval_academic_leaderboard_202412.py),你可以通过简单地配置复现官方评测结果。
 - **\[2024.10.14\]** 现已支持OpenAI多语言问答数据集[MMMLU](https://huggingface.co/datasets/openai/MMMLU),欢迎尝试! 🔥🔥🔥
 - **\[2024.09.19\]** 现已支持[Qwen2.5](https://huggingface.co/Qwen)(0.5B to 72B) ,可以使用多种推理后端(huggingface/vllm/lmdeploy), 欢迎尝试! 🔥🔥🔥
diff --git a/opencompass/configs/models/hf_internlm/hf_internlm3_8b_instruct.py b/opencompass/configs/models/hf_internlm/hf_internlm3_8b_instruct.py
new file mode 100644
index 00000000..146618a8
--- /dev/null
+++ b/opencompass/configs/models/hf_internlm/hf_internlm3_8b_instruct.py
@@ -0,0 +1,12 @@
+from opencompass.models import HuggingFacewithChatTemplate
+
+models = [
+    dict(
+        type=HuggingFacewithChatTemplate,
+        abbr='internlm3-8b-instruct-hf',
+        path='internlm/internlm3-8b-instruct',
+        max_out_len=8192,
+        batch_size=8,
+        run_cfg=dict(num_gpus=1),
+    )
+]
diff --git a/opencompass/configs/models/hf_internlm/lmdeploy_internlm3_8b_instruct.py b/opencompass/configs/models/hf_internlm/lmdeploy_internlm3_8b_instruct.py
new file mode 100644
index 00000000..c905db44
--- /dev/null
+++ b/opencompass/configs/models/hf_internlm/lmdeploy_internlm3_8b_instruct.py
@@ -0,0 +1,17 @@
+from opencompass.models import TurboMindModelwithChatTemplate
+
+models = [
+    dict(
+        type=TurboMindModelwithChatTemplate,
+        abbr='internlm3-8b-instruct-turbomind',
+        path='internlm/internlm3-8b-instruct',
+        engine_config=dict(session_len=32768, max_batch_size=16, tp=1),
+        gen_config=dict(
+            top_k=1, temperature=1e-6, top_p=0.9, max_new_tokens=8192
+        ),
+        max_seq_len=32768,
+        max_out_len=8192,
+        batch_size=16,
+        run_cfg=dict(num_gpus=1),
+    )
+]
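
For readers who want to try the new model configs, here is a minimal sketch of an evaluation config that wires the LMDeploy file from this diff into a run. It assumes the usual mmengine read_base import pattern, assumes a gsm8k_gen dataset config exporting gsm8k_datasets ships with OpenCompass, and the file name eval_internlm3_8b_instruct.py is hypothetical, not part of this change.

# eval_internlm3_8b_instruct.py -- hypothetical example, not part of this diff
from mmengine.config import read_base

with read_base():
    # Model list added by this change (TurboMind/LMDeploy backend);
    # swap in hf_internlm3_8b_instruct for the plain HuggingFace backend.
    from opencompass.configs.models.hf_internlm.lmdeploy_internlm3_8b_instruct import \
        models as lmdeploy_internlm3_8b_instruct_model
    # Assumed dataset config; any dataset config bundled with OpenCompass works here.
    from opencompass.configs.datasets.gsm8k.gsm8k_gen import gsm8k_datasets

datasets = gsm8k_datasets
models = lmdeploy_internlm3_8b_instruct_model

# Then launch the evaluation, e.g.:  opencompass eval_internlm3_8b_instruct.py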