From 7c4020bdb5c0d6d976e27606c7e27d7462317228 Mon Sep 17 00:00:00 2001
From: MaiziXiao
Date: Tue, 11 Mar 2025 09:18:28 +0000
Subject: [PATCH] Add Readme

---
 README.md         | 1 +
 README_zh-CN.md   | 1 +
 dataset-index.yml | 8 +++++---
 3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index 93c2a5fd..4a29f2b7 100644
--- a/README.md
+++ b/README.md
@@ -57,6 +57,7 @@ Just like a compass guides us on our journey, OpenCompass will guide you through
 
 ## 🚀 What's New
 
+- **\[2025.03.11\]** We have supported evaluation for `SuperGPQA` which is a great benchmark for measuring LLM knowledge ability 🔥🔥🔥
 - **\[2025.02.28\]** We have added a tutorial for `DeepSeek-R1` series model, please check [Evaluating Reasoning Model](docs/en/user_guides/deepseek_r1.md) for more details! 🔥🔥🔥
 - **\[2025.02.15\]** We have added two powerful evaluation tools: `GenericLLMEvaluator` for LLM-as-judge evaluations and `MATHEvaluator` for mathematical reasoning assessments. Check out the documentation for [LLM Judge](docs/en/advanced_guides/llm_judge.md) and [Math Evaluation](docs/en/advanced_guides/general_math.md) for more details! 🔥🔥🔥
 - **\[2025.01.16\]** We now support the [InternLM3-8B-Instruct](https://huggingface.co/internlm/internlm3-8b-instruct) model which has enhanced performance on reasoning and knowledge-intensive tasks.
diff --git a/README_zh-CN.md b/README_zh-CN.md
index 55c2faf5..b5e388fc 100644
--- a/README_zh-CN.md
+++ b/README_zh-CN.md
@@ -57,6 +57,7 @@
 
 ## 🚀 最新进展
 
+- **\[2025.03.11\]** 现已支持 `SuperGPQA` LLM知识能力评测
 - **\[2025.02.28\]** 我们为 `DeepSeek-R1` 系列模型添加了教程,请查看 [评估推理模型](docs/en/user_guides/deepseek_r1.md) 了解更多详情!🔥🔥🔥
 - **\[2025.02.15\]** 我们新增了两个实用的评测工具:用于LLM作为评判器的`GenericLLMEvaluator`和用于数学推理评估的`MATHEvaluator`。查看[LLM评判器](docs/zh_cn/advanced_guides/llm_judge.md)和[数学能力评测](docs/zh_cn/advanced_guides/general_math.md)文档了解更多详情!🔥🔥🔥
 - **\[2025.01.16\]** 我们现已支持 [InternLM3-8B-Instruct](https://huggingface.co/internlm/internlm3-8b-instruct) 模型,该模型在推理、知识类任务上取得同量级最优性能,欢迎尝试。
diff --git a/dataset-index.yml b/dataset-index.yml
index b8ec7041..f72e7362 100644
--- a/dataset-index.yml
+++ b/dataset-index.yml
@@ -734,6 +734,8 @@
     category: Understanding
     paper: https://arxiv.org/pdf/1808.08745
     configpath: opencompass/configs/datasets/Xsum
-
-
-
+- supergpqa:
+    name: SuperGPQA
+    category: Knowledge
+    paper: https://arxiv.org/pdf/2502.14739
+    configpath: opencompass/configs/datasets/supergpqa