[Docs] update readme (#165)

Songyang Zhang 2023-08-08 12:49:04 +08:00 committed by GitHub
parent 6ca2be6626
commit 5b80d83866
2 changed files with 38 additions and 22 deletions

@ -21,11 +21,13 @@ English | [简体中文](README_zh-CN.md)
👋 join us on <a href="https://twitter.com/intern_lm" target="_blank">Twitter</a>, <a href="https://discord.gg/xa29JuW87d" target="_blank">Discord</a> and <a href="https://r.vansin.top/?r=internwx" target="_blank">WeChat</a>
</p>
Welcome to **OpenCompass**!
## 🧭 Welcome
to **OpenCompass**!
Just like a compass guides us on our journey, OpenCompass will guide you through the complex landscape of evaluating large language models. With its powerful algorithms and intuitive interface, OpenCompass makes it easy to assess the quality and effectiveness of your NLP models.
## News
## 🚀 What's New <a><img width="35" height="20" src="https://user-images.githubusercontent.com/12782558/212848161-5e783dd6-11e8-4fe0-bbba-39ffb77730be.png"></a>
- **\[2023.08.07\]** We have added a [script](tools/eval_mmbench.py) for users to evaluate the inference results of [MMBench](https://opencompass.org.cn/MMBench)-dev. 🔥🔥🔥.
- **\[2023.08.05\]** We now support [GPT-4](https://openai.com/gpt-4) and [Qwen-7B](https://github.com/QwenLM/Qwen-7B)! See our [leaderboard](https://opencompass.org.cn/leaderboard-llm) for more results. More models are welcome to join OpenCompass. 🔥🔥🔥.
@ -34,7 +36,7 @@ Just like a compass guides us on our journey, OpenCompass will guide you through
- **\[2023.07.19\]** We now support [Llama-2](https://ai.meta.com/llama/)! Its performance report will be available soon. \[[Doc](./docs/en/get_started.md#Installation)\] 🔥🔥🔥.
- **\[2023.07.13\]** We released [MMBench](https://opencompass.org.cn/MMBench), a meticulously curated dataset for comprehensively evaluating the different abilities of multimodal models 🔥🔥🔥.
## Introduction
## Introduction
OpenCompass is a one-stop platform for large model evaluation, aiming to provide a fair, open, and reproducible benchmark. Its main features include:
@ -48,13 +50,13 @@ OpenCompass is a one-stop platform for large model evaluation, aiming to provide
- **Experiment management and reporting mechanism**: Use config files to fully record each experiment, with support for real-time reporting of results.
## Leaderboard
## 📊 Leaderboard
We provide the [OpenCompass Leaderboard](https://opencompass.org.cn/rank) for the community to rank all public models and API models. If you would like to join the evaluation, please send the model repository URL or a standard API interface to `opencompass@pjlab.org.cn`.
[![image](https://github.com/InternLM/opencompass/assets/13503330/80c5a42c-ddf0-4c6f-b39e-c175711ac381)](https://opencompass.org.cn/rank)
<p align="right"><a href="#top">🔝Back to top</a></p>
## Dataset Support
## 📖 Dataset Support
<table align="center">
<tbody>
@ -241,7 +243,9 @@ We provide [OpenCompass Leaderbaord](https://opencompass.org.cn/rank) for commun
</tbody>
</table>
## Model Support
<p align="right"><a href="#top">🔝Back to top</a></p>
## 📖 Model Support
<table align="center">
<tbody>
@ -293,7 +297,7 @@ We provide [OpenCompass Leaderbaord](https://opencompass.org.cn/rank) for commun
</tbody>
</table>
## Installation
## 🛠️ Installation
Below are the steps for quick installation and dataset preparation.
@ -310,19 +314,21 @@ unzip OpenCompassData.zip
Some third-party features, such as HumanEval and Llama, may require additional steps to work properly. For detailed steps, please refer to the [Installation Guide](https://opencompass.readthedocs.io/en/latest/get_started.html).
## Evaluation
<p align="right"><a href="#top">🔝Back to top</a></p>
## 🏗️ Evaluation
Make sure you have installed OpenCompass correctly and prepared your datasets according to the above steps. Please read the [Quick Start](https://opencompass.readthedocs.io/en/latest/get_started.html#quick-start) to learn how to run an evaluation task.
For more tutorials, please check our [Documentation](https://opencompass.readthedocs.io/en/latest/index.html).
## Acknowledgements
## 🤝 Acknowledgements
Some code in this project is adapted from [OpenICL](https://github.com/Shark-NLP/OpenICL).
Some datasets and prompt implementations are modified from [chain-of-thought-hub](https://github.com/FranxYao/chain-of-thought-hub) and [instruct-eval](https://github.com/declare-lab/instruct-eval).
## Citation
## 🖊️ Citation
```bibtex
@misc{2023opencompass,
@ -332,3 +338,5 @@ Some datasets and prompt implementations are modified from [chain-of-thought-hub
year={2023}
}
```
<p align="right"><a href="#top">🔝Back to top</a></p>

@ -21,11 +21,13 @@
👋 Join us on <a href="https://twitter.com/intern_lm" target="_blank">Twitter</a>, <a href="https://discord.gg/xa29JuW87d" target="_blank">Discord</a>, and <a href="https://r.vansin.top/?r=internwx" target="_blank">WeChat</a>
</p>
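Welcome to OpenCompass!
## 🧭 Welcome
to **OpenCompass**!
Just as a compass guides us on our journey, we hope OpenCompass can help you find your way through the fog of evaluating large language models. OpenCompass offers a rich set of algorithms and features, and we hope it helps the community evaluate the performance of NLP models fairly, comprehensively, and more conveniently.
## Updates
## 🚀 What's New <a><img width="35" height="20" src="https://user-images.githubusercontent.com/12782558/212848161-5e783dd6-11e8-4fe0-bbba-39ffb77730be.png"></a>
- **\[2023.08.07\]** Added an [MMBench evaluation script](tools/eval_mmbench.py) so that users can obtain test results on [MMBench](https://opencompass.org.cn/MMBench)-dev themselves. 🔥🔥🔥.
- **\[2023.08.05\]** Evaluation results for [GPT-4](https://openai.com/gpt-4) and [Qwen-7B](https://github.com/QwenLM/Qwen-7B) are now available on the OpenCompass [LLM leaderboard](https://opencompass.org.cn/leaderboard-llm)! 🔥🔥🔥.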
@ -34,7 +36,7 @@
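- **\[2023.07.19\]** Added [Llama-2](https://ai.meta.com/llama/)! We will publish its evaluation results soon. \[[Doc](./docs/zh_cn/get_started.md#安装)\] 🔥🔥🔥.
- **\[2023.07.13\]** Released [MMBench](https://opencompass.org.cn/MMBench), a meticulously curated dataset for evaluating the all-round abilities of multimodal models 🔥🔥🔥.
## Introduction
## Introduction
OpenCompass is a one-stop platform for large model evaluation. Its main features are as follows:
@ -50,13 +52,13 @@ OpenCompass is a one-stop platform for large model evaluation. Its main features are as follows
- **Flexible extension**: Want to add new models or datasets, customize a more advanced task-partitioning strategy, or even plug in a new cluster management system? Everything in OpenCompass can be easily extended!
## Leaderboard
## 📊 Leaderboard
We will progressively publish detailed performance leaderboards for open-source models and API models; see the [OpenCompass Leaderboard](https://opencompass.org.cn/rank). To join the evaluation, please send the model repository URL or a standard API interface to `opencompass@pjlab.org.cn`.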
[![image](https://github.com/InternLM/opencompass/assets/13503330/76237116-a9dd-4207-abef-7ff73b89568a)](https://opencompass.org.cn/rank)
<p align="right"><a href="#top">🔝Back to top</a></p>
## Dataset Support
## 📖 Dataset Support
<table align="center">
<tbody>
@ -243,7 +245,9 @@ OpenCompass is a one-stop platform for large model evaluation. Its main features are as follows
</tbody>
</table>
## Model Support
<p align="right"><a href="#top">🔝Back to top</a></p>
## 📖 Model Support
<table align="center">
<tbody>
@ -293,7 +297,7 @@ OpenCompass is a one-stop platform for large model evaluation. Its main features are as follows
</tbody>
</table>
## Installation
## 🛠️ Installation
The steps for quick installation and dataset preparation are shown below.
@ -310,19 +314,21 @@ unzip OpenCompassData.zip
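Some third-party features, such as HumanEval and Llama, may require additional steps to work properly. For detailed steps, please refer to the [Installation Guide](https://opencompass.readthedocs.io/zh_CN/latest/get_started.html).
## Evaluation
<p align="right"><a href="#top">🔝Back to top</a></p>
## 🏗️ Evaluation
Once you have installed OpenCompass correctly and prepared the datasets by following the steps above, please read the [Quick Start](https://opencompass.readthedocs.io/zh_CN/latest/get_started.html#id3) to learn how to run an evaluation task.
For more tutorials, please see our [Documentation](https://opencompass.readthedocs.io/zh_CN/latest/index.html).
## Acknowledgements
## 🤝 Acknowledgements
Some code in this project is adapted from [OpenICL](https://github.com/Shark-NLP/OpenICL).
Some datasets and prompt implementations in this project are modified from [chain-of-thought-hub](https://github.com/FranxYao/chain-of-thought-hub) and [instruct-eval](https://github.com/declare-lab/instruct-eval).
## Citation
## 🖊️ Citation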
```bibtex
@misc{2023opencompass,
@ -332,3 +338,5 @@ unzip OpenCompassData.zip
year={2023}
}
```
<p align="right"><a href="#top">🔝Back to top</a></p>