[Docs] Update dataset docs (#19)

* [Docs] Update dataset docs

Tong Gao 2023-07-06 15:47:09 +08:00 committed by GitHub
parent d1025c3223
commit 30a988a620
7 changed files with 28 additions and 15 deletions

@@ -37,7 +37,6 @@ OpenCompass is a one-stop platform for large model evaluation, aiming to provide
 We provide [OpenCompass Leaderboard](https://opencompass.org.cn/rank) for community to rank all public models and API models. If you would like to join the evaluation, please provide the model repository URL or a standard API interface to the email address `opencompass@pjlab.org.cn`.
 [![image](https://github.com/InternLM/OpenCompass/assets/7881589/475b0c8e-28b8-43e9-b2fd-4dd558e22491)](https://opencompass.org.cn/rank)
 ## Dataset Support
@@ -289,7 +288,8 @@ git clone https://github.com/InternLM/opencompass opencompass
 cd opencompass
 pip install -e .
 # Download dataset to data/ folder
-# TODO: ....
+wget https://github.com/InternLM/opencompass/releases/download/0.1.0/OpenCompassData.zip
+unzip OpenCompassData.zip
 ```
 ## Evaluation

@@ -290,7 +290,8 @@ git clone https://github.com/InternLM/opencompass opencompass
 cd opencompass
 pip install -e .
 # Download the datasets to data/
-# TODO: ....
+wget https://github.com/InternLM/opencompass/releases/download/0.1.0/OpenCompassData.zip
+unzip OpenCompassData.zip
 ```
 ## Evaluation

@@ -60,7 +60,7 @@ Here's a detailed step-by-step explanation of this case study:
 <details>
 <summary>prepare datasets</summary>
-The SiQA and PiQA benchmarks can be automatically downloaded through their respective links here and here, so no manual downloading is required here. However, some other datasets may require manual downloads. Please refer to the documentation [Prepare Datasets](docs/zh_cn/user_guides/dataset_prepare.md) for more information.
+The SiQA and PiQA benchmarks can be automatically downloaded through their respective links here and here, so no manual downloading is required here. However, some other datasets may require manual downloads. Please refer to the documentation [Prepare Datasets](./user_guides/dataset_prepare.md) for more information.
 Create a '.py' configuration file and add the following content:

@@ -39,11 +39,17 @@ The datasets supported by OpenCompass mainly include two parts:
 [Huggingface Dataset](https://huggingface.co/datasets) provides a large number of datasets. OpenCompass has supported most of the datasets commonly used for performance comparison, please refer to `configs/dataset` for the specific list of supported datasets.
-2. OpenCompass Self-built Datasets
+2. Third-party Datasets
-In addition to supporting Huggingface's existing datasets, OpenCompass also provides some self-built CN datasets. In the future, a dataset-related link will be provided for users to download and use. Following the instructions in the document to place the datasets uniformly in the `./data` directory can complete dataset preparation.
+In addition to supporting Huggingface's existing datasets, OpenCompass also provides some third-party and self-built datasets. Run the following commands to download the datasets and place them in the `./data` directory to complete dataset preparation.
-It is important to note that the Repo not only contains self-built datasets, but also includes some HF-supported datasets for testing convenience.
+```bash
+# Run in the OpenCompass directory
+wget https://github.com/InternLM/opencompass/releases/download/0.1.0/OpenCompassData.zip
+unzip OpenCompassData.zip
+```
+Note that the Repo not only contains self-built datasets, but also includes some HF-supported datasets for testing convenience.
 ## Dataset Selection

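@@ -55,7 +55,7 @@ python run.py configs/eval_llama_7b.py --debug
 <details>
 <summary>Prepare datasets and their configuration</summary>
-Because [siqa](https://huggingface.co/datasets/siqa) and [piqa](https://huggingface.co/datasets/piqa) support automatic download, there is no need to download the datasets manually here. Some datasets, however, may require manual download; see the documentation [Prepare Datasets](docs/zh_cn/user_guides/dataset_prepare.md) for details.
+Because [siqa](https://huggingface.co/datasets/siqa) and [piqa](https://huggingface.co/datasets/piqa) support automatic download, there is no need to download the datasets manually here. Some datasets, however, may require manual download; see the documentation [Prepare Datasets](./user_guides/dataset_prepare.md) for details.
 Create a '.py' configuration file and add the following content: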

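@@ -39,9 +39,15 @@ The datasets supported by OpenCompass mainly include two parts:
 [Huggingface Dataset](https://huggingface.co/datasets) provides a large number of datasets. OpenCompass already supports most of the datasets commonly used for performance comparison; see `configs/dataset` for the specific list of supported datasets.
-2. OpenCompass Self-built Datasets
+2. Third-party Datasets
-In addition to supporting Huggingface's existing datasets, OpenCompass also provides some self-built CN datasets. A dataset-related link will be provided in the future for users to download and use. Placing the datasets uniformly in the `./data` directory as instructed in the document completes dataset preparation.
+In addition to supporting Huggingface's existing datasets, OpenCompass also provides some third-party datasets and self-built CN datasets. Run the following commands to download the datasets and place them uniformly in the `./data` directory to complete dataset preparation.
+```bash
+# Run in the OpenCompass directory
+wget https://github.com/InternLM/opencompass/releases/download/0.1.0/OpenCompassData.zip
+unzip OpenCompassData.zip
+```
 Note that the Repo contains not only self-built datasets but also, for convenience, some datasets already supported by HF, to make testing easier.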
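
The `wget` and `unzip` step that this commit adds to the docs could also be scripted in Python, for example on a machine without `wget`. The following is a minimal sketch, not part of OpenCompass: the helper names `extract_datasets` and `prepare_datasets` are hypothetical, and the code assumes the archive unpacks into a top-level `data/` folder, as the `./data` wording in the docs suggests.

```python
import urllib.request
import zipfile
from pathlib import Path

# Release URL copied from the commands in the docs.
DATA_URL = (
    "https://github.com/InternLM/opencompass/releases/download/"
    "0.1.0/OpenCompassData.zip"
)


def extract_datasets(archive: Path, root: Path) -> Path:
    """Unpack `archive` into `root` and return the resulting `data` directory.

    Assumption: the archive contains a top-level `data/` folder.
    """
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(root)
    data_dir = root / "data"
    if not data_dir.is_dir():
        raise FileNotFoundError(f"{data_dir} was not created by {archive.name}")
    return data_dir


def prepare_datasets(root: Path = Path("."), url: str = DATA_URL) -> Path:
    """Download the archive into `root` if it is missing, then extract it."""
    archive = root / "OpenCompassData.zip"
    if not archive.exists():
        # Hypothetical replacement for the `wget` call in the docs.
        urllib.request.urlretrieve(url, archive)
    return extract_datasets(archive, root)
```

Calling `prepare_datasets()` from the OpenCompass directory would mirror the two shell commands; an archive that is already present is reused rather than downloaded again.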