[Docs] Add config docs. (#3)

* [Docs] Add config docs.

* Update according to comments
Ma Zerun 2023-07-05 18:28:43 +08:00 committed by GitHub
parent 5840c7655c
commit ed76b5d066
2 changed files with 332 additions and 0 deletions


@@ -1,2 +1,172 @@
# Learn About Config
OpenCompass uses the OpenMMLab new-style configuration files. If you are already familiar with OpenMMLab-style
configuration files, you can refer directly to
[A Pure Python style Configuration File (Beta)](https://mmengine.readthedocs.io/en/latest/advanced_tutorials/config.html#a-pure-python-style-configuration-file-beta)
to understand the differences between the new-style and the original configuration files. If you have not
encountered OpenMMLab-style configuration files before, the simple example below explains how they are used.
Make sure you have installed the latest version of MMEngine (>= 0.8.1) to support the new-style
configuration files.
## Basic Format
OpenCompass configuration files are in Python format, following basic Python syntax. Each configuration item
is specified by defining variables. For example, when defining a model, we use the following configuration:
```python
# model_cfg.py
from opencompass.models import HuggingFaceCausalLM

models = [
    dict(
        type=HuggingFaceCausalLM,
        path='huggyllama/llama-7b',
        model_kwargs=dict(device_map='auto'),
        tokenizer_path='huggyllama/llama-65b',
        tokenizer_kwargs=dict(padding_side='left', truncation_side='left'),
        max_seq_len=2048,
        max_out_len=50,
        run_cfg=dict(num_gpus=8, num_procs=1),
    )
]
```
When reading a configuration file, use `Config.fromfile` from MMEngine to parse it:
```python
>>> from mmengine.config import Config
>>> cfg = Config.fromfile('./model_cfg.py')
>>> print(cfg.models[0].type)
<class 'opencompass.models.huggingface.HuggingFaceCausalLM'>
```
## Inheritance Mechanism
OpenCompass configuration files use Python's import mechanism for file inheritance. Note that when inheriting
configuration files, we need to use the `read_base` context manager.
```python
# inherit.py
from mmengine.config import read_base

with read_base():
    from .model_cfg import models  # Inherits the 'models' from model_cfg.py
```
Parse the configuration file using `Config.fromfile`:
```python
>>> from mmengine.config import Config
>>> cfg = Config.fromfile('./inherit.py')
>>> print(cfg.models[0].type)
<class 'opencompass.models.huggingface.HuggingFaceCausalLM'>
```
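Because inheritance is implemented as a plain Python import, the inherited variables remain ordinary
Python objects and can be adjusted after the `with read_base()` block. The sketch below is illustrative
(the file name and the specific override are not part of OpenCompass):

```python
# inherit_and_modify.py
from mmengine.config import read_base

with read_base():
    from .model_cfg import models  # Inherits the 'models' from model_cfg.py

# Inherited variables are regular Python objects, so an override is a plain
# assignment. Shortening the generation length here is purely illustrative.
models[0]['max_out_len'] = 100
```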
## Evaluation Configuration Example
```python
# configs/llama7b.py
from mmengine.config import read_base

with read_base():
    # Read the required dataset configurations directly from the preset dataset configurations
    from .datasets.piqa.piqa_ppl import piqa_datasets
    from .datasets.siqa.siqa_gen import siqa_datasets

# Concatenate the datasets to be evaluated into the datasets field
datasets = [*piqa_datasets, *siqa_datasets]

# Evaluate models supported by HuggingFace's `AutoModelForCausalLM` using `HuggingFaceCausalLM`
from opencompass.models import HuggingFaceCausalLM

models = [
    dict(
        type=HuggingFaceCausalLM,
        # Initialization parameters of `HuggingFaceCausalLM`
        path='huggyllama/llama-7b',
        tokenizer_path='huggyllama/llama-7b',
        tokenizer_kwargs=dict(padding_side='left', truncation_side='left'),
        max_seq_len=2048,
        # Parameters common to all models, not initialization parameters of `HuggingFaceCausalLM`
        abbr='llama-7b',          # Model abbreviation for result display
        max_out_len=100,          # Maximum number of generated tokens
        batch_size=16,            # Batch size
        run_cfg=dict(num_gpus=1), # Run configuration for specifying resource requirements
    )
]
```
## Dataset Configuration File Example
In the example configuration file above, we obtain the dataset-related configuration directly through
inheritance. Next, we use the PIQA dataset configuration file as an example to explain the meaning of each
field in a dataset configuration file. If you do not intend to modify the prompts used for model testing or
to add new datasets, you can skip this section.

The PIQA dataset [configuration file](https://github.com/InternLM/opencompass/blob/main/configs/datasets/piqa/piqa_ppl_1cf9f0.py) is shown below.
It evaluates based on perplexity (PPL) and does not use in-context learning.
```python
from opencompass.openicl.icl_prompt_template import PromptTemplate
from opencompass.openicl.icl_retriever import ZeroRetriever
from opencompass.openicl.icl_inferencer import PPLInferencer
from opencompass.openicl.icl_evaluator import AccEvaluator
from opencompass.datasets import HFDataset

# Reader configuration
# The loaded dataset usually organizes samples as dictionaries. The reader specifies the
# input fields used to form the prompt and the output field used as the answer in each sample.
piqa_reader_cfg = dict(
    input_columns=['goal', 'sol1', 'sol2'],
    output_column='label',
    test_split='validation',
)

# Inference configuration
piqa_infer_cfg = dict(
    # Prompt generation configuration
    prompt_template=dict(
        type=PromptTemplate,
        # Prompt template; the template format matches the inferencer type specified later.
        # Here, to calculate PPL, we need to specify a prompt template for each candidate answer.
        template={
            0: 'The following makes sense: \nQ: {goal}\nA: {sol1}\n',
            1: 'The following makes sense: \nQ: {goal}\nA: {sol2}\n'
        }),
    # In-context example configuration. `ZeroRetriever` means no in-context examples are used.
    retriever=dict(type=ZeroRetriever),
    # Inference method configuration
    # - PPLInferencer uses perplexity (PPL) to obtain answers
    # - GenInferencer uses the model's generated results to obtain answers
    inferencer=dict(type=PPLInferencer))

# Metric configuration, using accuracy as the evaluation metric
piqa_eval_cfg = dict(evaluator=dict(type=AccEvaluator))

# Dataset configuration; all the variables above are parameters of this configuration.
# It is a list that specifies the configuration of each evaluation subset of a dataset.
piqa_datasets = [
    dict(
        type=HFDataset,
        path='piqa',
        reader_cfg=piqa_reader_cfg,
        infer_cfg=piqa_infer_cfg,
        eval_cfg=piqa_eval_cfg)
]
```
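To make the template concrete, here is an illustrative sketch of how one sample flows through this
configuration. The field values below are shortened and invented for illustration; only the field names come
from the reader configuration above. For PPL evaluation, each candidate answer gets its own rendered prompt,
and the candidate whose prompt receives the lowest perplexity is taken as the prediction.

```python
# Illustrative only: a PIQA-style sample as organized by the reader configuration.
# 'goal', 'sol1' and 'sol2' fill the template placeholders; 'label' is the answer index.
sample = dict(
    goal='To make a hard boiled egg,',
    sol1='boil the egg in water for about 10 minutes.',
    sol2='freeze the egg overnight.',
    label=0,
)

# Rendering template 0 produces the prompt scored for candidate answer 0.
prompt_0 = 'The following makes sense: \nQ: {goal}\nA: {sol1}\n'.format(**sample)
print(prompt_0)
# The following makes sense:
# Q: To make a hard boiled egg,
# A: boil the egg in water for about 10 minutes.
```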
For details on the **prompt generation configuration**, refer to [Prompt Template](../prompt/prompt_template.md).
## Advanced Evaluation Configuration
In OpenCompass, we support configuration options such as the task partitioner (Partitioner) and the runner
backend (Runner) to make more flexible and efficient use of computational resources.

By default, inference tasks are partitioned by size; you can specify the sample-count threshold for task
partitioning with `--max-partition-size` when starting a task. Inference and evaluation tasks also run on
local resources by default; if you want to use Slurm cluster resources instead, specify the Slurm runner
backend with the `--slurm` and `--partition` parameters when starting the task.

Furthermore, if these options do not meet your requirements for task partitioning and runner configuration,
you can provide more detailed settings in the configuration file, as the sketch below illustrates. Please
refer to [Efficient Evaluation](./evaluation.md) for more information.
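As a rough sketch of what such a file-level configuration can look like: the class names below exist in
OpenCompass, but the parameter values are illustrative, and the full option list lives in
[Efficient Evaluation](./evaluation.md).

```python
from opencompass.partitioners import SizePartitioner
from opencompass.runners import SlurmRunner
from opencompass.tasks import OpenICLInferTask

infer = dict(
    # Split inference into tasks of roughly this many samples each (illustrative value).
    partitioner=dict(type=SizePartitioner, max_task_size=2000),
    # Submit the resulting tasks to a Slurm cluster instead of running locally.
    runner=dict(
        type=SlurmRunner,
        partition='llm',     # hypothetical Slurm partition name
        max_num_workers=32,  # upper bound on concurrently running tasks
        task=dict(type=OpenICLInferTask),
    ),
)
```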


@@ -1,2 +1,164 @@
# Learn About Config
OpenCompass uses the OpenMMLab new-style configuration files. If you are already familiar with
OpenMMLab-style configuration files, you can read
[A Pure Python style Configuration File (Beta)](https://mmengine.readthedocs.io/zh_CN/latest/advanced_tutorials/config.html#python-beta)
to understand the differences between the new-style and the original configuration files. If you have not
encountered OpenMMLab-style configuration files before, the simple example below explains how configuration
files are used. Make sure you have installed the latest version of MMEngine (>= 0.8.1) to support the
new-style configuration files.
## Basic Format
OpenCompass configuration files are in Python format and follow basic Python syntax, with each configuration
item specified by defining a variable. For example, when defining a model, we use the following configuration:
```python
# model_cfg.py
from opencompass.models import HuggingFaceCausalLM

models = [
    dict(
        type=HuggingFaceCausalLM,
        path='huggyllama/llama-7b',
        model_kwargs=dict(device_map='auto'),
        tokenizer_path='huggyllama/llama-65b',
        tokenizer_kwargs=dict(padding_side='left', truncation_side='left'),
        max_seq_len=2048,
        max_out_len=50,
        run_cfg=dict(num_gpus=8, num_procs=1),
    )
]
```
When reading a configuration file, use `Config.fromfile` from MMEngine to parse it:
```python
>>> from mmengine.config import Config
>>> cfg = Config.fromfile('./model_cfg.py')
>>> print(cfg.models[0].type)
<class 'opencompass.models.huggingface.HuggingFaceCausalLM'>
```
## Inheritance Mechanism
OpenCompass configuration files use Python's import mechanism for inheritance. Note that when inheriting a
configuration file, you need to use the `read_base` context manager.
```python
# inherit.py
from mmengine.config import read_base

with read_base():
    from .model_cfg import models  # 'models' from model_cfg.py is inherited into this file
```
Parse the configuration file with `Config.fromfile`:
```python
>>> from mmengine.config import Config
>>> cfg = Config.fromfile('./inherit.py')
>>> print(cfg.models[0].type)
<class 'opencompass.models.huggingface.HuggingFaceCausalLM'>
```
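Since inheritance is just a Python import, inherited variables can also be modified after the
`with read_base()` block. A minimal illustrative sketch (the file name and the override are not part of
OpenCompass):

```python
# inherit_and_modify.py
from mmengine.config import read_base

with read_base():
    from .model_cfg import models  # 'models' from model_cfg.py is inherited here

# Inherited variables are regular Python objects; this override is purely illustrative.
models[0]['max_out_len'] = 100
```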
## Evaluation Configuration Example
```python
# configs/llama7b.py
from mmengine.config import read_base

with read_base():
    # Read the required dataset configurations directly from the preset dataset configurations
    from .datasets.piqa.piqa_ppl import piqa_datasets
    from .datasets.siqa.siqa_gen import siqa_datasets

# Concatenate the datasets to be evaluated into the datasets field
datasets = [*piqa_datasets, *siqa_datasets]

# Evaluate models supported by HuggingFace's `AutoModelForCausalLM` using `HuggingFaceCausalLM`
from opencompass.models import HuggingFaceCausalLM

models = [
    dict(
        type=HuggingFaceCausalLM,
        # Initialization parameters of `HuggingFaceCausalLM`
        path='huggyllama/llama-7b',
        tokenizer_path='huggyllama/llama-7b',
        tokenizer_kwargs=dict(padding_side='left', truncation_side='left'),
        max_seq_len=2048,
        # Parameters common to all models, not initialization parameters of `HuggingFaceCausalLM`
        abbr='llama-7b',          # Model abbreviation for result display
        max_out_len=100,          # Maximum number of generated tokens
        batch_size=16,            # Batch size
        run_cfg=dict(num_gpus=1), # Run configuration for specifying resource requirements
    )
]
```
## Dataset Configuration File Example
In the example configuration file above, we obtain the dataset-related configuration directly through
inheritance. Next, we use the PIQA dataset configuration file as an example to explain the meaning of each
field in a dataset configuration file. If you do not intend to modify the prompts used for model testing or
to add new datasets, you can skip this section.

The PIQA dataset [configuration file](https://github.com/InternLM/opencompass/blob/main/configs/datasets/piqa/piqa_ppl_1cf9f0.py)
is shown below. It evaluates based on perplexity (PPL) and does not use in-context learning.
```python
from opencompass.openicl.icl_prompt_template import PromptTemplate
from opencompass.openicl.icl_retriever import ZeroRetriever
from opencompass.openicl.icl_inferencer import PPLInferencer
from opencompass.openicl.icl_evaluator import AccEvaluator
from opencompass.datasets import HFDataset

# Reader configuration
# The loaded dataset usually organizes samples as dictionaries. The reader specifies the
# input fields used to form the prompt and the output field used as the answer in each sample.
piqa_reader_cfg = dict(
    input_columns=['goal', 'sol1', 'sol2'],
    output_column='label',
    test_split='validation',
)

# Inference configuration
piqa_infer_cfg = dict(
    # Prompt generation configuration
    prompt_template=dict(
        type=PromptTemplate,
        # Prompt template; its format matches the inferencer type specified later.
        # Here, to compute PPL, we need to specify a prompt template for each candidate answer.
        template={
            0: 'The following makes sense: \nQ: {goal}\nA: {sol1}\n',
            1: 'The following makes sense: \nQ: {goal}\nA: {sol2}\n'
        }),
    # In-context example configuration. `ZeroRetriever` here means no in-context examples are used.
    retriever=dict(type=ZeroRetriever),
    # Inference method configuration
    # - PPLInferencer obtains answers via perplexity (PPL)
    # - GenInferencer obtains answers from the model's generated output
    inferencer=dict(type=PPLInferencer))

# Evaluation configuration, using accuracy as the metric
piqa_eval_cfg = dict(evaluator=dict(type=AccEvaluator))

# Dataset configuration; all the variables above are parameters of this configuration.
# It is a list that specifies the configuration of each evaluation subset of a dataset.
piqa_datasets = [
    dict(
        type=HFDataset,
        path='piqa',
        reader_cfg=piqa_reader_cfg,
        infer_cfg=piqa_infer_cfg,
        eval_cfg=piqa_eval_cfg)
]
```
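To make the template concrete, the sketch below walks one sample through this configuration. The field
values are invented for illustration; only the field names come from the reader configuration above. For PPL
evaluation, each candidate answer gets its own rendered prompt, and the candidate with the lowest perplexity
is taken as the prediction.

```python
# Illustrative only: a PIQA-style sample as organized by the reader configuration.
sample = dict(
    goal='To make a hard boiled egg,',
    sol1='boil the egg in water for about 10 minutes.',
    sol2='freeze the egg overnight.',
    label=0,  # index of the correct candidate answer
)

# Template 1 renders the prompt scored for candidate answer 1.
prompt_1 = 'The following makes sense: \nQ: {goal}\nA: {sol2}\n'.format(**sample)
print(prompt_1)
# The following makes sense:
# Q: To make a hard boiled egg,
# A: freeze the egg overnight.
```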
For details on the **prompt generation configuration**, see [Prompt Template](../prompt/prompt_template.md).
## Advanced Evaluation Configuration
In OpenCompass, we support configuration options such as the task partitioner (Partitioner) and the runner
backend (Runner) to make more flexible and efficient use of computational resources.

By default, inference tasks are partitioned by sample count; you can specify the sample-count threshold for
task partitioning with `--max-partition-size` when starting a task. Inference and evaluation tasks also run
on local resources by default; if you want to use Slurm cluster resources instead, specify the Slurm runner
backend with the `--slurm` and `--partition` parameters when starting the task.

Furthermore, if these options do not meet your requirements for task partitioning and runner configuration,
you can provide more detailed settings in the configuration file, as the sketch below illustrates. See
[Efficient Evaluation](./evaluation.md).
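A rough sketch of what this file-level configuration can look like (the class names below exist in
OpenCompass, but the parameter values are illustrative; see [Efficient Evaluation](./evaluation.md) for the
full options):

```python
from opencompass.partitioners import SizePartitioner
from opencompass.runners import SlurmRunner
from opencompass.tasks import OpenICLInferTask

infer = dict(
    # Partition inference into tasks of roughly this many samples (illustrative value).
    partitioner=dict(type=SizePartitioner, max_task_size=2000),
    # Run the resulting tasks on a Slurm cluster instead of locally.
    runner=dict(
        type=SlurmRunner,
        partition='llm',     # hypothetical Slurm partition name
        max_num_workers=32,  # upper bound on concurrently running tasks
        task=dict(type=OpenICLInferTask),
    ),
)
```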