# Learn About Config

OpenCompass uses the OpenMMLab modern-style configuration files. If you are already familiar with OpenMMLab-style configuration files, you can refer directly to [A Pure Python style Configuration File (Beta)](https://mmengine.readthedocs.io/en/latest/advanced_tutorials/config.html#a-pure-python-style-configuration-file-beta) to understand the differences between the new-style and the original configuration files. If you have not encountered OpenMMLab-style configuration files before, the following explains their usage with a simple example. Make sure you have installed the latest version of MMEngine (>=0.8.1) to support the new-style configuration files.

## Basic Format

OpenCompass configuration files are written in Python and follow basic Python syntax: each configuration item is specified by defining a variable. For example, when defining a model, we use the following configuration:

```python
# model_cfg.py
from opencompass.models import HuggingFaceCausalLM

models = [
    dict(
        type=HuggingFaceCausalLM,
        path='huggyllama/llama-7b',
        model_kwargs=dict(device_map='auto'),
        tokenizer_path='huggyllama/llama-7b',
        tokenizer_kwargs=dict(padding_side='left', truncation_side='left'),
        max_seq_len=2048,
        max_out_len=50,
        run_cfg=dict(num_gpus=8, num_procs=1),
    )
]
```

To read a configuration file, parse it with MMEngine's `Config.fromfile`:

```python
>>> from mmengine.config import Config
>>> cfg = Config.fromfile('./model_cfg.py')
>>> print(cfg.models[0].type)
<class 'opencompass.models.huggingface.HuggingFaceCausalLM'>
```
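
The parsed object behaves like a nested dictionary with attribute access, so the same field can be read either way. A minimal sketch, using only the fields defined in `model_cfg.py` above:

```python
>>> cfg.models[0].path      # attribute-style access
'huggyllama/llama-7b'
>>> cfg.models[0]['path']   # dict-style access works as well
'huggyllama/llama-7b'
```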

## Inheritance Mechanism

OpenCompass configuration files use Python's import mechanism for inheritance. Note that imports of inherited configuration files must be placed inside the `read_base` context manager.

```python
# inherit.py
from mmengine.config import read_base

with read_base():
    from .model_cfg import models  # inherits 'models' from model_cfg.py
```

Parse the configuration file using `Config.fromfile`:

```python
>>> from mmengine.config import Config
>>> cfg = Config.fromfile('./inherit.py')
>>> print(cfg.models[0].type)
<class 'opencompass.models.huggingface.HuggingFaceCausalLM'>
```
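
Because the inherited `models` is an ordinary Python variable after the import, the inheriting file can also modify it in place. A minimal sketch under that assumption (the file name and the `batch_size` override below are illustrative, not part of the original files):

```python
# inherit_and_modify.py (hypothetical file name)
from mmengine.config import read_base

with read_base():
    from .model_cfg import models  # inherits 'models' from model_cfg.py

# Inherited variables are plain Python objects and can be edited in place
for model in models:
    model['batch_size'] = 8  # illustrative override, not defined in model_cfg.py
```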

## Evaluation Configuration Example

```python
# configs/llama7b.py
from mmengine.config import read_base

with read_base():
    # Read the required dataset configurations directly from the preset dataset configurations
    from .datasets.piqa.piqa_ppl import piqa_datasets
    from .datasets.siqa.siqa_gen import siqa_datasets

# Concatenate the datasets to be evaluated into the datasets field
datasets = [*piqa_datasets, *siqa_datasets]

# Evaluate models supported by HuggingFace's `AutoModelForCausalLM` using `HuggingFaceCausalLM`
from opencompass.models import HuggingFaceCausalLM

models = [
    dict(
        type=HuggingFaceCausalLM,
        # Initialization parameters for `HuggingFaceCausalLM`
        path='huggyllama/llama-7b',
        tokenizer_path='huggyllama/llama-7b',
        tokenizer_kwargs=dict(padding_side='left', truncation_side='left'),
        max_seq_len=2048,
        # Common parameters shared by all models, not initialization parameters of `HuggingFaceCausalLM`
        abbr='llama-7b',            # Model abbreviation for result display
        max_out_len=100,            # Maximum number of generated tokens
        batch_size=16,              # Batch size
        run_cfg=dict(num_gpus=1),   # Run configuration specifying resource requirements
    )
]
```
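
As before, the assembled configuration can be parsed and inspected with `Config.fromfile`. A quick check (the printed dataset count assumes each preset file defines a single subset, which holds for these two presets but is not guaranteed in general):

```python
>>> from mmengine.config import Config
>>> cfg = Config.fromfile('./configs/llama7b.py')
>>> print(cfg.models[0].abbr)
llama-7b
>>> print(len(cfg.datasets))  # piqa subsets + siqa subsets combined
2
```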

## Dataset Configuration File Example

In the example configuration above, we obtained the dataset-related configurations directly through inheritance. Next, we use the PIQA dataset configuration file as an example to explain the meaning of each field in a dataset configuration file. If you do not intend to modify the prompts used for model testing or to add new datasets, you can skip this section.

The PIQA dataset [configuration file](https://github.com/InternLM/opencompass/blob/main/configs/datasets/piqa/piqa_ppl_1cf9f0.py) is shown below. It evaluates with perplexity (PPL) and does not use in-context learning.

```python
from opencompass.openicl.icl_prompt_template import PromptTemplate
from opencompass.openicl.icl_retriever import ZeroRetriever
from opencompass.openicl.icl_inferencer import PPLInferencer
from opencompass.openicl.icl_evaluator import AccEvaluator
from opencompass.datasets import HFDataset

# Reader configuration
# The loaded dataset is usually organized as dictionaries; this specifies the input fields
# used to form the prompt and the output field used as the answer in each sample
piqa_reader_cfg = dict(
    input_columns=['goal', 'sol1', 'sol2'],
    output_column='label',
    test_split='validation',
)

# Inference configuration
piqa_infer_cfg = dict(
    # Prompt generation configuration
    prompt_template=dict(
        type=PromptTemplate,
        # Prompt template; the template format must match the inferencer type specified later
        # Here, to calculate PPL, we specify the prompt template for each candidate answer
        template={
            0: 'The following makes sense: \nQ: {goal}\nA: {sol1}\n',
            1: 'The following makes sense: \nQ: {goal}\nA: {sol2}\n'
        }),
    # In-context example configuration; `ZeroRetriever` means no in-context examples are used
    retriever=dict(type=ZeroRetriever),
    # Inference method configuration
    # - PPLInferencer uses perplexity (PPL) to obtain answers
    # - GenInferencer uses the model's generated results to obtain answers
    inferencer=dict(type=PPLInferencer))

# Metric configuration, using accuracy as the evaluation metric
piqa_eval_cfg = dict(evaluator=dict(type=AccEvaluator))

# Dataset configuration; all the variables above are parameters of this configuration
# It is a list that specifies the configurations of a dataset's evaluation subsets
piqa_datasets = [
    dict(
        type=HFDataset,
        path='piqa',
        reader_cfg=piqa_reader_cfg,
        infer_cfg=piqa_infer_cfg,
        eval_cfg=piqa_eval_cfg)
]
```

For details on the **prompt generation configuration**, refer to [Prompt Template](../prompt/prompt_template.md).
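
The comments above mention `GenInferencer` as the generation-based alternative to `PPLInferencer`. For a generation-style evaluation, the template is typically a single prompt string and the model's generated text is taken as the answer. A hypothetical sketch (the variable name and template text below are illustrative, not taken from an actual preset file):

```python
from opencompass.openicl.icl_prompt_template import PromptTemplate
from opencompass.openicl.icl_retriever import ZeroRetriever
from opencompass.openicl.icl_inferencer import GenInferencer

# Illustrative generation-style inference configuration: a single template for
# the whole prompt, with the generated text used as the answer
piqa_gen_infer_cfg = dict(
    prompt_template=dict(
        type=PromptTemplate,
        template='Q: {goal}\nA1: {sol1}\nA2: {sol2}\nWhich answer makes sense, A1 or A2?'),
    retriever=dict(type=ZeroRetriever),
    inferencer=dict(type=GenInferencer))
```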

## Advanced Evaluation Configuration

In OpenCompass, we support configuration options such as task partitioner and runner for more flexible and efficient utilization of computational resources.

By default, we use size-based partitioning for inference tasks; you can specify the sample-count threshold for task partitioning with `--max-partition-size` when starting the task. We also use local resources for inference and evaluation tasks by default; to use Slurm cluster resources instead, pass the `--slurm` flag and the `--partition` parameter to select the Slurm runner backend when starting the task.

Furthermore, if these options do not meet your requirements for task partitioning and runner backend configuration, you can provide more detailed configurations in the configuration file. Please refer to [Efficient Evaluation](./evaluation.md) for more information.
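
As a sketch of what such an in-file configuration could look like, the snippet below overrides the partitioner and runner of the inference stage. The class names follow OpenCompass's example configs, but the specific values (`max_task_size=2000`, the partition name, the worker count) are illustrative assumptions:

```python
from opencompass.partitioners import SizePartitioner
from opencompass.runners import SlurmRunner
from opencompass.tasks import OpenICLInferTask

# Illustrative inference-stage configuration: split tasks by sample count and
# submit them to a Slurm cluster instead of running locally
infer = dict(
    partitioner=dict(type=SizePartitioner, max_task_size=2000),  # sample threshold per task
    runner=dict(
        type=SlurmRunner,
        partition='llm-eval',        # hypothetical Slurm partition name
        max_num_workers=32,          # maximum number of concurrent jobs
        task=dict(type=OpenICLInferTask)),
)
```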