From 3871188c89e7841e54f40363a5bb7dc62afa2510 Mon Sep 17 00:00:00 2001
From: Songyang Zhang
Date: Thu, 7 Sep 2023 17:29:50 +0800
Subject: [PATCH] [Feat] Update URL (#368)

---
 .github/ISSUE_TEMPLATE/1_bug-report.yml         |  6 +++---
 .github/ISSUE_TEMPLATE/2_feature-request.yml    |  2 +-
 .github/ISSUE_TEMPLATE/3_bug-report_zh.yml      |  6 +++---
 .github/ISSUE_TEMPLATE/4_feature-request_zh.yml |  2 +-
 .github/ISSUE_TEMPLATE/config.yml               |  2 +-
 README.md                                       | 12 ++++++------
 README_zh-CN.md                                 | 12 ++++++------
 docs/en/advanced_guides/code_eval_service.md    |  4 ++--
 docs/en/conf.py                                 |  2 +-
 docs/en/get_started.md                          | 12 ++++++------
 docs/en/notes/contribution_guide.md             |  4 ++--
 docs/en/prompt/chain_of_thought.md              |  4 ++--
 docs/en/user_guides/config.md                   |  2 +-
 docs/en/user_guides/metrics.md                  |  4 ++--
 docs/zh_cn/advanced_guides/code_eval_service.md |  4 ++--
 docs/zh_cn/conf.py                              |  2 +-
 docs/zh_cn/get_started.md                       | 10 +++++-----
 docs/zh_cn/notes/contribution_guide.md          |  4 ++--
 docs/zh_cn/prompt/chain_of_thought.md           |  4 ++--
 docs/zh_cn/user_guides/config.md                |  2 +-
 docs/zh_cn/user_guides/metrics.md               |  4 ++--
 21 files changed, 52 insertions(+), 52 deletions(-)

diff --git a/.github/ISSUE_TEMPLATE/1_bug-report.yml b/.github/ISSUE_TEMPLATE/1_bug-report.yml
index 6d4f3ec1..d6161f0b 100644
--- a/.github/ISSUE_TEMPLATE/1_bug-report.yml
+++ b/.github/ISSUE_TEMPLATE/1_bug-report.yml
@@ -6,7 +6,7 @@ body:
   - type: markdown
     attributes:
       value: |
-        For general questions or idea discussions, please post it to our [**Forum**](https://github.com/InternLM/opencompass/discussions).
+        For general questions or idea discussions, please post it to our [**Forum**](https://github.com/open-compass/opencompass/discussions).
         If you have already identified the reason, we strongly appreciate you creating a new PR according to [the tutorial](https://opencompass.readthedocs.io/en/master/community/CONTRIBUTING.html)!
         If you need our help, please fill in the following form to help us to identify the bug.
@@ -15,9 +15,9 @@ body:
       label: Prerequisite
       description: Please check the following items before creating a new issue.
       options:
-        - label: I have searched [Issues](https://github.com/InternLM/opencompass/issues/) and [Discussions](https://github.com/InternLM/opencompass/discussions) but cannot get the expected help.
+        - label: I have searched [Issues](https://github.com/open-compass/opencompass/issues/) and [Discussions](https://github.com/open-compass/opencompass/discussions) but cannot get the expected help.
           required: true
-        - label: The bug has not been fixed in the [latest version](https://github.com/InternLM/opencompass).
+        - label: The bug has not been fixed in the [latest version](https://github.com/open-compass/opencompass).
           required: true

   - type: dropdown

diff --git a/.github/ISSUE_TEMPLATE/2_feature-request.yml b/.github/ISSUE_TEMPLATE/2_feature-request.yml
index b3fe66a9..bbaffefb 100644
--- a/.github/ISSUE_TEMPLATE/2_feature-request.yml
+++ b/.github/ISSUE_TEMPLATE/2_feature-request.yml
@@ -6,7 +6,7 @@ body:
   - type: markdown
     attributes:
       value: |
-        For general questions or idea discussions, please post it to our [**Forum**](https://github.com/InternLM/opencompass/discussions).
+        For general questions or idea discussions, please post it to our [**Forum**](https://github.com/open-compass/opencompass/discussions).
         If you have already implemented the feature, we strongly appreciate you creating a new PR according to [the tutorial](https://opencompass.readthedocs.io/en/master/community/CONTRIBUTING.html)!
   - type: textarea

diff --git a/.github/ISSUE_TEMPLATE/3_bug-report_zh.yml b/.github/ISSUE_TEMPLATE/3_bug-report_zh.yml
index 2a4ad75e..e7e367d9 100644
--- a/.github/ISSUE_TEMPLATE/3_bug-report_zh.yml
+++ b/.github/ISSUE_TEMPLATE/3_bug-report_zh.yml
@@ -7,7 +7,7 @@ body:
     attributes:
      value: |
        我们推荐使用英语模板 Bug report,以便你的问题帮助更多人。
-        如果需要询问一般性的问题或者想法,请在我们的[**论坛**](https://github.com/InternLM/opencompass/discussions)讨论。
+        如果需要询问一般性的问题或者想法,请在我们的[**论坛**](https://github.com/open-compass/opencompass/discussions)讨论。
        如果你已经有了解决方案,我们非常欢迎你直接创建一个新的 PR 来解决这个问题。创建 PR 的流程可以参考[文档](https://opencompass.readthedocs.io/zh_CN/master/community/CONTRIBUTING.html)。
        如果你需要我们的帮助,请填写以下内容帮助我们定位 Bug。
@@ -16,9 +16,9 @@ body:
       label: 先决条件
       description: 在创建新问题之前,请检查以下项目。
       options:
-        - label: 我已经搜索过 [问题](https://github.com/InternLM/opencompass/issues/) 和 [讨论](https://github.com/InternLM/opencompass/discussions) 但未得到预期的帮助。
+        - label: 我已经搜索过 [问题](https://github.com/open-compass/opencompass/issues/) 和 [讨论](https://github.com/open-compass/opencompass/discussions) 但未得到预期的帮助。
           required: true
-        - label: 错误在 [最新版本](https://github.com/InternLM/opencompass) 中尚未被修复。
+        - label: 错误在 [最新版本](https://github.com/open-compass/opencompass) 中尚未被修复。
           required: true

   - type: dropdown

diff --git a/.github/ISSUE_TEMPLATE/4_feature-request_zh.yml b/.github/ISSUE_TEMPLATE/4_feature-request_zh.yml
index a1a18ad7..1fee37d5 100644
--- a/.github/ISSUE_TEMPLATE/4_feature-request_zh.yml
+++ b/.github/ISSUE_TEMPLATE/4_feature-request_zh.yml
@@ -7,7 +7,7 @@ body:
     attributes:
      value: |
        推荐使用英语模板 Feature request,以便你的问题帮助更多人。
-        如果需要询问一般性的问题或者想法,请在我们的[**论坛**](https://github.com/InternLM/opencompass/discussions)讨论。
+        如果需要询问一般性的问题或者想法,请在我们的[**论坛**](https://github.com/open-compass/opencompass/discussions)讨论。
        如果你已经实现了该功能,我们非常欢迎你直接创建一个新的 PR 来解决这个问题。创建 PR 的流程可以参考[文档](https://opencompass.readthedocs.io/zh_CN/master/community/CONTRIBUTING.html)。

   - type: textarea

diff --git a/.github/ISSUE_TEMPLATE/config.yml b/.github/ISSUE_TEMPLATE/config.yml
index 39c9c56f..f3811e30 100644
--- a/.github/ISSUE_TEMPLATE/config.yml
+++ b/.github/ISSUE_TEMPLATE/config.yml
@@ -5,7 +5,7 @@ contact_links:
     url: https://opencompass.readthedocs.io/en/latest/
     about: Check if your question is answered in docs
   - name: 💬 General questions (寻求帮助)
-    url: https://github.com/InternLM/OpenCompass/discussions
+    url: https://github.com/open-compass/opencompass/discussions
     about: Ask general usage questions and discuss with other OpenCompass community members
   - name: 🌐 Explore OpenCompass (官网)
     url: https://opencompass.org.cn/

diff --git a/README.md b/README.md
index dd23d343..0c373287 100644
--- a/README.md
+++ b/README.md
@@ -4,14 +4,14 @@
 [![docs](https://readthedocs.org/projects/opencompass/badge)](https://opencompass.readthedocs.io/en)
-[![license](https://img.shields.io/github/license/InternLM/opencompass.svg)](https://github.com/InternLM/opencompass/blob/main/LICENSE)
+[![license](https://img.shields.io/github/license/InternLM/opencompass.svg)](https://github.com/open-compass/opencompass/blob/main/LICENSE)

 [🌐Website](https://opencompass.org.cn/) |
 [📘Documentation](https://opencompass.readthedocs.io/en/latest/) |
 [🛠️Installation](https://opencompass.readthedocs.io/en/latest/get_started.html#installation) |
-[🤔Reporting Issues](https://github.com/InternLM/opencompass/issues/new/choose)
+[🤔Reporting Issues](https://github.com/open-compass/opencompass/issues/new/choose)

 English | [简体中文](README_zh-CN.md)

@@ -29,7 +29,7 @@ Just like a compass guides us on our journey, OpenCompass will guide you through

 > **🔥 Attention**
 > We launch the OpenCompass Collabration project, welcome to support diverse evaluation benchmarks into OpenCompass!
-> Clike [Issue](https://github.com/InternLM/opencompass/issues/248) for more information.
+> Click [Issue](https://github.com/open-compass/opencompass/issues/248) for more information.
 > Let's work together to build a more powerful OpenCompass toolkit!

 ## 🚀 What's New
@@ -311,11 +311,11 @@ Below are the steps for quick installation and datasets preparation.

 ```Python
 conda create --name opencompass python=3.10 pytorch torchvision pytorch-cuda -c nvidia -c pytorch -y
 conda activate opencompass
-git clone https://github.com/InternLM/opencompass opencompass
+git clone https://github.com/open-compass/opencompass opencompass
 cd opencompass
 pip install -e .
 # Download dataset to data/ folder
-wget https://github.com/InternLM/opencompass/releases/download/0.1.1/OpenCompassData.zip
+wget https://github.com/open-compass/opencompass/releases/download/0.1.1/OpenCompassData.zip
 unzip OpenCompassData.zip
 ```
@@ -389,7 +389,7 @@ Some datasets and prompt implementations are modified from [chain-of-thought-hub
 @misc{2023opencompass,
     title={OpenCompass: A Universal Evaluation Platform for Foundation Models},
     author={OpenCompass Contributors},
-    howpublished = {\url{https://github.com/InternLM/OpenCompass}},
+    howpublished = {\url{https://github.com/open-compass/opencompass}},
     year={2023}
 }
 ```

diff --git a/README_zh-CN.md b/README_zh-CN.md
index c9fe565d..b1c62015 100644
--- a/README_zh-CN.md
+++ b/README_zh-CN.md
@@ -4,14 +4,14 @@
 [![docs](https://readthedocs.org/projects/opencompass/badge)](https://opencompass.readthedocs.io/zh_CN)
-[![license](https://img.shields.io/github/license/InternLM/opencompass.svg)](https://github.com/InternLM/opencompass/blob/main/LICENSE)
+[![license](https://img.shields.io/github/license/InternLM/opencompass.svg)](https://github.com/open-compass/opencompass/blob/main/LICENSE)

 [🌐Website](https://opencompass.org.cn/) |
 [📘Documentation](https://opencompass.readthedocs.io/zh_CN/latest/index.html) |
 [🛠️Installation](https://opencompass.readthedocs.io/zh_CN/latest/get_started.html#id1) |
-[🤔Reporting Issues](https://github.com/InternLM/opencompass/issues/new/choose)
+[🤔Reporting Issues](https://github.com/open-compass/opencompass/issues/new/choose)

 [English](/README.md) | 简体中文

@@ -29,7 +29,7 @@

 > **🔥 注意**
 > 我们正式启动 OpenCompass 共建计划,诚邀社区用户为 OpenCompass 提供更具代表性和可信度的客观评测数据集!
-> 点击 [Issue](https://github.com/InternLM/opencompass/issues/248) 获取更多数据集.
+> 点击 [Issue](https://github.com/open-compass/opencompass/issues/248) 获取更多数据集.
 > 让我们携手共进,打造功能强大易用的大模型评测平台!

 ## 🚀 最新进展
@@ -312,11 +312,11 @@ OpenCompass 是面向大模型评测的一站式平台。其主要特点如下

 ```Python
 conda create --name opencompass python=3.10 pytorch torchvision pytorch-cuda -c nvidia -c pytorch -y
 conda activate opencompass
-git clone https://github.com/InternLM/opencompass opencompass
+git clone https://github.com/open-compass/opencompass opencompass
 cd opencompass
 pip install -e .
 # 下载数据集到 data/ 处
-wget https://github.com/InternLM/opencompass/releases/download/0.1.1/OpenCompassData.zip
+wget https://github.com/open-compass/opencompass/releases/download/0.1.1/OpenCompassData.zip
 unzip OpenCompassData.zip
 ```
@@ -392,7 +392,7 @@ python run.py --datasets ceval_ppl mmlu_ppl \
 @misc{2023opencompass,
     title={OpenCompass: A Universal Evaluation Platform for Foundation Models},
     author={OpenCompass Contributors},
-    howpublished = {\url{https://github.com/InternLM/OpenCompass}},
+    howpublished = {\url{https://github.com/open-compass/opencompass}},
     year={2023}
 }
 ```

diff --git a/docs/en/advanced_guides/code_eval_service.md b/docs/en/advanced_guides/code_eval_service.md
index e995cb2e..0a90a07d 100644
--- a/docs/en/advanced_guides/code_eval_service.md
+++ b/docs/en/advanced_guides/code_eval_service.md
@@ -39,9 +39,9 @@ When the model inference and code evaluation services are running on the same ho

 ### Configuration File

-We provide [the configuration file](https://github.com/InternLM/opencompass/blob/main/configs/eval_codegeex2.py) of using `humanevalx` for evaluation on `codegeex2` as reference.
+We provide [the configuration file](https://github.com/open-compass/opencompass/blob/main/configs/eval_codegeex2.py) of using `humanevalx` for evaluation on `codegeex2` as reference.
-The dataset and related post-processing configurations files can be found at this [link](https://github.com/InternLM/opencompass/tree/main/configs/datasets/humanevalx) with attention paid to the `evaluator` field in the humanevalx_eval_cfg_dict.
+The dataset and related post-processing configuration files can be found at this [link](https://github.com/open-compass/opencompass/tree/main/configs/datasets/humanevalx) with attention paid to the `evaluator` field in the humanevalx_eval_cfg_dict.

 ```python
 from opencompass.openicl.icl_prompt_template import PromptTemplate

diff --git a/docs/en/conf.py b/docs/en/conf.py
index 3c99b3c4..141fc013 100644
--- a/docs/en/conf.py
+++ b/docs/en/conf.py
@@ -93,7 +93,7 @@ html_theme_options = {
     'menu': [
         {
             'name': 'GitHub',
-            'url': 'https://github.com/InternLM/opencompass'
+            'url': 'https://github.com/open-compass/opencompass'
         },
     ],
     # Specify the language of shared menu

diff --git a/docs/en/get_started.md b/docs/en/get_started.md
index f92afc0a..11b8ddfe 100644
--- a/docs/en/get_started.md
+++ b/docs/en/get_started.md
@@ -12,7 +12,7 @@
 2. Install OpenCompass:

    ```bash
-   git clone https://github.com/InternLM/opencompass.git
+   git clone https://github.com/open-compass/opencompass.git
    cd opencompass
    pip install -e .
    ```
@@ -51,7 +51,7 @@
   cd ..
   ```

-   You can find example configs in `configs/models`. ([example](https://github.com/InternLM/opencompass/blob/eb4822a94d624a4e16db03adeb7a59bbd10c2012/configs/models/llama2_7b_chat.py))
+   You can find example configs in `configs/models`. ([example](https://github.com/open-compass/opencompass/blob/eb4822a94d624a4e16db03adeb7a59bbd10c2012/configs/models/llama2_7b_chat.py))
@@ -66,7 +66,7 @@ Run the following commands to download and place the datasets in the `${OpenComp

 ```bash
 # Run in the OpenCompass directory
-wget https://github.com/InternLM/opencompass/releases/download/0.1.1/OpenCompassData.zip
+wget https://github.com/open-compass/opencompass/releases/download/0.1.1/OpenCompassData.zip
 unzip OpenCompassData.zip
 ```
@@ -74,10 +74,10 @@ OpenCompass has supported most of the datasets commonly used for performance com

 # Quick Start

-We will demonstrate some basic features of OpenCompass through evaluating pretrained models [OPT-125M](https://huggingface.co/facebook/opt-125m) and [OPT-350M](https://huggingface.co/facebook/opt-350m) on both [SIQA](https://huggingface.co/datasets/social_i_qa) and [Winograd](https://huggingface.co/datasets/winogrande) benchmark tasks with their config file located at [configs/eval_demo.py](https://github.com/InternLM/opencompass/blob/main/configs/eval_demo.py).
+We will demonstrate some basic features of OpenCompass through evaluating pretrained models [OPT-125M](https://huggingface.co/facebook/opt-125m) and [OPT-350M](https://huggingface.co/facebook/opt-350m) on both [SIQA](https://huggingface.co/datasets/social_i_qa) and [Winograd](https://huggingface.co/datasets/winogrande) benchmark tasks with their config file located at [configs/eval_demo.py](https://github.com/open-compass/opencompass/blob/main/configs/eval_demo.py).

 Before running this experiment, please make sure you have installed OpenCompass locally and it should run successfully under one _GTX-1660-6G_ GPU.
-For larger parameterized models like Llama-7B, refer to other examples provided in the [configs directory](https://github.com/InternLM/opencompass/tree/main/configs).
+For larger parameterized models like Llama-7B, refer to other examples provided in the [configs directory](https://github.com/open-compass/opencompass/tree/main/configs).

 ## Configure an Evaluation Task

@@ -270,7 +270,7 @@ datasets = [*siqa_datasets, *winograd_datasets]     # The final config needs t

 Dataset configurations are typically of two types: 'ppl' and 'gen', indicating the evaluation method used. Where `ppl` means discriminative evaluation and `gen` means generative evaluation.

-Moreover, [configs/datasets/collections](https://github.com/InternLM/OpenCompass/blob/main/configs/datasets/collections) houses various dataset collections, making it convenient for comprehensive evaluations. OpenCompass often uses [`base_medium.py`](/configs/datasets/collections/base_medium.py) for full-scale model testing. To replicate results, simply import that file, for example:
+Moreover, [configs/datasets/collections](https://github.com/open-compass/opencompass/blob/main/configs/datasets/collections) houses various dataset collections, making it convenient for comprehensive evaluations. OpenCompass often uses [`base_medium.py`](/configs/datasets/collections/base_medium.py) for full-scale model testing. To replicate results, simply import that file, for example:

 ```bash
 python run.py --models hf_llama_7b --datasets base_medium

diff --git a/docs/en/notes/contribution_guide.md b/docs/en/notes/contribution_guide.md
index 0787894b..5a066a78 100644
--- a/docs/en/notes/contribution_guide.md
+++ b/docs/en/notes/contribution_guide.md
@@ -43,7 +43,7 @@ Pull requests let you tell others about changes you have pushed to a branch in a
 - When you work on your first PR

   Fork the OpenCompass repository: click the **fork** button at the top right corner of Github page
-  ![avatar](https://github.com/InternLM/opencompass/assets/22607038/851ed33d-02db-49c9-bf94-7c62eee89eb2)
+  ![avatar](https://github.com/open-compass/opencompass/assets/22607038/851ed33d-02db-49c9-bf94-7c62eee89eb2)

   Clone forked repository to local

@@ -102,7 +102,7 @@ git checkout main -b branchname
 ```

 - Create a PR

-  ![avatar](https://github.com/InternLM/opencompass/assets/22607038/08feb221-b145-4ea8-8e20-05f143081604)
+  ![avatar](https://github.com/open-compass/opencompass/assets/22607038/08feb221-b145-4ea8-8e20-05f143081604)

 - Revise PR message template to describe your motivation and modifications made in this PR. You can also link the related issue to the PR manually in the PR message (For more information, checkout the [official guidance](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue)).

diff --git a/docs/en/prompt/chain_of_thought.md b/docs/en/prompt/chain_of_thought.md
index 764308f6..09a856f7 100644
--- a/docs/en/prompt/chain_of_thought.md
+++ b/docs/en/prompt/chain_of_thought.md
@@ -4,7 +4,7 @@

 During the process of reasoning, CoT (Chain of Thought) method is an efficient way to help LLMs deal complex questions, for example: math problem and relation inference. In OpenCompass, we support multiple types of CoT method.
-![image](https://github.com/InternLM/opencompass/assets/28834990/45d60e0e-02a1-49aa-b792-40a1f95f9b9e)
+![image](https://github.com/open-compass/opencompass/assets/28834990/45d60e0e-02a1-49aa-b792-40a1f95f9b9e)

 ## 1. Zero Shot CoT
@@ -72,7 +72,7 @@ OpenCompass defaults to use argmax for sampling the next token. Therefore, if th

 Where `SAMPLE_SIZE` is the number of reasoning paths in Self-Consistency, higher value usually outcome higher performance. The following figure from the original SC paper demonstrates the relation between reasoning paths and performance in several reasoning tasks:

-![image](https://github.com/InternLM/opencompass/assets/28834990/05c7d850-7076-43ca-b165-e6251f9b3001)
+![image](https://github.com/open-compass/opencompass/assets/28834990/05c7d850-7076-43ca-b165-e6251f9b3001)

 From the figure, it can be seen that in different reasoning tasks, performance tends to improve as the number of reasoning paths increases. However, for some tasks, increasing the number of reasoning paths may reach a limit, and further increasing the number of paths may not bring significant performance improvement. Therefore, it is necessary to conduct experiments and adjustments on specific tasks to find the optimal number of reasoning paths that best suit the task.

diff --git a/docs/en/user_guides/config.md b/docs/en/user_guides/config.md
index 2ccb5f49..c74d8db9 100644
--- a/docs/en/user_guides/config.md
+++ b/docs/en/user_guides/config.md
@@ -103,7 +103,7 @@ use the PIQA dataset configuration file as an example to demonstrate the meaning
 configuration file. If you do not intend to modify the prompt for model testing or add new datasets, you can skip this section.

-The PIQA dataset [configuration file](https://github.com/InternLM/opencompass/blob/main/configs/datasets/piqa/piqa_ppl_1cf9f0.py) is as follows.
+The PIQA dataset [configuration file](https://github.com/open-compass/opencompass/blob/main/configs/datasets/piqa/piqa_ppl_1cf9f0.py) is as follows.
 It is a configuration for evaluating based on perplexity (PPL) and does not use In-Context Learning.

 ```python

diff --git a/docs/en/user_guides/metrics.md b/docs/en/user_guides/metrics.md
index 3fe85b2d..b15d7dd2 100644
--- a/docs/en/user_guides/metrics.md
+++ b/docs/en/user_guides/metrics.md
@@ -12,7 +12,7 @@ There is also a type of **scoring-type** evaluation task without standard answer

 ## Supported Evaluation Metrics

-Currently, in OpenCompass, commonly used Evaluators are mainly located in the [`opencompass/openicl/icl_evaluator`](https://github.com/InternLM/opencompass/tree/main/opencompass/openicl/icl_evaluator) folder. There are also some dataset-specific indicators that are placed in parts of [`opencompass/datasets`](https://github.com/InternLM/opencompass/tree/main/opencompass/datasets). Below is a summary:
+Currently, in OpenCompass, commonly used Evaluators are mainly located in the [`opencompass/openicl/icl_evaluator`](https://github.com/open-compass/opencompass/tree/main/opencompass/openicl/icl_evaluator) folder. There are also some dataset-specific indicators that are placed in parts of [`opencompass/datasets`](https://github.com/open-compass/opencompass/tree/main/opencompass/datasets). Below is a summary:

 | Evaluation Strategy | Evaluation Metrics   | Common Postprocessing Method | Datasets                                                              |
 | ------------------- | -------------------- | ---------------------------- | -------------------------------------------------------------------- |
@@ -33,7 +33,7 @@ Currently, in OpenCompass, commonly used Evaluators are mainly located in the [`

 The evaluation standard configuration is generally placed in the dataset configuration file, and the final xxdataset_eval_cfg will be passed to `dataset.infer_cfg` as an instantiation parameter.

-Below is the definition of `govrepcrs_eval_cfg`, and you can refer to [configs/datasets/govrepcrs](https://github.com/InternLM/opencompass/tree/main/configs/datasets/govrepcrs).
+Below is the definition of `govrepcrs_eval_cfg`, and you can refer to [configs/datasets/govrepcrs](https://github.com/open-compass/opencompass/tree/main/configs/datasets/govrepcrs).

 ```python
 from opencompass.openicl.icl_evaluator import BleuEvaluator

diff --git a/docs/zh_cn/advanced_guides/code_eval_service.md b/docs/zh_cn/advanced_guides/code_eval_service.md
index 4422c1bc..0212c795 100644
--- a/docs/zh_cn/advanced_guides/code_eval_service.md
+++ b/docs/zh_cn/advanced_guides/code_eval_service.md
@@ -39,8 +39,8 @@ telnet your_service_ip_address your_service_port

 ### 配置文件

-我们已经提供了 huamaneval-x 在 codegeex2 上评估的\[配置文件\]作为参考(https://github.com/InternLM/opencompass/blob/main/configs/eval_codegeex2.py)。
-其中数据集以及相关后处理的配置文件为这个[链接](https://github.com/InternLM/opencompass/tree/main/configs/datasets/humanevalx), 需要注意 humanevalx_eval_cfg_dict 中的evaluator 字段。
+我们已经提供了 humaneval-x 在 codegeex2 上评估的[配置文件](https://github.com/open-compass/opencompass/blob/main/configs/eval_codegeex2.py)作为参考。
+其中数据集以及相关后处理的配置文件为这个[链接](https://github.com/open-compass/opencompass/tree/main/configs/datasets/humanevalx),需要注意 humanevalx_eval_cfg_dict 中的 evaluator 字段。

 ```python
 from opencompass.openicl.icl_prompt_template import PromptTemplate

diff --git a/docs/zh_cn/conf.py b/docs/zh_cn/conf.py
index 41d45e6c..f19f0daa 100644
--- a/docs/zh_cn/conf.py
+++ b/docs/zh_cn/conf.py
@@ -93,7 +93,7 @@ html_theme_options = {
     'menu': [
         {
             'name': 'GitHub',
-            'url': 'https://github.com/InternLM/opencompass'
+            'url': 'https://github.com/open-compass/opencompass'
         },
     ],
     # Specify the language of shared menu

diff --git a/docs/zh_cn/get_started.md b/docs/zh_cn/get_started.md
index 41ccd16b..a0ca8b0a 100644
--- a/docs/zh_cn/get_started.md
+++ b/docs/zh_cn/get_started.md
@@ -12,7 +12,7 @@
 2. 安装 OpenCompass:

    ```bash
-   git clone https://github.com/InternLM/opencompass.git
+   git clone https://github.com/open-compass/opencompass.git
    cd opencompass
    pip install -e .
    ```
@@ -51,7 +51,7 @@
   cd ..
   ```

-  你可以在 `configs/models` 下找到所有 Llama / Llama-2 / Llama-2-chat 模型的配置文件示例。([示例](https://github.com/InternLM/opencompass/blob/eb4822a94d624a4e16db03adeb7a59bbd10c2012/configs/models/llama2_7b_chat.py))
+  你可以在 `configs/models` 下找到所有 Llama / Llama-2 / Llama-2-chat 模型的配置文件示例。([示例](https://github.com/open-compass/opencompass/blob/eb4822a94d624a4e16db03adeb7a59bbd10c2012/configs/models/llama2_7b_chat.py))
@@ -66,7 +66,7 @@ OpenCompass 支持的数据集主要包括两个部分:

 在 OpenCompass 项目根目录下运行下面命令,将数据集准备至 `${OpenCompass}/data` 目录下:

 ```bash
-wget https://github.com/InternLM/opencompass/releases/download/0.1.1/OpenCompassData.zip
+wget https://github.com/open-compass/opencompass/releases/download/0.1.1/OpenCompassData.zip
 unzip OpenCompassData.zip
 ```
@@ -77,7 +77,7 @@ OpenCompass 已经支持了大多数常用于性能比较的数据集,具体

 我们会以测试 [OPT-125M](https://huggingface.co/facebook/opt-125m) 以及 [OPT-350M](https://huggingface.co/facebook/opt-350m) 预训练基座模型在 [SIQA](https://huggingface.co/datasets/social_i_qa) 和 [Winograd](https://huggingface.co/datasets/winogrande) 上的性能为例,带领你熟悉 OpenCompass 的一些基本功能。

 运行前确保已经安装了 OpenCompass,本实验可以在单张 _GTX-1660-6G_ 显卡上成功运行。
-更大参数的模型,如 Llama-7B, 可参考 [configs](https://github.com/InternLM/opencompass/tree/main/configs) 中其他例子。
+更大参数的模型,如 Llama-7B, 可参考 [configs](https://github.com/open-compass/opencompass/tree/main/configs) 中其他例子。

 ## 配置任务

@@ -268,7 +268,7 @@ datasets = [*siqa_datasets, *winograd_datasets]     # 最后 config 需要包

 数据集的配置通常为 'ppl' 和 'gen' 两类配置文件,表示使用的评估方式。其中 `ppl` 表示使用判别式评测, `gen` 表示使用生成式评测。

-此外,[configs/datasets/collections](https://github.com/InternLM/OpenCompass/blob/main/configs/datasets/collections) 存放了各类数据集集合,方便做综合评测。OpenCompass 常用 [`base_medium.py`](/configs/datasets/collections/base_medium.py) 对模型进行全量测试。若需要复现结果,直接导入该文件即可。如:
+此外,[configs/datasets/collections](https://github.com/open-compass/opencompass/blob/main/configs/datasets/collections) 存放了各类数据集集合,方便做综合评测。OpenCompass 常用 [`base_medium.py`](/configs/datasets/collections/base_medium.py) 对模型进行全量测试。若需要复现结果,直接导入该文件即可。如:

 ```bash
 python run.py --models hf_llama_7b --datasets base_medium

diff --git a/docs/zh_cn/notes/contribution_guide.md b/docs/zh_cn/notes/contribution_guide.md
index 2ccc3702..fcd3e18f 100644
--- a/docs/zh_cn/notes/contribution_guide.md
+++ b/docs/zh_cn/notes/contribution_guide.md
@@ -43,7 +43,7 @@
 - 当你第一次提 PR 时

   复刻 OpenCompass 原代码库,点击 GitHub 页面右上角的 **Fork** 按钮即可
-  ![avatar](https://github.com/InternLM/opencompass/assets/22607038/851ed33d-02db-49c9-bf94-7c62eee89eb2)
+  ![avatar](https://github.com/open-compass/opencompass/assets/22607038/851ed33d-02db-49c9-bf94-7c62eee89eb2)

   克隆复刻的代码库到本地

@@ -111,7 +111,7 @@ git checkout main -b branchname

 - 创建一个拉取请求

-  ![avatar](https://github.com/InternLM/opencompass/assets/22607038/08feb221-b145-4ea8-8e20-05f143081604)
+  ![avatar](https://github.com/open-compass/opencompass/assets/22607038/08feb221-b145-4ea8-8e20-05f143081604)

 - 修改拉取请求信息模板,描述修改原因和修改内容。还可以在 PR 描述中,手动关联到相关的议题 (issue),(更多细节,请参考[官方文档](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue))。

diff --git a/docs/zh_cn/prompt/chain_of_thought.md b/docs/zh_cn/prompt/chain_of_thought.md
index 127cb483..0cf66e15 100644
--- a/docs/zh_cn/prompt/chain_of_thought.md
+++ b/docs/zh_cn/prompt/chain_of_thought.md
@@ -4,7 +4,7 @@

 CoT(思维链)是帮助大型语言模型解决如数学问题和关系推理问题等复杂问题的有效方式,在OpenCompass中,我们支持多种类型的CoT方法。

-![image](https://github.com/InternLM/opencompass/assets/28834990/45d60e0e-02a1-49aa-b792-40a1f95f9b9e)
+![image](https://github.com/open-compass/opencompass/assets/28834990/45d60e0e-02a1-49aa-b792-40a1f95f9b9e)

 ## 1. 零样本思维链
@@ -72,7 +72,7 @@ gsm8k_eval_cfg = dict(sc_size=SAMPLE_SIZE)

 其中 `SAMPLE_SIZE` 是推理路径的数量,较高的值通常会带来更高的性能。SC方法的原论文中展示了不同推理任务间推理路径数量与性能之间的关系:

-![image](https://github.com/InternLM/opencompass/assets/28834990/05c7d850-7076-43ca-b165-e6251f9b3001)
+![image](https://github.com/open-compass/opencompass/assets/28834990/05c7d850-7076-43ca-b165-e6251f9b3001)

 从图中可以看出,在不同的推理任务中,随着推理路径数量的增加,性能呈现出增长的趋势。但是,对于某些任务,增加推理路径的数量可能达到一个极限,进一步增加推理路径的数量可能不会带来更多的性能提升。因此,需要在具体任务中进行实验和调整,找到最适合任务的推理路径数量。

diff --git a/docs/zh_cn/user_guides/config.md b/docs/zh_cn/user_guides/config.md
index f107225c..92d66de5 100644
--- a/docs/zh_cn/user_guides/config.md
+++ b/docs/zh_cn/user_guides/config.md
@@ -99,7 +99,7 @@ models = [
 我们会以 PIQA 数据集配置文件为示例,展示数据集配置文件中各个字段的含义。
 如果你不打算修改模型测试的 prompt,或者添加新的数据集,则可以跳过这一节的介绍。

-PIQA 数据集 [配置文件](https://github.com/InternLM/opencompass/blob/main/configs/datasets/piqa/piqa_ppl_1cf9f0.py)
+PIQA 数据集 [配置文件](https://github.com/open-compass/opencompass/blob/main/configs/datasets/piqa/piqa_ppl_1cf9f0.py)
 如下,这是一个基于 PPL(困惑度)进行评测的配置,并且不使用上下文学习方法(In-Context Learning)。

 ```python

diff --git a/docs/zh_cn/user_guides/metrics.md b/docs/zh_cn/user_guides/metrics.md
index 93f43b5e..e7cfbb39 100644
--- a/docs/zh_cn/user_guides/metrics.md
+++ b/docs/zh_cn/user_guides/metrics.md
@@ -12,7 +12,7 @@

 ## 已支持评估指标

-目前 OpenCompass 中,常用的 Evaluator 主要放在 [`opencompass/openicl/icl_evaluator`](https://github.com/InternLM/opencompass/tree/main/opencompass/openicl/icl_evaluator)文件夹下, 还有部分数据集特有指标的放在 [`opencompass/datasets`](https://github.com/InternLM/opencompass/tree/main/opencompass/datasets) 的部分文件中。以下是汇总:
+目前 OpenCompass 中,常用的 Evaluator 主要放在 [`opencompass/openicl/icl_evaluator`](https://github.com/open-compass/opencompass/tree/main/opencompass/openicl/icl_evaluator)文件夹下, 还有部分数据集特有指标的放在 [`opencompass/datasets`](https://github.com/open-compass/opencompass/tree/main/opencompass/datasets) 的部分文件中。以下是汇总:

 | 评估指标           | 评估策略             | 常用后处理方式               | 数据集                                                                |
 | ------------------ | -------------------- | ---------------------------- | -------------------------------------------------------------------- |
@@ -33,7 +33,7 @@

 评估标准配置一般放在数据集配置文件中,最终的 xxdataset_eval_cfg 会传给 `dataset.infer_cfg` 作为实例化的一个参数。

-下面是 `govrepcrs_eval_cfg` 的定义, 具体可查看 [configs/datasets/govrepcrs](https://github.com/InternLM/opencompass/tree/main/configs/datasets/govrepcrs)。
+下面是 `govrepcrs_eval_cfg` 的定义, 具体可查看 [configs/datasets/govrepcrs](https://github.com/open-compass/opencompass/tree/main/configs/datasets/govrepcrs)。

 ```python
 from opencompass.openicl.icl_evaluator import BleuEvaluator