diff --git a/docs/en/user_guides/datasets.md b/docs/en/user_guides/datasets.md index 1c34b714..fb11b394 100644 --- a/docs/en/user_guides/datasets.md +++ b/docs/en/user_guides/datasets.md @@ -101,13 +101,13 @@ afqmc_datasets = [ ``` -> Additionally, for binary evaluation metrics (such as accuracy, pass-rate, etc.), you can also set the parameter `k` in conjunction with `n` for [G-Pass@k](http://arxiv.org/abs/2412.13147) evaluation. The formula for G-Pass@k is: -> -> ```{math} -> \text{G-Pass@}k_\tau=E_{\text{Data}}\left[ \sum_{j=\lceil \tau \cdot k \rceil}^c \frac{{c \choose j} \cdot {n - c \choose k - j}}{{n \choose k}} \right], -> ``` -> -> where $n$ is the number of evaluations, and $c$ is the number of times that passed or were correct out of $n$ runs. An example configuration is as follows: +Additionally, for binary evaluation metrics (such as accuracy, pass-rate, etc.), you can also set the parameter `k` in conjunction with `n` for [G-Pass@k](http://arxiv.org/abs/2412.13147) evaluation. The formula for G-Pass@k is: + +```{math} +\text{G-Pass@}k_\tau=E_{\text{Data}}\left[ \sum_{j=\lceil \tau \cdot k \rceil}^c \frac{{c \choose j} \cdot {n - c \choose k - j}}{{n \choose k}} \right], +``` + +where $n$ is the number of evaluations, and $c$ is the number of times that passed or were correct out of $n$ runs. An example configuration is as follows: ```python aime2024_datasets = [ diff --git a/docs/zh_cn/user_guides/datasets.md b/docs/zh_cn/user_guides/datasets.md index 1c8dc616..b1494d3c 100644 --- a/docs/zh_cn/user_guides/datasets.md +++ b/docs/zh_cn/user_guides/datasets.md @@ -100,13 +100,13 @@ afqmc_datasets = [ ] ``` -> 另外,对于二值评测指标(例如accuracy,pass-rate等),还可以通过设置参数`k`配合`n`进行[G-Pass@k](http://arxiv.org/abs/2412.13147)评测。G-Pass@k计算公式为: -> -> ```{math} -> \text{G-Pass@}k_\tau=E_{\text{Data}}\left[ \sum_{j=\lceil \tau \cdot k \rceil}^c \frac{{c \choose j} \cdot {n - c \choose k - j}}{{n \choose k}} \right], -> ``` -> -> 其中 $n$ 为评测次数, $c$ 为 $n$ 次运行中通过或正确的次数。配置例子如下: +另外,对于二值评测指标(例如accuracy,pass-rate等),还可以通过设置参数`k`配合`n`进行[G-Pass@k](http://arxiv.org/abs/2412.13147)评测。G-Pass@k计算公式为: + +```{math} +\text{G-Pass@}k_\tau=E_{\text{Data}}\left[ \sum_{j=\lceil \tau \cdot k \rceil}^c \frac{{c \choose j} \cdot {n - c \choose k - j}}{{n \choose k}} \right], +``` + +其中 $n$ 为评测次数, $c$ 为 $n$ 次运行中通过或正确的次数。配置例子如下: ```python aime2024_datasets = [