diff --git a/docs/en/user_guides/datasets.md b/docs/en/user_guides/datasets.md
index c392fefb..5e73485f 100644
--- a/docs/en/user_guides/datasets.md
+++ b/docs/en/user_guides/datasets.md
@@ -100,14 +100,8 @@ afqmc_datasets = [
     ),
 ]
 ```
-
-Additionally, for binary evaluation metrics (such as accuracy, pass-rate, etc.), you can also set the parameter `k` in conjunction with `n` for [G-Pass@ k](http://arxiv.org/abs/2412.13147) evaluation. The formula for G-Pass@$k$ is:
-
-$$
-\text{G-Pass@}k_\tau=\mathbb{E}_{\text{Data}}\left[ \sum_{j=\lceil \tau \cdot k \rceil}^c \frac{{c \choose j} \cdot {n - c \choose k - j}}{{n \choose k}} \right],
-$$
-
-where $n$ is the number of evaluations, and $c$ is the number of times that passed or were correct out of $n$ runs. An example configuration is as follows:
+> [!TIP]
+> Additionally, for binary evaluation metrics (such as accuracy, pass-rate, etc.), you can also set the parameter `k` in conjunction with `n` for [G-Pass@k](http://arxiv.org/abs/2412.13147) evaluation. The formula for G-Pass@$k$ is: $$ \text{G-Pass@}k_\tau=\mathbb{E}_{\text{Data}}\left[ \sum_{j=\lceil \tau \cdot k \rceil}^c \frac{{c \choose j} \cdot {n - c \choose k - j}}{{n \choose k}} \right], $$ where $n$ is the number of evaluations, and $c$ is the number of times that passed or were correct out of $n$ runs. An example configuration is as follows:
 ```python
 aime2024_datasets = [
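For reference, the G-Pass@$k_\tau$ expression in the tip can be evaluated directly from the per-sample counts. The sketch below is illustrative only: the helper `g_pass_at_k` and the sample data are hypothetical, not an OpenCompass API. It truncates the sum at $\min(c, k)$, since terms with $j > k$ contribute nothing ($\binom{n-c}{k-j} = 0$ there).

```python
from math import ceil, comb


def g_pass_at_k(n: int, c: int, k: int, tau: float) -> float:
    """Per-sample G-Pass@k_tau: probability that at least ceil(tau * k)
    of k runs drawn without replacement from the n recorded runs pass.

    n   -- total number of evaluation runs for the sample
    c   -- number of passing (correct) runs among the n
    k   -- subset size, k <= n
    tau -- required pass ratio within the k drawn runs, 0 < tau <= 1
    """
    # Terms with j > k vanish, so the sum stops at min(c, k);
    # math.comb already returns 0 when k - j > n - c.
    return sum(
        comb(c, j) * comb(n - c, k - j)
        for j in range(ceil(tau * k), min(c, k) + 1)
    ) / comb(n, k)


# The reported metric is the mean over the dataset (the E_Data in the
# formula), sketched here with hypothetical per-sample (n, c) counts:
runs = [(12, 8), (12, 12), (12, 3)]
score = sum(g_pass_at_k(n, c, k=4, tau=0.5) for n, c in runs) / len(runs)
print(score)
```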