OpenCompass/opencompass/configs/datasets/mbpp_pro/README.md
Dongsheng Zhu 2c79dc5227
[Dataset] Add human_eval/mbpp pro (#2092)
2025-05-12 18:38:13 +08:00


# MBPP pro
## OpenCompass (OC) results
| model                        | pass@1 |
|:----------------------------:|-------:|
| qwen2.5-coder-7b-instruct-hf |     66 |
| qwen2.5-14b-instruct-hf      |     64 |
| deepseek-v2-lite-chat-hf     |     36 |
## CodeEval-pro results
| model                        | pass@1 |
|:----------------------------:|-------:|
| qwen2.5-coder-7b-instruct-hf |     65 |
| qwen2.5-14b-instruct-hf      |     65 |
| deepseek-v2-lite-chat-hf     |     39 |