OpenCompass/opencompass/configs/datasets/humaneval_pro
Dongsheng Zhu 2c79dc5227 2025-05-12 18:38:13 +08:00
[Dataset] Add human_eval/mbpp pro (#2092)
* add bench
* update
* bug fix
* time update
* add index
* fix repeat bug
| File | Last commit | Date |
| --- | --- | --- |
| humaneval_pro_gen_3dc067.py | [Dataset] Add human_eval/mbpp pro (#2092) | 2025-05-12 18:38:13 +08:00 |
| humaneval_pro_gen.py | [Dataset] Add human_eval/mbpp pro (#2092) | 2025-05-12 18:38:13 +08:00 |
| humaneval_pro_repeat_gen_3dc067.py | [Dataset] Add human_eval/mbpp pro (#2092) | 2025-05-12 18:38:13 +08:00 |
| README.md | [Dataset] Add human_eval/mbpp pro (#2092) | 2025-05-12 18:38:13 +08:00 |

HumanEval pro

OpenCompass (OC) results

| model | pass@1 |
| --- | --- |
| qwen2.5-coder-7b-instruct-hf | 65 |
| qwen2.5-14b-instruct-hf | 67 |
| deepseek-v2-lite-chat-hf | 35 |
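
For readers unfamiliar with the metric, here is a minimal sketch of how a pass@1 score of this kind is typically computed. It assumes one generated solution per problem and a score reported as a percentage; the function name `pass_at_1` and the input format are hypothetical and are not taken from the OpenCompass code.

```python
from typing import Sequence


def pass_at_1(passed: Sequence[bool]) -> float:
    """Return pass@1 as a percentage.

    `passed` holds one flag per problem: True if the single generated
    solution passed all of that problem's tests (assumed setup).
    """
    if not passed:
        raise ValueError("need at least one problem result")
    # Fraction of problems solved on the first (only) attempt, scaled to 0-100.
    return 100.0 * sum(passed) / len(passed)


# Hypothetical run: 3 of 4 problems solved.
example_score = pass_at_1([True, False, True, True])
print(example_score)
```

With several samples per problem, the usual unbiased pass@k estimator would be used instead; this sketch covers only the single-sample case.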

CodeEval-pro results

| model | pass@1 |
| --- | --- |
| qwen2.5-coder-7b-instruct-hf | 65 |
| qwen2.5-14b-instruct-hf | 65 |
| deepseek-v2-lite-chat-hf | 28 |
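
The two sets of results can be compared directly. The sketch below copies the scores listed above into two dictionaries and computes the per-model gap (OC minus CodeEval-pro); the variable names are illustrative only.

```python
# pass@1 scores as listed in this README.
oc_results = {
    "qwen2.5-coder-7b-instruct-hf": 65,
    "qwen2.5-14b-instruct-hf": 67,
    "deepseek-v2-lite-chat-hf": 35,
}
codeeval_pro_results = {
    "qwen2.5-coder-7b-instruct-hf": 65,
    "qwen2.5-14b-instruct-hf": 65,
    "deepseek-v2-lite-chat-hf": 28,
}

# Positive delta means the OC run scored higher than the CodeEval-pro run.
deltas = {
    model: oc_results[model] - codeeval_pro_results[model]
    for model in oc_results
}

for model, delta in deltas.items():
    print(f"{model}: {delta:+d}")
```

The gap is 0 for qwen2.5-coder-7b-instruct-hf, 2 for qwen2.5-14b-instruct-hf, and 7 for deepseek-v2-lite-chat-hf.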