mirror of
https://github.com/open-compass/opencompass.git
synced 2025-05-30 16:03:24 +08:00

* Update pr-run-test.yml
* Update daily-run-test.yml
* Update daily-run-test.yml
* Update pr-run-test.yml
* Update daily-run-test.yml
* Update daily-run-test.yml
* Update daily-run-test.yml
* Update daily-run-test.yml
* Update oc_score_baseline.yaml
* Update daily-run-test.yml
* Update oc_score_assert.py

---------

Co-authored-by: zhulin1 <zhulin1@pjlab.org.cn>
32 lines
616 B
YAML
internlm-7b-hf:
    ARC-c: 34.24
    chid-dev: 79.70
    chid-test: 81.12
    openai_humaneval: 10.98
    openbookqa: 47.20
    openbookqa_fact: 74.00

internlm-chat-7b-hf:
    ARC-c: 36.95
    chid-dev: 71.78
    chid-test: 76.87
    openai_humaneval: 21.34
    openbookqa: 66.6
    openbookqa_fact: 80.4

chatglm3-6b-base-hf:
    ARC-c: 44.41
    chid-dev: 78.22
    chid-test: 78.57
    openai_humaneval: 20.73
    openbookqa: 78.40
    openbookqa_fact: 92.00

internlm2-7b-hf:
    ARC-c: 34.92
    chid-dev: 55.94
    chid-test: 53.70
    openai_humaneval: 44.51
    openbookqa: 83.00
    openbookqa_fact: 83.00
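
A baseline file like the one above is typically compared against freshly measured scores in CI. The following is a minimal sketch of such a check, not the contents of `oc_score_assert.py`, which is not shown here. The function name `find_regressions`, the `abs_tol` tolerance, and the inline `BASELINE` dict (a hand-copied subset of the values above, used instead of parsing the YAML) are all assumptions made for illustration.

```python
import math

# Subset of the baseline above, copied by hand for illustration.
# A real check would presumably parse the YAML file instead.
BASELINE = {
    "internlm-7b-hf": {"ARC-c": 34.24, "openai_humaneval": 10.98},
    "internlm2-7b-hf": {"ARC-c": 34.92, "openai_humaneval": 44.51},
}


def find_regressions(baseline, measured, abs_tol=0.01):
    """Return (model, dataset, expected, actual) for every score that is
    missing from `measured` or differs from the baseline by more than
    `abs_tol` (hypothetical tolerance, in score points)."""
    mismatches = []
    for model, datasets in baseline.items():
        for dataset, expected in datasets.items():
            actual = measured.get(model, {}).get(dataset)
            if actual is None or not math.isclose(
                actual, expected, rel_tol=0.0, abs_tol=abs_tol
            ):
                mismatches.append((model, dataset, expected, actual))
    return mismatches


if __name__ == "__main__":
    # Simulate a run in which one score drifted away from the baseline.
    measured = {model: dict(scores) for model, scores in BASELINE.items()}
    measured["internlm-7b-hf"]["ARC-c"] = 30.0
    for model, dataset, expected, actual in find_regressions(BASELINE, measured):
        print(f"{model}/{dataset}: expected {expected}, got {actual}")
```

Using an absolute tolerance rather than exact equality keeps the check robust to float formatting differences such as `66.6` versus `66.60`; whether the real test allows any tolerance is not stated in the source.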