mirror of
https://github.com/open-compass/opencompass.git
synced 2025-05-30 16:03:24 +08:00

* add daily test case
* Update pr-run-test.yml
* Update daily-run-test.yml
* Update daily-run-test.yml
* Update pr-run-test.yml
* Update daily-run-test.yml
* Update oc_score_assert.py
* Update daily-run-test.yml
* Update daily-run-test.yml
* Update daily-run-test.yml
* update testcase baseline
* fix test case name
* add more models into daily test

---------

Co-authored-by: zhulin1 <zhulin1@pjlab.org.cn>
Co-authored-by: Leymore <zfz-960727@163.com>
24 lines
458 B
YAML
internlm-7b-hf:
  ARC-c: 36.27
  chid-dev: 81.68
  chid-test: 83.67
  openai_humaneval: 10.37
  openbookqa: 44.4
  openbookqa_fact: 73.2

internlm-chat-7b-hf:
  ARC-c: 36.95
  chid-dev: 71.78
  chid-test: 76.87
  openai_humaneval: 21.34
  openbookqa: 66.6
  openbookqa_fact: 80.4

chatglm3-6b-base-hf:
  ARC-c: 43.05
  chid-dev: 80.2
  chid-test: 80.77
  openai_humaneval: 20.73
  openbookqa: 79.8
  openbookqa_fact: 92.2
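
A baseline file like this is typically consumed by a test that compares freshly measured scores against the stored values. The following is a minimal sketch of such a check, not the repository's actual `oc_score_assert.py`, whose contents are not shown here. The function name `find_mismatches`, the `tolerance` parameter and its default, and the inline `BASELINE` dict (standing in for the parsed YAML) are all assumptions made for illustration.

```python
# Hypothetical sketch: compare measured scores with the baseline values above.
# BASELINE mirrors the YAML file; a real test would load the file instead.
BASELINE = {
    "internlm-7b-hf": {
        "ARC-c": 36.27, "chid-dev": 81.68, "chid-test": 83.67,
        "openai_humaneval": 10.37, "openbookqa": 44.4, "openbookqa_fact": 73.2,
    },
    "internlm-chat-7b-hf": {
        "ARC-c": 36.95, "chid-dev": 71.78, "chid-test": 76.87,
        "openai_humaneval": 21.34, "openbookqa": 66.6, "openbookqa_fact": 80.4,
    },
    "chatglm3-6b-base-hf": {
        "ARC-c": 43.05, "chid-dev": 80.2, "chid-test": 80.77,
        "openai_humaneval": 20.73, "openbookqa": 79.8, "openbookqa_fact": 92.2,
    },
}


def find_mismatches(measured, baseline, tolerance=0.01):
    """Return (model, dataset, expected, actual) for every score that is
    missing from `measured` or differs from the baseline by more than
    `tolerance` (an assumed threshold, not taken from the repository)."""
    mismatches = []
    for model, datasets in baseline.items():
        for dataset, expected in datasets.items():
            # A missing model or dataset yields None and is reported.
            actual = measured.get(model, {}).get(dataset)
            if actual is None or abs(actual - expected) > tolerance:
                mismatches.append((model, dataset, expected, actual))
    return mismatches
```

A daily test could call `find_mismatches` with the scores from the latest run and fail when the returned list is non-empty. Reporting every mismatch at once, rather than stopping at the first, makes it easier to see whether a regression affects one dataset or a whole model.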