OpenCompass/.github/scripts/oc_score_baseline.yaml
zhulinJulia24 a9d6b6461f
[ci] react daily test (#1668)
* update

* refactor summarize

* Update daily-run-test.yml

---------

Co-authored-by: zhulin1 <zhulin1@pjlab.org.cn>
2024-11-12 18:40:27 +08:00

internlm2_5-7b-hf:
    demo_gsm8k: 42.19
    race-middle: 91.78
    race-high: 90.02

internlm2_5-7b_hf:
    demo_gsm8k: 42.19
    race-middle: 91.78
    race-high: 90.02

internlm2-1.8b-hf:
    demo_gsm8k: 15.62
    race-middle: 71.66
    race-high: 66.38

internlm2_5-7b-chat-lmdeploy:
    demo_gsm8k: 84.38
    race-middle: 92.76
    race-high: 90.54

internlm2-chat-1.8b-lmdeploy:
    demo_gsm8k: 31
    race-middle: 81.34
    race-high: 73.96

internlm2_5-7b-chat_hf:
    demo_gsm8k: 87.50
    race-middle: 92.76
    race-high: 90.48

lmdeploy-api-test:
    gsm8k: 83.78
    race-middle: 92.41
    race-high: 90.37