OpenCompass/.github/scripts/oc_score_baseline.yaml
zhulinJulia24 abdcee68f6
[CI] Update daily test metrics threshold (#1812)
* Update daily-run-test.yml

* Update pr-run-test.yml

* update

---------

Co-authored-by: MaiziXiao <xxllcc1993@gmail.com>
2025-01-09 18:16:24 +08:00

35 lines
809 B
YAML

internlm2_5-7b-hf:
  demo_gsm8k_accuracy: 42.19
  race-middle_accuracy: 91.78
  race-high_accuracy: 90.02

internlm2_5-7b_hf:
  demo_gsm8k_accuracy: 42.19
  race-middle_accuracy: 91.78
  race-high_accuracy: 90.02

internlm2-1.8b-hf:
  demo_gsm8k_accuracy: 15.62
  race-middle_accuracy: 71.66
  race-high_accuracy: 66.38

internlm2_5-7b-chat-lmdeploy:
  demo_gsm8k_accuracy: 89.06
  race-middle_accuracy: 92.76
  race-high_accuracy: 90.54

internlm2-chat-1.8b-lmdeploy:
  demo_gsm8k_accuracy: 31
  race-middle_accuracy: 81.34
  race-high_accuracy: 73.96

internlm2_5-7b-chat_hf:
  demo_gsm8k_accuracy: 87.50
  race-middle_accuracy: 92.76
  race-high_accuracy: 90.48

lmdeploy-api-test:
  gsm8k_accuracy: 68.75
  race-middle_accuracy: 87.50
  race-high_accuracy: 93.75
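
A baseline file like this one is typically read by a CI check that compares freshly measured scores against the stored values. The snippet below is a minimal sketch of such a comparison, not the script this repository actually runs: the function name `find_regressions`, the `tolerance` parameter, its default of 5.0 points, and the use of an absolute difference are all assumptions made for illustration. The baseline is written inline as a dict (one model copied from the file) so the sketch needs no YAML parser.

```python
from typing import Dict, List

# Hypothetical excerpt of the baseline, copied from the YAML for one model.
BASELINE: Dict[str, Dict[str, float]] = {
    "internlm2_5-7b-hf": {
        "demo_gsm8k_accuracy": 42.19,
        "race-middle_accuracy": 91.78,
        "race-high_accuracy": 90.02,
    },
}


def find_regressions(
    baseline: Dict[str, Dict[str, float]],
    measured: Dict[str, Dict[str, float]],
    tolerance: float = 5.0,  # assumed threshold, in accuracy points
) -> List[str]:
    """Return one message per metric that is missing or outside the tolerance."""
    failures: List[str] = []
    for model, metrics in baseline.items():
        for metric, expected in metrics.items():
            actual = measured.get(model, {}).get(metric)
            if actual is None:
                # The run did not report this metric at all.
                failures.append(f"{model}/{metric}: missing")
            elif abs(actual - expected) > tolerance:
                # The score moved further from the baseline than allowed.
                failures.append(
                    f"{model}/{metric}: expected {expected}, got {actual}"
                )
    return failures


if __name__ == "__main__":
    # Made-up measurements for demonstration only.
    sample = {
        "internlm2_5-7b-hf": {
            "demo_gsm8k_accuracy": 40.0,
            "race-middle_accuracy": 80.0,
        }
    }
    for message in find_regressions(BASELINE, sample):
        print(message)
```

Using an absolute difference flags unexpected jumps as well as drops, which is a design choice of this sketch; a check that only guards against regressions would compare `expected - actual` against the tolerance instead.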