wangjingchao
a3bac3611a
[Fix] Fix bugs when adding QwQ models
2025-03-13 17:16:16 +08:00
Hoter Young
4ef3c5083b
[Feature] Support QwQ-32B and QwQ-Plus
2025-03-13 17:07:15 +08:00
Hoter Young
6b1671e029
[Chores] Change datasets path
2025-03-12 17:29:01 +08:00
Hoter Young
b3b5bacc4f
[Feature] Ensure QwQ predictions are processed before evaluation for configured datasets
2025-02-15 14:12:16 +08:00
Hoter Young
879b181c1b
add some features (#32)
* [Feature] Support answer extraction of QwQ when evaluating HuSimpleQA
* [Feature] Support multi-language summarization in HuSimpleQASummarizer
* [Feature] Support DeepSeek-R1-Distill-Qwen_32B_turbomind
2025-02-14 20:44:53 +08:00
Hoter Young
4114079aed
[Fix] Fix HuSimpleQASummarizer bug (#28)
2025-02-13 11:28:49 +08:00
hoteryoung
23210e089a
[Refactor] Change HuSimpleQA to subjective evaluation
2025-02-12 20:25:03 +08:00
wujiang
1e1acf9236
add HuSimpleQA
2025-02-10 21:22:45 +08:00