Mirror of https://github.com/open-compass/opencompass.git, synced 2025-05-30 16:03:24 +08:00
* added base_models_abbrs to references (passed from LMEvaluator); added bradleyterry subjective evaluation method for wildbench, alpacaeval, and compassarena datasets; added all_scores output files for reference in CompassArenaBradleyTerrySummarizer
* added bradleyterry subjective evaluation method to arena_hard dataset

| Name |
|---|
| alignbench |
| alpaca_eval |
| arena_hard |
| compass_arena_subjective_bench |
| compassarena |
| compassbench |
| flames |
| fofo |
| followbench |
| hellobench |
| judgerbench |
| multiround |
| wildbench |