OpenCompass/opencompass/configs
Alexander Lam f871e80887
[Feature] Add Bradley-Terry Subjective Evaluation method to Arena Hard dataset (#1802)
* added base_models_abbrs to references (passed from LMEvaluator); added bradleyterry subjective evaluation method for wildbench, alpacaeval, and compassarena datasets; added all_scores output files for reference in CompassArenaBradleyTerrySummarizer

* added bradleyterry subjective evaluation method to arena_hard dataset
2025-01-03 16:33:43 +08:00
dataset_collections [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
datasets [Feature] Add Bradley-Terry Subjective Evaluation method to Arena Hard dataset (#1802) 2025-01-03 16:33:43 +08:00
models [Update] Update requirement and deepseek configurations (#1764) 2024-12-17 10:16:47 +08:00
summarizers [Feature] Add Openai Simpleqa dataset (#1720) 2024-11-28 19:16:07 +08:00