OpenCompass/opencompass/summarizers/subjective
liushz a6f67e1a65
[Fix] Fix Math Evaluation with Judge Model Evaluator & Add README (#1103)
* Add Math Evaluation with Judge Model Evaluator

* Fix Llama-3 meta template

* Fix MATH with JudgeLM Evaluation

---------

Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>
2024-04-28 21:58:58 +08:00
__init__.py [Feature] add support for Flames datasets (#1093) 2024-04-28 18:56:24 +08:00
alignmentbench.py [Sync] update taco (#1030) 2024-04-09 17:50:23 +08:00
all_obj.py [Fix] Fix Math Evaluation with Judge Model Evaluator & Add README (#1103) 2024-04-28 21:58:58 +08:00
alpacaeval.py [Feature] Add multi-model judge and fix some problems (#1016) 2024-04-02 11:52:06 +08:00
arenahard.py [Feature] support arenahard evaluation (#1096) 2024-04-26 15:42:00 +08:00
compass_arena.py [Sync] deprecate old mbpps (#1064) 2024-04-19 20:49:46 +08:00
corev2.py reorganize subject files (#801) 2024-01-16 18:03:11 +08:00
creationbench.py reorganize subject files (#801) 2024-01-16 18:03:11 +08:00
flames.py [Feature] add support for Flames datasets (#1093) 2024-04-28 18:56:24 +08:00
information_retrival.py reorganize subject files (#801) 2024-01-16 18:03:11 +08:00
mtbench.py [Sync] deprecate old mbpps (#1064) 2024-04-19 20:49:46 +08:00
multiround.py [Fix] Fix MultiRound Subjective Evaluation (#1043) 2024-04-22 12:06:03 +08:00
subjective_post_process.py reorganize subject files (#801) 2024-01-16 18:03:11 +08:00
utils.py fix compass arena (#854) 2024-01-30 16:34:38 +08:00