OpenCompass

mirror of https://github.com/open-compass/opencompass.git synced 2025-05-30 16:03:24 +08:00

liushz a6f67e1a65 [Fix] Fix Math Evaluation with Judge Model Evaluator & Add README (#1103) * Add Math Evaluation with Judge Model Evaluator * Fix Llama-3 meta template * Fix MATH with JudgeLM Evaluation --------- Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>		2024-04-28 21:58:58 +08:00
__init__.py	[Feature] add support for Flames datasets (#1093)	2024-04-28 18:56:24 +08:00
alignmentbench.py	[Sync] update taco (#1030)	2024-04-09 17:50:23 +08:00
all_obj.py	[Fix] Fix Math Evaluation with Judge Model Evaluator & Add README (#1103)	2024-04-28 21:58:58 +08:00
alpacaeval.py	[Feature] Add multi-model judge and fix some problems (#1016)	2024-04-02 11:52:06 +08:00
arenahard.py	[Feature] support arenahard evaluation (#1096)	2024-04-26 15:42:00 +08:00
compass_arena.py	[Sync] deprecate old mbpps (#1064)	2024-04-19 20:49:46 +08:00
corev2.py	reorganize subject files (#801)	2024-01-16 18:03:11 +08:00
creationbench.py	reorganize subject files (#801)	2024-01-16 18:03:11 +08:00
flames.py	[Feature] add support for Flames datasets (#1093)	2024-04-28 18:56:24 +08:00
information_retrival.py	reorganize subject files (#801)	2024-01-16 18:03:11 +08:00
mtbench.py	[Sync] deprecate old mbpps (#1064)	2024-04-19 20:49:46 +08:00
multiround.py	[Fix] Fix MultiRound Subjective Evaluation (#1043)	2024-04-22 12:06:03 +08:00
subjective_post_process.py	reorganize subject files (#801)	2024-01-16 18:03:11 +08:00
utils.py	fix compass arena (#854)	2024-01-30 16:34:38 +08:00