OpenCompass/opencompass
liushz a6f67e1a65
[Fix] Fix Math Evaluation with Judge Model Evaluator & Add README (#1103)
* Add Math Evaluation with Judge Model Evaluator

* Fix Llama-3 meta template

* Fix MATH with JudgeLM Evaluation

---------

Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>
2024-04-28 21:58:58 +08:00
cli [Deperecate] Remove multi-modal related stuff (#1072) 2024-04-26 21:20:14 +08:00
datasets [Feature] add support for Flames datasets (#1093) 2024-04-28 18:56:24 +08:00
lagent Update CIBench (#1089) 2024-04-26 18:46:02 +08:00
metrics [Feat] Support multi-modal evaluation on MME benchmark. (#197) 2023-08-21 15:53:20 +08:00
models adapt to lmdeploy v0.4.0 (#1073) 2024-04-28 19:57:40 +08:00
openicl [Feature] Support Math evaluation via judgemodel (#1094) 2024-04-26 14:56:23 +08:00
partitioners [Deperecate] Remove multi-modal related stuff (#1072) 2024-04-26 21:20:14 +08:00
runners [Fix] Fix sequential runner (#1070) 2024-04-23 11:31:10 +08:00
summarizers [Fix] Fix Math Evaluation with Judge Model Evaluator & Add README (#1103) 2024-04-28 21:58:58 +08:00
tasks [Fix] python path bug (#1063) 2024-04-26 21:58:45 +08:00
utils fix output typing, change mutable list to immutable tuple (#989) 2024-04-26 23:07:34 +08:00
__init__.py [Sync] Bump version to 0.2.4 (#1052) 2024-04-16 18:09:46 +08:00
registry.py [Deperecate] Remove multi-modal related stuff (#1072) 2024-04-26 21:20:14 +08:00