OpenCompass/opencompass
liushz 98c58f8a6c
[Feature] Add compassbench knowledge&math part (#1342)
* Add Math Evaluation with Judge Model Evaluator

* Fix Llama-3 meta template

* Fix MATH with JudgeLM Evaluation

* Update accelerator

* Update MathBench

* Update accelerator

* Add Doc for accelerator

* Update compassbench august wiki&math

* Update compassbench_aug_gen_068af0.py

* Update

---------

Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>
Co-authored-by: zhangsongyang <zhangsongyang@pjlab.org.cn>
2024-07-19 22:54:46 +08:00
cli [Fix] add bc for alignbench summarizer (#1306) 2024-07-12 11:06:20 +08:00
datasets [Feature] Add compassbench knowledge&math part (#1342) 2024-07-19 22:54:46 +08:00
lagent Update CIBench (#1089) 2024-04-26 18:46:02 +08:00
metrics [Feat] Support multi-modal evaluation on MME benchmark. (#197) 2023-08-21 15:53:20 +08:00
models [Fix] Fix lint (#1334) 2024-07-18 17:15:06 +08:00
openicl [Sync] Sync with internal codes 2024.06.28 (#1279) 2024-06-28 14:16:34 +08:00
partitioners [Refactor] Reorganize subjective eval (#1284) 2024-07-05 22:11:37 +08:00
runners NPU adaptation (#1250) 2024-07-03 18:55:19 +08:00
summarizers [Doc] Update NeedleBench Docs (#1330) 2024-07-18 13:16:19 +08:00
tasks force register (#1311) 2024-07-11 19:59:35 +08:00
utils [Doc] quick start swap tabs (#1263) 2024-07-05 23:51:42 +08:00
__init__.py [Sync] bump version 0.2.6+local (#1294) 2024-07-06 00:44:06 +08:00
registry.py [Deprecate] Remove multi-modal related stuff (#1072) 2024-04-26 21:20:14 +08:00