OpenCompass/opencompass
TTTTTiam 2a62bea1a4
add evaluation of scibench (#393)
* add evaluation of scibench

* add evaluation of scibench

* update scibench

* remove scibench evaluator

---------

Co-authored-by: Leymore <zfz-960727@163.com>
2023-09-22 17:42:08 +08:00
datasets add evaluation of scibench (#393) 2023-09-22 17:42:08 +08:00
metrics [Feat] Support multi-modal evaluation on MME benchmark. (#197) 2023-08-21 15:53:20 +08:00
models Support GSM8k evaluation with tools by Lagent and LangChain (#277) 2023-09-22 15:28:22 +08:00
multimodal [Fix] Fix performance issue of visualglm. (#424) 2023-09-21 19:54:23 +08:00
openicl [Sync] Initial support of subjective evaluation (#421) 2023-09-22 15:42:31 +08:00
partitioners [Fix] keep keys (#431) 2023-09-22 17:30:54 +08:00
runners [Sync] Initial support of subjective evaluation (#421) 2023-09-22 15:42:31 +08:00
tasks add evaluation of scibench (#393) 2023-09-22 17:42:08 +08:00
utils [Sync] Initial support of subjective evaluation (#421) 2023-09-22 15:42:31 +08:00
__init__.py Bump version to 0.1.4 (#367) 2023-09-08 20:51:38 +08:00
registry.py [Feature] Add Tree-of-Thought method (#173) 2023-08-23 12:23:05 +08:00