OpenCompass/opencompass
klein e4830a6926
Update CIBench (#1089)
* modify the requirements/runtime.txt: numpy==1.23.4 --> numpy>=1.23.4

* update cibench: dataset and evaluation

* fix cibench summarizer bug

* update cibench

* move extract_code import

---------

Co-authored-by: zhangchuyu@pjlab.org.cn <zhangchuyu@pjlab.org.cn>
Co-authored-by: Leymore <zfz-960727@163.com>
2024-04-26 18:46:02 +08:00
cli [Sync] deprecate old mbpps (#1064) 2024-04-19 20:49:46 +08:00
datasets Update CIBench (#1089) 2024-04-26 18:46:02 +08:00
lagent Update CIBench (#1089) 2024-04-26 18:46:02 +08:00
metrics [Feat] Support multi-modal evaluation on MME benchmark. (#197) 2023-08-21 15:53:20 +08:00
models Update CIBench (#1089) 2024-04-26 18:46:02 +08:00
multimodal [Feature]: To be compatible with the latest version of MiniGPT-4 (#539) 2023-11-04 09:50:36 +08:00
openicl [Feature] Support Math evaluation via judgemodel (#1094) 2024-04-26 14:56:23 +08:00
partitioners [Feature] Add multi-model judge and fix some problems (#1016) 2024-04-02 11:52:06 +08:00
runners [Fix] Fix sequential runner (#1070) 2024-04-23 11:31:10 +08:00
summarizers Update CIBench (#1089) 2024-04-26 18:46:02 +08:00
tasks [Feature] Support Math evaluation via judgemodel (#1094) 2024-04-26 14:56:23 +08:00
utils [Feature] Support Math evaluation via judgemodel (#1094) 2024-04-26 14:56:23 +08:00
__init__.py [Sync] Bump version to 0.2.4 (#1052) 2024-04-16 18:09:46 +08:00
registry.py [Sync] update taco (#1030) 2024-04-09 17:50:23 +08:00