OpenCompass/opencompass
Mo Li b50d163265
[Fix] Refactor Needlebench Configs for CLI Testing Support (#1020)
* add needlebench datasets suffix

* fix import

* update run.py args for summarizer key and dataset suffix

* update utils/run.py
2024-04-07 15:12:56 +08:00
datasets [Feature] update needlebench and configs (#986) 2024-03-25 18:05:01 +08:00
lagent [Feat] minor update agent related (#839) 2024-01-26 14:15:51 +08:00
metrics [Feat] Support multi-modal evaluation on MME benchmark. (#197) 2023-08-21 15:53:20 +08:00
models [Feature] Support AlpacaEval_V2 (#1006) 2024-03-28 16:49:04 +08:00
multimodal [Feature]: To be compatible with the latest version of MiniGPT-4 (#539) 2023-11-04 09:50:36 +08:00
openicl [Feature] Add multi-model judge and fix some problems (#1016) 2024-04-02 11:52:06 +08:00
partitioners [Feature] Add multi-model judge and fix some problems (#1016) 2024-04-02 11:52:06 +08:00
runners [Fix] base.py change status into list (#994) 2024-03-22 17:06:34 +08:00
summarizers [Feature] Add multi-model judge and fix some problems (#1016) 2024-04-02 11:52:06 +08:00
tasks [Feature] Add multi-model judge and fix some problems (#1016) 2024-04-02 11:52:06 +08:00
utils [Fix] Refactor Needlebench Configs for CLI Testing Support (#1020) 2024-04-07 15:12:56 +08:00
__init__.py [Sync] Bump version 0.2.3 (#957) 2024-03-12 11:51:56 +08:00
registry.py [Sync] update github token (#475) 2023-10-13 06:50:54 -05:00