OpenCompass/opencompass
Chang Lan a927bba1cf
[Fix] Fix RULER datasets (#1628)
We need to ensure that we don't import any name that ends with "_datasets";
otherwise it will be picked up by the runner, leading to duplicate or unwanted
datasets being evaluated.
2024-10-22 11:59:02 +08:00
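The failure mode described in this commit message can be illustrated with a minimal sketch. It assumes the collection step gathers every list bound to a name ending in "_datasets" from a config namespace. The helper `collect_datasets` and the dataset entries below are hypothetical and only stand in for the real runner and the RULER configs.

```python
def collect_datasets(namespace):
    """Hypothetical collector: concatenate every list whose name ends in '_datasets'."""
    collected = []
    for name, value in namespace.items():
        if name.endswith('_datasets') and isinstance(value, list):
            collected.extend(value)
    return collected


# Made-up dataset entries, for illustration only.
ruler_4k_datasets = [{'abbr': 'ruler_niah_4k'}]
ruler_combined_datasets = ruler_4k_datasets + [{'abbr': 'ruler_vt_4k'}]

# Leaky namespace: the intermediate list is imported under a name that
# also ends in '_datasets', so it is collected a second time.
leaky = {
    'ruler_4k_datasets': ruler_4k_datasets,
    'ruler_combined_datasets': ruler_combined_datasets,
}

# Clean namespace: the intermediate import is bound to a name without the
# suffix, so only the aggregate list is collected.
clean = {
    'ruler_4k': ruler_4k_datasets,
    'ruler_combined_datasets': ruler_combined_datasets,
}

leaky_abbrs = [d['abbr'] for d in collect_datasets(leaky)]
clean_abbrs = [d['abbr'] for d in collect_datasets(clean)]

print(leaky_abbrs)  # the 4k entry appears twice
print(clean_abbrs)  # each entry appears once
```

Under these assumptions, binding the intermediate import to a name without the "_datasets" suffix is enough to keep it out of the collected list.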
cli [Feature] Support import configs/models/summarizers from whl (#1376) 2024-08-01 00:42:48 +08:00
configs [Fix] Fix RULER datasets (#1628) 2024-10-22 11:59:02 +08:00
datasets [Feature] Support LiveCodeBench (#1617) 2024-10-21 20:50:39 +08:00
lagent Update CIBench (#1089) 2024-04-26 18:46:02 +08:00
metrics [Feat] Support multi-modal evaluation on MME benchmark. (#197) 2023-08-21 15:53:20 +08:00
models [Bug] Fix-NPU-Support (#1618) 2024-10-21 17:42:53 +08:00
openicl [Feature] Support LiveCodeBench (#1617) 2024-10-21 20:50:39 +08:00
partitioners [Fix] fix duplicate error in partitioner (#1552) 2024-09-23 19:45:21 +08:00
runners [Fix] Update dlc runner python env (#1604) 2024-10-14 15:50:21 +08:00
summarizers Upload HelloBench (#1607) 2024-10-15 17:11:37 +08:00
tasks [Feature] Add model postprocess function (#1484) 2024-09-05 21:10:29 +08:00
utils [Feature] Support LiveCodeBench (#1617) 2024-10-21 20:50:39 +08:00
__init__.py [BUMP] Bump version to 0.3.3 (#1581) 2024-09-30 16:57:41 +08:00
registry.py [Feature] Add Judgerbench and reorg subeval (#1593) 2024-10-15 16:36:05 +08:00