OpenCompass

mirror of https://github.com/open-compass/opencompass.git synced 2025-05-30 16:03:24 +08:00

Wei Li b84518c656 [Dataset] Support MedMCQA and MedBullets benchmark (#2054)	2025-05-13 17:10:50 +08:00

* support medmcqa and medbullets benchmark
* Add Medbullets data folder for benchmark support
* revise gen name
* revise config file & remove csv file & add dataset info to dataset-index.yml
* remove csv file
* remove print in medbullets.py
* revise class name
* update_oss_info

Co-authored-by: MaiziXiao <xxllcc1993@gmail.com>
__init__.py	[Refactor] Refactorize openicl eval task (#1990)	2025-04-09 15:52:23 +08:00
abbr.py	[Feature] Add multi-model judge and fix some problems (#1016)	2024-04-02 11:52:06 +08:00
auxiliary.py	[Feat] support humaneval and mbpp pass@k (#598)	2023-11-16 21:22:06 +08:00
build.py	[Feature] Support Dataset Repeat and G-Pass Compute for Each Evaluator (#1886)	2025-02-26 19:43:12 +08:00
collect_env.py	[Feature] Update pip install (#1324)	2024-07-29 18:32:50 +08:00
datasets_info.py	[Dataset] Support MedMCQA and MedBullets benchmark (#2054)	2025-05-13 17:10:50 +08:00
datasets.py	[Update] Update o1 eval prompt (#1806)	2025-01-07 00:14:32 +08:00
dependency.py	[Feature]: Use multimodal (#73)	2023-08-03 11:07:50 +08:00
dict_postprocessors.py	[Feature] Add Judgerbench and reorg subeval (#1593)	2024-10-15 16:36:05 +08:00
file.py	fix output typing, change mutable list to immutable tuple (#989)	2024-04-26 23:07:34 +08:00
fileio.py	[Update] Update Skywork/Qwen-QwQ (#1728)	2024-12-05 19:30:43 +08:00
lark.py	[Feature] Several enhancements (#142)	2023-08-01 18:19:49 +08:00
logging.py	[Enhance] Supress warning raised by get_logger (#353)	2023-09-04 15:27:08 +08:00
menu.py	[Feat] Support local runner for windows (#515)	2023-10-27 17:16:22 +08:00
network.py	[Update] Update Skywork/Qwen-QwQ (#1728)	2024-12-05 19:30:43 +08:00
prompt.py	Support wildbench (#1266)	2024-06-24 13:16:27 +08:00
result_station.py	[Fix] Fix CLI option for results persistence (#1920)	2025-03-07 18:24:30 +08:00
run.py	[Update] Update OlympiadBench and Update LLM Judge (#1954)	2025-03-18 20:15:20 +08:00
text_postprocessors.py	[Feature] Math Verify with model post_processor (#1881)	2025-02-20 19:32:12 +08:00
types.py	[Sync] Initial support of subjective evaluation (#421)	2023-09-22 15:42:31 +08:00