OpenCompass

mirror of https://github.com/open-compass/opencompass.git synced 2025-05-30 16:03:24 +08:00

Author	SHA1	Message	Date
Leymore	861942ab1b	[Feature] Add lawbench (#460 ) * add lawbench * update requirements * update	2023-10-13 06:51:36 -05:00
Leymore	fbf5089c40	[Sync] update github token (#475 )	2023-10-13 06:50:54 -05:00
Leymore	d7ff933a73	[Fix] Use jieba rouge in lcsts (#459 ) * use jieba rouge in lcsts * use rouge_chinese	2023-10-09 10:10:33 +08:00
Tong Gao	119bfd1569	[Refactor] Move fix_id_list to Retriever (#442 ) * [Refactor] Move fix_id_list to Retriever * update * move to base * fix	2023-10-07 12:53:41 +08:00
Lyu Han	6738247142	Integrate turbomind inference via its RPC API instead of its python API (#414 ) * support tis * integrate turbomind inference via its RPC API instead of its python API * update guide * update ip address spec * update according to reviewer's comments	2023-10-07 10:27:48 +08:00
Leymore	9db5652638	[Feature] re-implement ceval load dataset (#446 )	2023-09-27 21:18:48 +08:00
philipwangOvO	3bb3d330eb	[Sync] Update LongEval (#443 )	2023-09-27 16:32:40 +08:00
Kevin Wang	dc1b82c346	[SIG] add GLUE_MRPC dataset (#440 )	2023-09-27 11:44:54 +08:00
Kevin Wang	14fdecfecc	[Dataset] add GLUE QQP dataset (#438 )	2023-09-27 11:36:43 +08:00
Kevin Wang	d8354fe5d8	[SIG] add GLUE_CoLA dataset (#406 ) * [Dataset] add GLUE_CoLA dataset * [update] use HFDataset to load glue/cola dataset * update --------- Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>	2023-09-27 11:30:44 +08:00
Kevin Wang	012546666b	[SIG] add WikiText-2&103 (#397 ) * fix conflict * add eval_cfg	2023-09-26 14:31:15 +08:00
liushz	c5224c2a91	[Feature] Add kaoshi dataset (#392 ) * Add ToT method * Update ToT * Update ToT * Update ToT * Update ToT * Update ToT * Add Koashi * Update Kaoshi * Update Kaoshi * Update kaoshi * Update kaoshi * Update Kaoshi * Update Kaoshi * Update Kaoshi * Update Kaoshi * update Kaoshi * update * update * fix --------- Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>	2023-09-22 18:46:33 +08:00
TTTTTiam	2a62bea1a4	add evaluation of scibench (#393 ) * add evaluation of scibench * add evaluation of scibench * update scibench * remove scibench evaluator --------- Co-authored-by: Leymore <zfz-960727@163.com>	2023-09-22 17:42:08 +08:00
Ma Zerun	0f2c388280	Support GSM8k evaluation with tools by Lagent and LangChain (#277 ) * Support GSM8k evaluation with tools by Lagent and LangChain * Avoid to use MMEngine new feature * update document --------- Co-authored-by: Leymore <zfz-960727@163.com>	2023-09-22 15:28:22 +08:00
Yike Yuan	97fdc51102	[Fix] Fix performance issue of visualglm. (#424 ) * [Fix] Visualglm performance fixed. * [Fix] Hide ckpt path.	2023-09-21 19:54:23 +08:00
Hubert	8803f7f7a6	[Feat] support antropics evals dataset (#422 ) * [Feat] support anthropics ai risk dataset * [Feat] support anthropics evals dataset * [Feat] support anthropics evals dataset	2023-09-20 18:36:44 +08:00
Yike Yuan	bd50bad8b5	[Feat] Support mm models on public dataset and fix several issues. (#412 ) * [Feat] Add public dataset support for visualglm, qwenvl, and flamingo * [Fix] MMBench related changes. * [Fix] Openflamingo inference. * [Fix] Hide ckpt path. * [Fix] Pre-commit. --------- Co-authored-by: Haodong Duan <dhd.efz@gmail.com>	2023-09-19 19:08:44 +08:00
Yuanhan Zhang	7c2726c23b	[Model] Yhzhang/add mlugowl llamaadapter (#405 ) * refine gitignore * [Feature]: Add minigpt-4 * [Feature]: Add mm local runner * [Feature]: Add instructblip * add otter and llama-adapter * add owl * add llama2-adapter and owl * lint * [Feature]: Add minigpt-4 * [Feature]: Add instructblip * add otter and llama-adapter * add owl * add llama2-adapter and owl * lint * lint * update * lint * lint * add __init__.py * update * update * update * update * [Feature]: Add minigpt-4 * [Feature]: Add mm local runner * [Feature]: Add instructblip * add otter and llama-adapter * add owl * add llama2-adapter and owl * lint * [Feature]: Add minigpt-4 * [Feature]: Add instructblip * add otter and llama-adapter * add owl * add llama2-adapter and owl * lint * lint * update * lint * lint * add __init__.py * update * update * update * update * optimize mmbench dataset args * update * update * run commit hook --------- Co-authored-by: liuyuan <3463423099@qq.com> Co-authored-by: kennymckormick <dhd@pku.edu.cn> Co-authored-by: kennymckormick <dhd.efz@gmail.com>	2023-09-19 14:21:26 +08:00
Hubert	2c15a0c01d	[Feat] refine docs and codes for more user guides (#409 )	2023-09-18 16:12:13 +08:00
Hubert	a11cb45c83	[Feat] implementation for support promptbench (#239 ) * [Feat] support adv_glue dataset for adversarial robustness * reorg files * minor fix * minor fix * support prompt bench demo * minor fix * minor fix * minor fix * minor fix * minor fix * minor fix * minor fix * minor fix	2023-09-15 15:06:53 +08:00
Hubert	de8a154795	[Feat] support ds1000 dataset (#395 ) * [Feat] support ds1000 datase	2023-09-15 12:50:27 +08:00
Yuan Liu	545d50a4c0	[Fix]: Add has_image to scienceqa (#391 ) Co-authored-by: bensenliu <bensenliu@tencent.com>	2023-09-13 13:07:14 +08:00
Xidong Wang	47a752cd56	[Dataset] Add CMB (#376 ) * Add CMB * modify CMB --------- Co-authored-by: wangxidong <xidongw@163.com>	2023-09-12 19:16:41 +08:00
Tong Gao	b9b145c335	[Docs] Fix incorrect name in get_started (#380 )	2023-09-11 16:10:09 +08:00
Leymore	2c915218e8	[Feaure] Add new models: baichuan2, tigerbot, vicuna v1.5 (#373 ) * add bag of new models: baichuan2, tigerbot, vicuna v1.5 * update * re-organize models * update readme * update	2023-09-08 15:41:20 +08:00
Leymore	b48d084020	[Fix] update bbh implement & fix bbh suffix (#371 )	2023-09-08 15:14:30 +08:00
Yixiao Fang	fada77a31c	[Feature] Add open source dataset eval config of instruct-blip (#370 ) * add configs * refactor model * add post processor and prompt constructor	2023-09-08 15:07:09 +08:00
Tong Gao	b11838f80a	[Feature] Update claude2 postprocessor (#365 ) * [Feature] Update claude2 config * [Feature] Update claude2 postprocessor	2023-09-07 11:26:26 +08:00
Yike Yuan	b885ec84df	[Feat] Support Qwen-VL-Chat on MMBench. (#312 ) * [Feat] Support Qwen-VL base. * [Feat] Support Qwen-VL-Chat on MMBench. * [Fix] Add postprocessor and fix format. * [Fix] Add type hint and remove redundant codes. * [Fix] fix bugs in postprocessor. * [Fix] Use given commit id.	2023-09-06 18:42:19 +08:00
Hubert	ddb8197212	[Feat] support wizardcoder series (#344 ) * [Feat] support wizardcoder series * minor fix	2023-09-06 17:52:35 +08:00
Leymore	764c2f799a	[Fix] update qwen config (#358 )	2023-09-05 10:15:19 +08:00
Yuanhan Zhang	f2dd98ca7a	[Feat] Support LLaVA and mPLUG-Owl (#331 ) * refine gitignore * [Feature]: Add minigpt-4 * [Feature]: Add mm local runner * [Feature]: Add instructblip * add otter and llama-adapter * add owl * add llama2-adapter and owl * lint * [Feature]: Add minigpt-4 * [Feature]: Add instructblip * add otter and llama-adapter * add owl * add llama2-adapter and owl * lint * lint * update * lint * lint * add __init__.py * update * update * update --------- Co-authored-by: liuyuan <3463423099@qq.com>	2023-09-01 23:32:05 +08:00
Tong Gao	166022f568	[Docs] Update docs for new entry script (#246 ) * update docs * update docs * update * update en docs * update * update --------- Co-authored-by: Leymore <zfz-960727@163.com>	2023-08-31 16:43:55 +08:00
Li Bo	a4d6840739	[Feat] Add Otter to OpenCompass MMBench Evaluation (#232 ) * add otter model for opencompass mmbench * add docs * add readme docs * debug for otter opencomass eval * delete unused folders * change to default data path * remove unused files * remove unused files * update * update config file * flake8 lint formated and add prompt generator * add prompt generator to config * add a specific postproecss * add post processor * add post processor * add post processor * update according to suggestions * remove unused redefinition	2023-08-31 12:55:53 +08:00
Leymore	7ca6ba625e	[Feature] Add qwen & qwen-chat support (#286 ) * add and apply update suffix tool * add tool doc * add qwen configs * add cmmlu * rename bbh * update datasets * delete * update hf_qwen_7b.py	2023-08-31 11:29:05 +08:00
Hubert	fd389e2d78	[Feat] support codellama and preds collection tools (#335 )	2023-08-31 11:14:42 +08:00
Leymore	c26ecdb1b0	[Feature] Add and apply update suffix tool (#280 ) * add and apply update suffix tool * add dataset suffix updater as precommit hook * update workflow * update scripts * update ci * update * ci with py3.8 * run in serial * update bbh * use py 3.10 * update pre commit zh cn	2023-08-28 17:35:04 +08:00
Tong Gao	9058be07b8	[Feature] Simplify entry script (#204 ) * [Feature] Simply entry script * update	2023-08-25 17:36:30 +08:00
Tong Gao	f480b72703	[Feature] Support model-bound prediction postprocessor, use it in Claude (#268 ) * [Feature] Support model-bound text postprocessor, add claude as an example * update * update * minor fix --------- Co-authored-by: zhoufengzhe <zhoufengzhe@pjlab.org.cn>	2023-08-25 16:12:21 +08:00
Tong Gao	fda42fd5fd	[Fix] wrong path in dataset collections (#272 )	2023-08-25 15:50:30 +08:00
Yike Yuan	3f601f420b	[Feat] Support public dataset of visualglm and llava. (#265 ) * [Feat] Add public dataset support of VisualGLM. * [Feat] Refactor LLaVA. * [Feat] Add public dataset support of LlaVA. * [Fix] Add arg.	2023-08-25 15:44:32 +08:00
Yuan Liu	dc6e54f6f4	[Feature]: Verify the acc of these public datasets (#269 ) * [Feature]: Refactor public dataset eval * [Feature]: Verify public dataset acc	2023-08-25 15:01:58 +08:00
philipwangOvO	3f37c40aa3	[Dataset] Refactor LEval	2023-08-25 11:46:23 +08:00
Tong Gao	60c2d3d76b	[Feature] Add Claude support (#253 ) * [Feature] Add Claude support * [Feature] Add Claude support * Update opencompass/models/claude_api.py Co-authored-by: Hubert <42952108+yingfhu@users.noreply.github.com> * raise import erorr --------- Co-authored-by: Hubert <42952108+yingfhu@users.noreply.github.com>	2023-08-24 14:29:45 +08:00
Yuan Liu	343f785b07	[Feature]: Add Flamingo (#258 ) * [Feature]: Add Openflamingo MMBench * [Fix]: Fix import error * [Fix]: Revert task config * [Fix]: Fix path bug	2023-08-24 14:11:29 +08:00
Yixiao Fang	1034c487ef	[Refactor] Refactor instructblip (#227 ) * refactor instructblip * add post processor * add forward * fix lint * update * update	2023-08-23 15:33:59 +08:00
liushz	02ce139bc6	[Feature] Add Tree-of-Thought method (#173 ) * Add ToT method * Update ToT * Update ToT * Update ToT * Update ToT * Update ToT * Update ToT * Update ToT * Update chain_of_thought.md * Update icl_tot_inferencer.py --------- Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>	2023-08-23 12:23:05 +08:00
Leymore	ff5ab92331	[Feature] Add llama2 native implements (#235 ) * add llama2 native implements * rename configs/eval_llama_7b.py --------- Co-authored-by: zhoufengzhe <zhoufengzhe@pjlab.org.cn>	2023-08-23 11:33:25 +08:00
Yike Yuan	8d368d1cd6	[Feat] Support visualglm and llava for MMBench evaluation. (#211 ) * [Feat] Support visualglm inference on MMBench. * [Feat] Support llava inference on MMBench. * [Fix] Fix pre-commit format. * [Fix] Add docstring for llava * [Fix] Fix multi-process inference error of LlaVA and add comments. 1. Set `low_cpu_mem_usage` to False to address device issue. 2. Add docstring and type hints. 3. Rename class and remove registry. * [Fix] Pre-commit fix. * [Fix] add forward entry, add dynamic import to seedbench * [Fix] Fix pre-commit. * [Fix] Fix missing context. * [Fix] Fix docstring.	2023-08-21 15:57:30 +08:00
Yike Yuan	a6552224cb	[Feat] Support multi-modal evaluation on MME benchmark. (#197 ) * [Feat] Support multi-modal evaluation on MME benchmark. * [Fix] Remove debug code. * [Fix] Remove redundant codes and add type hints. * [Fix] Rename in config. * [Fix] Rebase main. * [Fix] Fix isort and yapf conflict.	2023-08-21 15:53:20 +08:00

101 Commits