OpenCompass

mirror of https://github.com/open-compass/opencompass.git synced 2025-05-30 16:03:24 +08:00

Author	SHA1	Message	Date
Leymore	a1782f9a08	[Fix] triviaqa & nq postprocess (#350 )	2023-09-04 15:24:52 +08:00
Tong Gao	ce65d3393b	[Sync] Use finally to clean up temp files (#337 )	2023-09-04 15:20:16 +08:00
Yixiao Fang	2cd994c3d1	[Fix] add import check of multimodal (#352 )	2023-09-04 14:41:07 +08:00
Leymore	8774465a8f	[Enhancement] ignore ZeroRetriever error when id_list provided (#340 )	2023-09-04 11:12:16 +08:00
Yuanhan Zhang	f2dd98ca7a	[Feat] Support LLaVA and mPLUG-Owl (#331 ) * refine gitignore * [Feature]: Add minigpt-4 * [Feature]: Add mm local runner * [Feature]: Add instructblip * add otter and llama-adapter * add owl * add llama2-adapter and owl * lint * [Feature]: Add minigpt-4 * [Feature]: Add instructblip * add otter and llama-adapter * add owl * add llama2-adapter and owl * lint * lint * update * lint * lint * add __init__.py * update * update * update --------- Co-authored-by: liuyuan <3463423099@qq.com>	2023-09-01 23:32:05 +08:00
Leymore	e810974068	[Fix] Fix when missing both pad and eos token (#287 ) * fix when missing both pad and eos token * update pad_token_id impl	2023-08-31 16:53:39 +08:00
Li Bo	a4d6840739	[Feat] Add Otter to OpenCompass MMBench Evaluation (#232 ) * add otter model for opencompass mmbench * add docs * add readme docs * debug for otter opencomass eval * delete unused folders * change to default data path * remove unused files * remove unused files * update * update config file * flake8 lint formated and add prompt generator * add prompt generator to config * add a specific postproecss * add post processor * add post processor * add post processor * update according to suggestions * remove unused redefinition	2023-08-31 12:55:53 +08:00
Leymore	7ca6ba625e	[Feature] Add qwen & qwen-chat support (#286 ) * add and apply update suffix tool * add tool doc * add qwen configs * add cmmlu * rename bbh * update datasets * delete * update hf_qwen_7b.py	2023-08-31 11:29:05 +08:00
Hubert	fd389e2d78	[Feat] support codellama and preds collection tools (#335 )	2023-08-31 11:14:42 +08:00
Tong Gao	9058be07b8	[Feature] Simplify entry script (#204 ) * [Feature] Simply entry script * update	2023-08-25 17:36:30 +08:00
Tong Gao	f480b72703	[Feature] Support model-bound prediction postprocessor, use it in Claude (#268 ) * [Feature] Support model-bound text postprocessor, add claude as an example * update * update * minor fix --------- Co-authored-by: zhoufengzhe <zhoufengzhe@pjlab.org.cn>	2023-08-25 16:12:21 +08:00
Yike Yuan	3f601f420b	[Feat] Support public dataset of visualglm and llava. (#265 ) * [Feat] Add public dataset support of VisualGLM. * [Feat] Refactor LLaVA. * [Feat] Add public dataset support of LlaVA. * [Fix] Add arg.	2023-08-25 15:44:32 +08:00
Yuan Liu	dc6e54f6f4	[Feature]: Verify the acc of these public datasets (#269 ) * [Feature]: Refactor public dataset eval * [Feature]: Verify public dataset acc	2023-08-25 15:01:58 +08:00
philipwangOvO	3f37c40aa3	[Dataset] Refactor LEval	2023-08-25 11:46:23 +08:00
Tong Gao	60c2d3d76b	[Feature] Add Claude support (#253 ) * [Feature] Add Claude support * [Feature] Add Claude support * Update opencompass/models/claude_api.py Co-authored-by: Hubert <42952108+yingfhu@users.noreply.github.com> * raise import erorr --------- Co-authored-by: Hubert <42952108+yingfhu@users.noreply.github.com>	2023-08-24 14:29:45 +08:00
Yuan Liu	343f785b07	[Feature]: Add Flamingo (#258 ) * [Feature]: Add Openflamingo MMBench * [Fix]: Fix import error * [Fix]: Revert task config * [Fix]: Fix path bug	2023-08-24 14:11:29 +08:00
LZHgrla	77745a84ea	[Fix] Fix bugs for PeftModel generate (#252 ) * fix bugs * fix typo	2023-08-24 14:07:33 +08:00
Tong Gao	bd47a00f27	[Fix] use sympy only when necessary (#255 )	2023-08-24 10:15:20 +08:00
Tong Gao	01372a4806	update (#251 )	2023-08-23 16:25:23 +08:00
Yixiao Fang	1034c487ef	[Refactor] Refactor instructblip (#227 ) * refactor instructblip * add post processor * add forward * fix lint * update * update	2023-08-23 15:33:59 +08:00
liushz	02ce139bc6	[Feature] Add Tree-of-Thought method (#173 ) * Add ToT method * Update ToT * Update ToT * Update ToT * Update ToT * Update ToT * Update ToT * Update ToT * Update chain_of_thought.md * Update icl_tot_inferencer.py --------- Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>	2023-08-23 12:23:05 +08:00
Leymore	ff5ab92331	[Feature] Add llama2 native implements (#235 ) * add llama2 native implements * rename configs/eval_llama_7b.py --------- Co-authored-by: zhoufengzhe <zhoufengzhe@pjlab.org.cn>	2023-08-23 11:33:25 +08:00
Leymore	fdc69f9d58	[Fix] local runner debug (#238 )	2023-08-21 16:58:36 +08:00
Yike Yuan	8d368d1cd6	[Feat] Support visualglm and llava for MMBench evaluation. (#211 ) * [Feat] Support visualglm inference on MMBench. * [Feat] Support llava inference on MMBench. * [Fix] Fix pre-commit format. * [Fix] Add docstring for llava * [Fix] Fix multi-process inference error of LlaVA and add comments. 1. Set `low_cpu_mem_usage` to False to address device issue. 2. Add docstring and type hints. 3. Rename class and remove registry. * [Fix] Pre-commit fix. * [Fix] add forward entry, add dynamic import to seedbench * [Fix] Fix pre-commit. * [Fix] Fix missing context. * [Fix] Fix docstring.	2023-08-21 15:57:30 +08:00
Yike Yuan	a6552224cb	[Feat] Support multi-modal evaluation on MME benchmark. (#197 ) * [Feat] Support multi-modal evaluation on MME benchmark. * [Fix] Remove debug code. * [Fix] Remove redundant codes and add type hints. * [Fix] Rename in config. * [Fix] Rebase main. * [Fix] Fix isort and yapf conflict.	2023-08-21 15:53:20 +08:00
philipwangOvO	3b29aaee2b	[Fix] bin_trim (#237 ) Co-authored-by: wangchonghua <wangchonghua@pjlab.org.cn>	2023-08-21 15:44:49 +08:00
philipwangOvO	655a807f4b	[Dataset] LongBench (#236 ) Co-authored-by: wangchonghua <wangchonghua@pjlab.org.cn>	2023-08-21 14:15:20 +08:00
Yixiao Fang	0fa2482661	[Feature] Support SEED-Bench (#203 ) * support seedbench * update docstrings * update * update * update * update according to review * rebase * fix lint * update	2023-08-17 17:24:02 +08:00
Ezra-Yu	17ccaa5980	[Feat] Add codegeex2 and Humanevalx (#210 ) * add codegeex2 * add humanevalx dataset * add evaluator * update evaluator * update configs * update clean code * update configs * fix lint * remove sleep * fix lint * update docs * fix lint	2023-08-17 11:03:16 +08:00
Hubert	0fe2366a72	[Feat] support adv_glue dataset for adversarial robustness (#205 ) * [Feat] support adv_glue dataset for adversarial robustness * reorg files * minor fix * minor fix	2023-08-16 18:42:06 +08:00
Yuan Liu	78df9bd0cb	[Feature]: Add other public datasets (#206 ) * [Feature]: Refactor class name * [Feature]: Add minigpt-4 coco caption * [Feature]: Update minigpt-4 coco caption * [Feature]: Add MiniGPT-4 ScienceQA * [Feature]: Add minigpt-4 vqav2 * [Feature]: Add VSR * [Feature]: Revert task to previous version	2023-08-16 11:37:26 +08:00
Yike Yuan	3a46b6c64f	[Fix] Fix bugs of multiple rounds of inference when using mm_eval (#201 )	2023-08-16 11:15:11 +08:00
Hubert	7c393192af	[Fix] fix bug for postprocessor (#195 ) * [Fix] fix bug for postprocessor * minor fix	2023-08-11 18:41:12 +08:00
Tong Gao	10cbc2b175	Bump version to 0.1.2 (#190 )	2023-08-11 17:43:14 +08:00
Tong Gao	bf79ff1c6d	[Feature] Add LEval datasets Co-authored-by: kennymckormick <dhd@pku.edu.cn>	2023-08-11 17:38:31 +08:00
Hubert	8d9cee060f	[Feat] update postprocessor to get first option more accurately (#193 ) * [Feat] update postprocessor to get first option * minor fix * minor fix	2023-08-11 17:33:00 +08:00
Leymore	14332e08fd	[Feature] add llama-oriented dataset configs (#82 ) * add llama-oriented dataset configs * update * revert cvalues & update llama_example	2023-08-11 12:48:05 +08:00
Hubert	5a9539f375	[Feat] add safety to collections (#185 ) * [Feat] add safety to collections * minor fix	2023-08-11 11:19:26 +08:00
Zaida Zhou	f4c70ba6c3	[Feature] Support filtering specified levels message (#187 ) * Support filtering message * minor fix	2023-08-11 10:46:46 +08:00
Zaida Zhou	f256abffd3	[Enhancement] Skip invalid keys to avoid requesting API (#184 ) * Skip invalid keys to avoid requesting API * get expected key * print warning info	2023-08-10 18:41:43 +08:00
Ma Zerun	59bf56349c	[Feature] Support CUDA_VISIBLE_DEVICES and multiple tasks on one GPU (#148 ) * [Feature] Support CUDA_VISIBLE_DEVICES and multiple tasks on one GPU * Fix UT * Update according to comments	2023-08-10 16:53:03 +08:00
Tong Gao	312095de9d	[Fix] meta template & unit tests (#170 )	2023-08-10 16:49:13 +08:00
liushz	ed248af136	[Fix] Fix some sc errors (#177 ) * Update sc * Update sc doc * Apply suggestions from code review Co-authored-by: Hubert <42952108+yingfhu@users.noreply.github.com> --------- Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn> Co-authored-by: Hubert <42952108+yingfhu@users.noreply.github.com>	2023-08-10 16:40:32 +08:00
Tong Gao	2931f3dcb8	[Enhancement] Add humaneval postprocessor for GPT models & eval config for GPT4, enhance the original humaneval postprocessor (#129 ) * [Enhancement] Enhance humaneval postprocessor * add human-eval testcase * update * update --------- Co-authored-by: Leymore <zfz-960727@163.com>	2023-08-10 16:31:12 +08:00
Songyang Zhang	3f36db3b06	[Feature] Support turbomind (#166 ) * support turbomind * update doc * Update docs/en/advanced_guides/evaluation_turbomind.md Co-authored-by: Tong Gao <gaotongxiao@gmail.com> * Update docs/zh_cn/advanced_guides/evaluation_turbomind.md Co-authored-by: Tong Gao <gaotongxiao@gmail.com> * Update docs/zh_cn/advanced_guides/evaluation_turbomind.md Co-authored-by: Tong Gao <gaotongxiao@gmail.com> * Update docs/en/advanced_guides/evaluation_turbomind.md Co-authored-by: Tong Gao <gaotongxiao@gmail.com> * update --------- Co-authored-by: Tong Gao <gaotongxiao@gmail.com>	2023-08-10 16:25:11 +08:00
Leymore	e7fc54baf1	[Feature] Add Xiezhi SQuAD2.0 ANLI (#101 ) * add Xiezhi SQuAD2.0 ANLI; update WSC * update * update * update doc string	2023-08-10 14:04:18 +08:00
Yuan Liu	a205629ff3	[Feature]: Refactor input and output (#176 ) * [Feature]: Refactor input and output * [Feature]: Update tasks	2023-08-10 14:01:28 +08:00
Leymore	876ade71a5	[Fix] Fix AGIEval multiple choice (#137 ) * update agieval data * rename variables	2023-08-10 11:38:24 +08:00
Tong Gao	e6194df29e	[Fix] Use a copy of the config object in Task (#174 )	2023-08-09 15:24:49 +08:00
Haodong Duan	d5d4f47371	[API] Refine OpenAI (#175 )	2023-08-09 12:38:57 +08:00
Zaida Zhou	af436f5951	[Feature] Calculate max_out_len without hard code for OpenAI model (#158 ) * calulate max_out_len without hard code * set default value * update configs * Update configs/eval_gpt3.5.py Co-authored-by: Tong Gao <gaotongxiao@gmail.com> --------- Co-authored-by: Tong Gao <gaotongxiao@gmail.com>	2023-08-08 15:16:56 +08:00
Yuan Liu	2f1949e7a1	[Feature]: Add mm suport for local (#169 )	2023-08-08 14:21:58 +08:00
Tong Gao	bbdedc6c95	[Enhancement] Optimize OpenAI models (#128 ) * [Feature] Enhance OpenAI API, add example config for GPT evaluation	2023-08-03 14:55:16 +08:00
Haodong Duan	d17a5b94fa	[Refine] Refine PR #122 (#123 ) * update * update	2023-08-03 14:54:38 +08:00
Yuan Liu	191a3f6f9d	[Feature]: Use multimodal (#73 ) * [Feature]: Add minigpt-4 * [Feature]: Add mm local runner * [Feature]: Add instructblip * [Feature]: Delete redundant file * [Feature]: Delete redundant file * [Feature]: Add README to InstructBLIP * [Feature]: Update MiniGPT-4 * [Fix]: Fix lint * [Feature]add omnibenchmark readme (#49) * add omnibenchmark readme * fix * Update OmniMMBench.md * Update OmniMMBench.md * Update OmniMMBench.md * [Fix]: Refine name (#54) * [Feature]: Unify out and err * [Fix]: Fix lint * [Feature]: Rename to mmbench and change weight path * [Feature]: Delete Omni in instructblip * [Feature]: Check the avaliablity of lavis * [Fix]: Fix lint * [Feature]: Refactor MM * [Refactor]: Refactor path * [Feature]: Delete redundant files * [Refactor]: Delete redundant files --------- Co-authored-by: Wangbo Zhao(黑色枷锁) <56866854+wangbo-zhao@users.noreply.github.com>	2023-08-03 11:07:50 +08:00
Tong Gao	8b163bd8e9	[Feature] Several enhancements (#142 )	2023-08-01 18:19:49 +08:00
Tong Gao	c00179d46b	[Feature] Evaluating acc based on minimum edit distance, update SIQA (#130 ) * [Feature] Support evaluating acc based on minimum edit distance, update SIQA * update	2023-08-01 14:24:27 +08:00
Leymore	d862f570aa	[Feature] Add SC (#126 ) * add self-consistency * add CoT method Self-Consistency * fix typo error and update openicl_eval * add tydiQA-GoldP task * fix sc * rename gsm8k_sc * fix sc * add self-consistency doc * refine sc --------- Authored-by: liushz <qq1791167085@163.com>	2023-07-28 17:29:37 +08:00
Haodong Duan	538b439302	[Fix] Fix seed in HFEvaluator (#122 )	2023-07-28 11:29:01 +08:00
Haodong Duan	46c9645753	[Feature] Allow explicitly setting the temperature for API model (#121 ) * allow explicitly setting the temperature * update	2023-07-28 11:28:15 +08:00
gowithme	57fcfc975a	[Feature] Support intern lanuage model (#51 ) * support internLM * support internLM * simplify intern model files * update storage_manager * support internLM * Modify the file organization structure * support internLM * support internLM * support internLM * support internLM * change some details	2023-07-27 18:49:36 +08:00
Hubert	b7184e9db5	[Refactor] Update crows-pairs evaluation (#98 ) * [Refactor] Update crows-pairs evaluation * [Refactor] Update crows-pairs evaluation * minor	2023-07-26 11:21:32 +08:00
Haonan Li	e9cdb24ddd	[Feature] Add CMMLU dataset (#91 ) * add CMMLU * debug cmmlu * add slurm args `qos` * fix format: space before comment * remove unused variable * change the location of `answer is` --------- Co-authored-by: 李浩楠 <lihaonan@lihaonandeMacBook-Air.local> Co-authored-by: 李浩楠 <haonan.li> Co-authored-by: Leymore <zfz-960727@163.com>	2023-07-25 10:14:27 +08:00
Haodong Duan	6e885d668b	force utf-8 encoding for all non-dataset fileios (#97 )	2023-07-25 10:06:01 +08:00
Leymore	3fe5ee096c	[Feature] Add heuristic size partitioner (#63 ) * [Feature] Add heuristic size partitioner * update	2023-07-20 11:53:24 +08:00
Leymore	eea8b04417	[Feature] Add llama-2 models (#81 ) * add llama-2 models * update docs --------- Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>	2023-07-19 19:51:29 +08:00
Hubert	f83e125e5a	[Feat] Support CValues Responsibility dataset (#78 ) * [Feat] support CValues * minor fix	2023-07-18 18:45:15 +08:00
LZH	26e2f171f4	[Feature] Support load PEFT adapter for HuggingFace model (#74 ) * support peft for HuggingFace model * add docstring	2023-07-18 16:21:43 +08:00
liushz	f36c0496f3	[Feature] Add tydiqa-goldp (#75 ) Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>	2023-07-18 14:54:35 +08:00
Tong Gao	311bf0daa7	[Fix] Fix CI (#70 ) * [Fix] Fix CI * [Fix] Fix CI * [Fix] Fix CI * update	2023-07-17 19:10:59 +08:00
Tong Gao	29006e39c0	[Fix] Fix circular import of PromptTemplate (#71 )	2023-07-17 19:09:38 +08:00
Tong Gao	1e44541730	[Enhancement] Test linting in CI and fix existing linting errors (#69 ) * [Enhancement] Test linting in CI * fix linting	2023-07-17 15:59:10 +08:00
Leymore	1326aff77e	[Feature] Add logger info and remove dataset bugs (#61 ) * Add logger info and remove dataset bugs * fix typo	2023-07-17 14:26:30 +08:00
Tong Gao	7ee5a86fee	[Feature] Enhance OpenAI API, add example config for GPT evaluation (#53 ) * [Feature] Enhance OpenAI API, add example config for GPT evaluation * fix	2023-07-12 16:43:46 +08:00
Hubert	f5103f93dd	[Feat] add bs for perspective api eval (#50 ) * [Feat] add bs for perspective api eval * fix according to comments * fix according to comments	2023-07-12 16:26:01 +08:00
Hubert	c8f1d513b2	[Fix] fix clp inferencer (#44 )	2023-07-11 14:54:39 +08:00
Tong Gao	0625294e5f	[Fix] Fix OpenICLInferTask (#41 )	2023-07-10 16:12:01 +08:00
Ma Zerun	805293a9f2	Auto re-generate port number during retry (#24 ) * Auto re-generate port number during retry * Fix slurm command	2023-07-07 17:25:56 +08:00
Hubert	7f8eee4725	[Docs] add en docs (#15 ) * add en docs * update --------- Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>	2023-07-06 12:58:44 +08:00
Leymore	86d5ec3d0f	Update configs (#9 ) * Update implements * Update	2023-07-06 12:27:41 +08:00
Hubert	5c19c8c5fc	[Docs] add issue and pr template (#12 ) * [Feat] add issue and pr template * minor add utils * minor fix	2023-07-06 11:55:01 +08:00
Tong Gao	986a44cedd	Tranlate lark messages (#5 )	2023-07-05 18:40:05 +08:00
Tong Gao	719ba34d1b	[Enhancement] Update prompt hash computation (#2 )	2023-07-05 18:29:07 +08:00
Ma Zerun	5840c7655c	Update start guide (#4 )	2023-07-05 18:26:26 +08:00
yuzhaohui	dcf11cf8fd	New logo and update setup.py	2023-07-05 06:54:06 +00:00
mzr1996	04dd01a235	Update configs and code	2023-07-05 11:45:08 +08:00
Leymore	c94cc94348	Add release contribution	2023-07-05 03:15:31 +00:00
tonysy	e6b5bdcb87	OpenCompass Public MR	2023-07-05 03:15:21 +00:00
Ezra-Yu	cbe9fe2cdb	Add Release Contraibution	2023-07-05 02:22:40 +00:00
cky	36f111100f	update datasets	2023-07-05 01:45:26 +00:00
mzr1996	3cfe73de3f	Support a batch of datasets.	2023-07-05 01:30:27 +00:00
kennymckormick	78478e961e	[Code] Update opencompass/datasets/agieval/__init__.py	2023-07-05 00:28:07 +00:00
yingfhu	fb11108723	[Feat] support opencompass	2023-07-04 22:11:33 +08:00
gaotongxiao	7d346000bb	initial commit	2023-07-04 21:34:55 +08:00

144 Commits