OpenCompass

mirror of https://github.com/open-compass/opencompass.git synced 2025-05-30 16:03:24 +08:00

Author	SHA1	Message	Date
Hubert	4aa74565e2	[Feat] minor update agent related (#839 ) * [Feat] update cibench * [Feat] Support CIBench * [Feat] Support CIBench * [Feat] Support CIBench * [Feat] Support CIBench	2024-01-26 14:15:51 +08:00
bittersweet1999	2ee8e8a1a1	[Feature] add mtbench (#829 ) * add mtbench * add mtbench * Update configs/datasets/subjective/multiround/mtbench_judgeby_gpt4.py Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com> * Update configs/datasets/subjective/multiround/mtbench_judgeby_gpt4.py Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com> * Update opencompass/datasets/subjective/__init__.py Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com> * Update opencompass/datasets/subjective/mtbench.py Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com> * fix mtbench --------- Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>	2024-01-24 12:11:47 +08:00
Fengzhe Zhou	b4afe3e7c1	[Sync] Add InternLM2 Keyset Evaluation Demo (#807 ) Co-authored-by: zhangyifan1 <zhangyifan1@pjlab.org.cn>	2024-01-17 13:48:12 +08:00
Fengzhe Zhou	32f40a8f83	[Sync] Sync with internal codes 2023.01.08 (#777 )	2024-01-08 14:07:24 +00:00
bittersweet1999	be369c3e06	[Feature] Add multi_round dataset evaluation (#766 ) * multi_round dataset * add multi_round evaluation	2024-01-04 10:37:52 +00:00
Fengzhe Zhou	3a68083ecc	[Sync] update configs (#734 )	2023-12-25 21:59:16 +08:00
bittersweet1999	1fe152b3e8	[Feature] Support AlignmentBench infer and judge (#697 ) * alignmentbench infer and judge * alignmentbench * alignmentbench done * alignment all done * alignment all done	2023-12-13 19:59:30 +08:00
bittersweet1999	6130394165	[Feature] Add double order of subjective evaluation and removing duplicated response among two models (#692 ) * add features * add doc string * add doc string	2023-12-12 20:58:17 +08:00
bittersweet1999	465308e430	[Feature] Add Subjective Evaluation (#680 ) * new version of subject * fixed draw * fixed draw * fixed draw * done * done * done * done * fixed lint	2023-12-11 22:22:11 +08:00
Hubert	e78857ac36	[Sync] minor test (#683 )	2023-12-11 17:42:53 +08:00
liyucheng09	05bbce8b08	[Feature] Add Data Contamination Analysis (#639 ) * add contamination analysis to ceval * fix bugs * add contamination docs * to pass CI check * update --------- Co-authored-by: zhangyifan1 <zhangyifan1@pjlab.org.cn> Co-authored-by: Leymore <zfz-960727@163.com>	2023-12-08 10:00:11 +08:00
Ma Zerun	6aaf3b91ec	[Feature] Support chat style inferencer. (#643 ) * [Feature] Support chat style inferencer. * [Fix] use new prompt * [Fix] use new prompt --------- Co-authored-by: yingfhu <yingfhu@gmail.com>	2023-11-30 14:00:06 +08:00
Fengzhe Zhou	d4d1330a5a	[Sync] Fix cmnli, fix vicuna meta template, fix longbench postprocess and other minor fixes (#625 )	2023-11-23 14:05:59 +08:00
Fengzhe Zhou	fb30b7c7a2	[Fix] Fix gen inferencer (#615 )	2023-11-22 12:04:31 +08:00
Songyang Zhang	721a45c68f	[Bug] Update api with generation_kargs (#614 ) * update api * update generation_kwargs impl --------- Co-authored-by: Leymore <zfz-960727@163.com>	2023-11-22 10:02:57 +08:00
Hubert	91fba2c2e9	[Feat] support humaneval and mbpp pass@k (#598 ) * [Feat] support pass@ k * [Feat] support pass@k * [Feat] support pass@k * [Feat] support pass@k * [Feat] support pass@k * [Feat] support pass@k docs * update naming --------- Co-authored-by: Leymore <zfz-960727@163.com>	2023-11-16 21:22:06 +08:00
Hubert	fcab30f82e	[Fix] change save_every defaults to 1 (#592 )	2023-11-15 13:00:25 +08:00
Fengzhe Zhou	d3de5c41fb	[Sync] update model configs (#574 )	2023-11-13 15:15:34 +08:00
Hubert	bb2ecf416e	[Feat] Support cibench (#538 ) * [Feat] support cidataset * [Feat] support cidataset * [Feat] support cidataset * [Feat] support cidataset * minor fix * minor fix * minor fix * minor fix * minor fix * minor fix * rename cibench * rename cibench * rename cibench * rename cibench * minor fix * minor fix * minor fix	2023-11-07 19:11:44 +08:00
Songyang Zhang	239c2a346e	[Feature] Add support for MiniMax API (#548 ) * update requirement * update requirement * update with minimax * update api model * Update readme * fix error --------- Co-authored-by: zhangsongyang <zhangsongyang@pjlab.org.cn>	2023-11-06 21:57:32 +08:00
Fengzhe Zhou	dbb20b8270	[Sync] update (#517 )	2023-10-27 20:31:22 +08:00
Hubert	b3f5d9e421	[Feat] support math/gms8k agent config (#494 ) * support math agent * support gsm8k agent * support gsm8k agent * minor fix * minor fix * minor fix * Update configs/eval_codeagent.py	2023-10-25 23:05:15 +08:00
liushz	2737249f31	[Feature] Add mathbench dataset and circular evaluator (#408 ) * add_mathbench * update mathbench * support non circular eval dataset --------- Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn> Co-authored-by: yingfhu <yingfhu@gmail.com>	2023-10-18 04:08:31 -05:00
Leymore	fbf5089c40	[Sync] update github token (#475 )	2023-10-13 06:50:54 -05:00
Leymore	362c33dff4	fix jieba rouge (#467 )	2023-10-12 10:25:19 +08:00
Leymore	d7ff933a73	[Fix] Use jieba rouge in lcsts (#459 ) * use jieba rouge in lcsts * use rouge_chinese	2023-10-09 10:10:33 +08:00
Tong Gao	119bfd1569	[Refactor] Move fix_id_list to Retriever (#442 ) * [Refactor] Move fix_id_list to Retriever * update * move to base * fix	2023-10-07 12:53:41 +08:00
Hubert	d9f3e88dfe	[Fix] fix clp potential error and support bs>1 (#439 ) * [Fix] fix clp potential error and support bs>1 * [Fix] fix clp potential error and support bs>1 * minor fix * minor fix	2023-09-27 16:32:57 +08:00
Tong Gao	a1ea3c094a	[Sync] Initial support of subjective evaluation (#421 ) Co-authored-by: Leymore <zfz-960727@163.com>	2023-09-22 15:42:31 +08:00
Ma Zerun	0f2c388280	Support GSM8k evaluation with tools by Lagent and LangChain (#277 ) * Support GSM8k evaluation with tools by Lagent and LangChain * Avoid to use MMEngine new feature * update document --------- Co-authored-by: Leymore <zfz-960727@163.com>	2023-09-22 15:28:22 +08:00
Tong Gao	681d3013de	[Feature] Log gold answer in prediction output (#419 ) * [Feature] Log gold answer in prediction output * support clp golden ans * minor fix --------- Co-authored-by: yingfhu <yingfhu@gmail.com>	2023-09-22 12:44:40 +08:00
Leymore	ae0cd8752f	[Feature] Use local accuracy from hf implements (#416 ) * use local accuracy from hf implements * add load from hf fallback	2023-09-20 16:35:22 +08:00
Hubert	a11cb45c83	[Feat] implementation for support promptbench (#239 ) * [Feat] support adv_glue dataset for adversarial robustness * reorg files * minor fix * minor fix * support prompt bench demo * minor fix * minor fix * minor fix * minor fix * minor fix * minor fix * minor fix * minor fix	2023-09-15 15:06:53 +08:00
cdpath	722eb39526	fix potential oom issue (#387 )	2023-09-12 10:41:03 +08:00
Leymore	880b34e759	[Fix] Quick lint fix (#362 ) * add default value * lint fix * use None	2023-09-06 14:33:13 +08:00
Leymore	b8bf16e81c	[Fix] zero retriever add default value (#361 )	2023-09-05 10:37:42 +08:00
Leymore	8774465a8f	[Enhancement] ignore ZeroRetriever error when id_list provided (#340 )	2023-09-04 11:12:16 +08:00
Leymore	e810974068	[Fix] Fix when missing both pad and eos token (#287 ) * fix when missing both pad and eos token * update pad_token_id impl	2023-08-31 16:53:39 +08:00
liushz	02ce139bc6	[Feature] Add Tree-of-Thought method (#173 ) * Add ToT method * Update ToT * Update ToT * Update ToT * Update ToT * Update ToT * Update ToT * Update ToT * Update chain_of_thought.md * Update icl_tot_inferencer.py --------- Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>	2023-08-23 12:23:05 +08:00
Hubert	5a9539f375	[Feat] add safety to collections (#185 ) * [Feat] add safety to collections * minor fix	2023-08-11 11:19:26 +08:00
liushz	ed248af136	[Fix] Fix some sc errors (#177 ) * Update sc * Update sc doc * Apply suggestions from code review Co-authored-by: Hubert <42952108+yingfhu@users.noreply.github.com> --------- Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn> Co-authored-by: Hubert <42952108+yingfhu@users.noreply.github.com>	2023-08-10 16:40:32 +08:00
Haodong Duan	d17a5b94fa	[Refine] Refine PR #122 (#123 ) * update * update	2023-08-03 14:54:38 +08:00
Yuan Liu	191a3f6f9d	[Feature]: Use multimodal (#73 ) * [Feature]: Add minigpt-4 * [Feature]: Add mm local runner * [Feature]: Add instructblip * [Feature]: Delete redundant file * [Feature]: Delete redundant file * [Feature]: Add README to InstructBLIP * [Feature]: Update MiniGPT-4 * [Fix]: Fix lint * [Feature]add omnibenchmark readme (#49) * add omnibenchmark readme * fix * Update OmniMMBench.md * Update OmniMMBench.md * Update OmniMMBench.md * [Fix]: Refine name (#54) * [Feature]: Unify out and err * [Fix]: Fix lint * [Feature]: Rename to mmbench and change weight path * [Feature]: Delete Omni in instructblip * [Feature]: Check the avaliablity of lavis * [Fix]: Fix lint * [Feature]: Refactor MM * [Refactor]: Refactor path * [Feature]: Delete redundant files * [Refactor]: Delete redundant files --------- Co-authored-by: Wangbo Zhao(黑色枷锁) <56866854+wangbo-zhao@users.noreply.github.com>	2023-08-03 11:07:50 +08:00
Tong Gao	8b163bd8e9	[Feature] Several enhancements (#142 )	2023-08-01 18:19:49 +08:00
Tong Gao	c00179d46b	[Feature] Evaluating acc based on minimum edit distance, update SIQA (#130 ) * [Feature] Support evaluating acc based on minimum edit distance, update SIQA * update	2023-08-01 14:24:27 +08:00
Leymore	d862f570aa	[Feature] Add SC (#126 ) * add self-consistency * add CoT method Self-Consistency * fix typo error and update openicl_eval * add tydiQA-GoldP task * fix sc * rename gsm8k_sc * fix sc * add self-consistency doc * refine sc --------- Authored-by: liushz <qq1791167085@163.com>	2023-07-28 17:29:37 +08:00
Haodong Duan	538b439302	[Fix] Fix seed in HFEvaluator (#122 )	2023-07-28 11:29:01 +08:00
Haodong Duan	6e885d668b	force utf-8 encoding for all non-dataset fileios (#97 )	2023-07-25 10:06:01 +08:00
Tong Gao	311bf0daa7	[Fix] Fix CI (#70 ) * [Fix] Fix CI * [Fix] Fix CI * [Fix] Fix CI * update	2023-07-17 19:10:59 +08:00
Tong Gao	29006e39c0	[Fix] Fix circular import of PromptTemplate (#71 )	2023-07-17 19:09:38 +08:00
Tong Gao	1e44541730	[Enhancement] Test linting in CI and fix existing linting errors (#69 ) * [Enhancement] Test linting in CI * fix linting	2023-07-17 15:59:10 +08:00
Hubert	f5103f93dd	[Feat] add bs for perspective api eval (#50 ) * [Feat] add bs for perspective api eval * fix according to comments * fix according to comments	2023-07-12 16:26:01 +08:00
Hubert	c8f1d513b2	[Fix] fix clp inferencer (#44 )	2023-07-11 14:54:39 +08:00
Leymore	86d5ec3d0f	Update configs (#9 ) * Update implements * Update	2023-07-06 12:27:41 +08:00
Leymore	c94cc94348	Add release contribution	2023-07-05 03:15:31 +00:00
tonysy	e6b5bdcb87	OpenCompass Public MR	2023-07-05 03:15:21 +00:00
Ezra-Yu	cbe9fe2cdb	Add Release Contraibution	2023-07-05 02:22:40 +00:00
cky	36f111100f	update datasets	2023-07-05 01:45:26 +00:00
mzr1996	3cfe73de3f	Support a batch of datasets.	2023-07-05 01:30:27 +00:00
yingfhu	fb11108723	[Feat] support opencompass	2023-07-04 22:11:33 +08:00
gaotongxiao	7d346000bb	initial commit	2023-07-04 21:34:55 +08:00

111 Commits