OpenCompass

mirror of https://github.com/open-compass/opencompass.git synced 2025-05-30 16:03:24 +08:00

Author	SHA1	Message	Date
Fengzhe Zhou	9afbfa3639	[Sync] Fix TEvalEvaluator (#929)	2024-02-28 16:05:30 +08:00
bittersweet1999	2ee8e8a1a1	[Feature] add mtbench (#829) * add mtbench * add mtbench * Update configs/datasets/subjective/multiround/mtbench_judgeby_gpt4.py Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com> * Update configs/datasets/subjective/multiround/mtbench_judgeby_gpt4.py Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com> * Update opencompass/datasets/subjective/__init__.py Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com> * Update opencompass/datasets/subjective/mtbench.py Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com> * fix mtbench --------- Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>	2024-01-24 12:11:47 +08:00
Fengzhe Zhou	32f40a8f83	[Sync] Sync with internal codes 2023.01.08 (#777)	2024-01-08 14:07:24 +00:00
Fengzhe Zhou	3a68083ecc	[Sync] update configs (#734)	2023-12-25 21:59:16 +08:00
bittersweet1999	1fe152b3e8	[Feature] Support AlignmentBench infer and judge (#697) * alignmentbench infer and judge * alignmentbench * alignmentbench done * alignment all done * alignment all done	2023-12-13 19:59:30 +08:00
bittersweet1999	6130394165	[Feature] Add double order of subjective evaluation and removing duplicated response among two models (#692) * add features * add doc string * add doc string	2023-12-12 20:58:17 +08:00
bittersweet1999	465308e430	[Feature] Add Subjective Evaluation (#680) * new version of subject * fixed draw * fixed draw * fixed draw * done * done * done * done * fixed lint	2023-12-11 22:22:11 +08:00
Hubert	e78857ac36	[Sync] minor test (#683)	2023-12-11 17:42:53 +08:00
liyucheng09	05bbce8b08	[Feature] Add Data Contamination Analysis (#639) * add contamination analysis to ceval * fix bugs * add contamination docs * to pass CI check * update --------- Co-authored-by: zhangyifan1 <zhangyifan1@pjlab.org.cn> Co-authored-by: Leymore <zfz-960727@163.com>	2023-12-08 10:00:11 +08:00
Fengzhe Zhou	d3de5c41fb	[Sync] update model configs (#574)	2023-11-13 15:15:34 +08:00
Fengzhe Zhou	dbb20b8270	[Sync] update (#517)	2023-10-27 20:31:22 +08:00
liushz	2737249f31	[Feature] Add mathbench dataset and circular evaluator (#408) * add_mathbench * update mathbench * support non circular eval dataset --------- Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn> Co-authored-by: yingfhu <yingfhu@gmail.com>	2023-10-18 04:08:31 -05:00
Leymore	fbf5089c40	[Sync] update github token (#475)	2023-10-13 06:50:54 -05:00
Leymore	362c33dff4	fix jieba rouge (#467)	2023-10-12 10:25:19 +08:00
Leymore	d7ff933a73	[Fix] Use jieba rouge in lcsts (#459) * use jieba rouge in lcsts * use rouge_chinese	2023-10-09 10:10:33 +08:00
Tong Gao	a1ea3c094a	[Sync] Initial support of subjective evaluation (#421) Co-authored-by: Leymore <zfz-960727@163.com>	2023-09-22 15:42:31 +08:00
Ma Zerun	0f2c388280	Support GSM8k evaluation with tools by Lagent and LangChain (#277) * Support GSM8k evaluation with tools by Lagent and LangChain * Avoid to use MMEngine new feature * update document --------- Co-authored-by: Leymore <zfz-960727@163.com>	2023-09-22 15:28:22 +08:00
Leymore	ae0cd8752f	[Feature] Use local accuracy from hf implements (#416) * use local accuracy from hf implements * add load from hf fallback	2023-09-20 16:35:22 +08:00
Haodong Duan	d17a5b94fa	[Refine] Refine PR #122 (#123) * update * update	2023-08-03 14:54:38 +08:00
Yuan Liu	191a3f6f9d	[Feature]: Use multimodal (#73) * [Feature]: Add minigpt-4 * [Feature]: Add mm local runner * [Feature]: Add instructblip * [Feature]: Delete redundant file * [Feature]: Delete redundant file * [Feature]: Add README to InstructBLIP * [Feature]: Update MiniGPT-4 * [Fix]: Fix lint * [Feature]add omnibenchmark readme (#49) * add omnibenchmark readme * fix * Update OmniMMBench.md * Update OmniMMBench.md * Update OmniMMBench.md * [Fix]: Refine name (#54) * [Feature]: Unify out and err * [Fix]: Fix lint * [Feature]: Rename to mmbench and change weight path * [Feature]: Delete Omni in instructblip * [Feature]: Check the avaliablity of lavis * [Fix]: Fix lint * [Feature]: Refactor MM * [Refactor]: Refactor path * [Feature]: Delete redundant files * [Refactor]: Delete redundant files --------- Co-authored-by: Wangbo Zhao(黑色枷锁) <56866854+wangbo-zhao@users.noreply.github.com>	2023-08-03 11:07:50 +08:00
Tong Gao	c00179d46b	[Feature] Evaluating acc based on minimum edit distance, update SIQA (#130) * [Feature] Support evaluating acc based on minimum edit distance, update SIQA * update	2023-08-01 14:24:27 +08:00
Haodong Duan	538b439302	[Fix] Fix seed in HFEvaluator (#122)	2023-07-28 11:29:01 +08:00
Tong Gao	311bf0daa7	[Fix] Fix CI (#70) * [Fix] Fix CI * [Fix] Fix CI * [Fix] Fix CI * update	2023-07-17 19:10:59 +08:00
Tong Gao	1e44541730	[Enhancement] Test linting in CI and fix existing linting errors (#69) * [Enhancement] Test linting in CI * fix linting	2023-07-17 15:59:10 +08:00
Hubert	f5103f93dd	[Feat] add bs for perspective api eval (#50) * [Feat] add bs for perspective api eval * fix according to comments * fix according to comments	2023-07-12 16:26:01 +08:00
Leymore	86d5ec3d0f	Update configs (#9) * Update implements * Update	2023-07-06 12:27:41 +08:00
Ezra-Yu	cbe9fe2cdb	Add Release Contraibution	2023-07-05 02:22:40 +00:00
cky	36f111100f	update datasets	2023-07-05 01:45:26 +00:00
yingfhu	fb11108723	[Feat] support opencompass	2023-07-04 22:11:33 +08:00
gaotongxiao	7d346000bb	initial commit	2023-07-04 21:34:55 +08:00

30 Commits