OpenCompass

mirror of https://github.com/open-compass/opencompass.git synced 2025-05-30 16:03:24 +08:00

Author	SHA1	Message	Date
Linchen Xiao	f04f3546bc	[Fix] Import fix (#1500 )	2024-09-06 18:29:24 +08:00
Linchen Xiao	ff18545f0e	[Bump] Bump version to 0.3.2 (#1497 )	2024-09-06 16:10:45 +08:00
Linchen Xiao	87ffa71d68	[Feature] Longbench dataset update	2024-09-06 15:50:12 +08:00
Albert Yan	928d0cfc3a	[Feature] Add support for Rendu API (#1468 ) * Add support for Rendu API * fix lint issue * fix lint issue * fix lint issue * Update --------- Co-authored-by: 13190 <zeyu.yan@transn.com> Co-authored-by: tonysy <sy.zhangbuaa@gmail.com>	2024-09-06 01:00:43 +08:00
Hari Seldon	faf5260155	[Feature] Optimize Evaluation Speed of SciCode (#1489 ) * update scicode * update comments * remove redundant variable * Update --------- Co-authored-by: tonysy <sy.zhangbuaa@gmail.com>	2024-09-06 00:59:41 +08:00
liushz	00fc8da5be	[Feature] Add model postprocess function (#1484 ) * Add model postprocess function * Add model postprocess function * Add model postprocess function * Add model postprocess function * Add model postprocess function * Add model postprocess function * Add model postprocess function * Add model postprocess function --------- Co-authored-by: liushz <liuhongwei@pjlab.rog.cn>	2024-09-05 21:10:29 +08:00
Maxime SHE	45efdc994d	[Feature] Add an attribute api_key into TurboMindAPIModel default None (#1475 ) Co-authored-by: Maxime <maximeshe@163.com> Add an attribute api_key into TurboMindAPIModel default None then we can set the api_key while using lmdeploy to deploy the llm model	2024-09-05 17:51:16 +08:00
Linchen Xiao	6c9cd9a260	[Feature] Needlebench auto-download update (#1480 ) * update * update * update	2024-09-05 17:22:42 +08:00
zhulinJulia24	716d46e1f5	[ci] fix badcase and add env info (#1491 ) * update * update --------- Co-authored-by: zhulin1 <zhulin1@pjlab.org.cn>	2024-09-05 16:43:45 +08:00
zhulinJulia24	fb6a0df652	[ci] fix test env for vllm and add vllm baselines (#1481 ) * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update --------- Co-authored-by: zhulin1 <zhulin1@pjlab.org.cn>	2024-09-04 19:24:09 +08:00
Linchen Xiao	da74cbfa39	[Fix] Model configs update	2024-09-04 18:57:10 +08:00
Linchen Xiao	95aad6c282	[Fix] Requirements update	2024-09-03 18:50:40 +08:00
Linchen Xiao	9693be46b7	[Feature] Mmlu-pro auto-download (#1464 ) * update * update * update * update * update	2024-08-30 10:03:40 +08:00
zhulinJulia24	f34209766d	[ci] fix test env (#1470 ) * Update daily-run-test.yml * Update daily-run-test.yml * Update pr-run-test.yml * Update daily-run-test.yml * update * update --------- Co-authored-by: zhulin1 <zhulin1@pjlab.org.cn>	2024-08-29 14:48:17 +08:00
Alexander Lam	8b39225259	[Feature] Added `extra_body` support for OpenAISDK; Added support for proxy URL when connecting to OpenAI's API. (#1467 ) * fix lint issues * fix lint issues	2024-08-29 00:43:43 +08:00
Guoli Yin	a488b9b4f5	[Feature] Make OPENAI_API_BASE compatible with openai default env (#1461 ) * Make OPENAI_API_BASE compatible with openai default env * Make OPENAI_API_BASE compatible with openai default env --------- Co-authored-by: Guoli Yin <gyin@icloud.com>	2024-08-28 23:14:41 +08:00
Songyang Zhang	e5a8eb2283	[Feature] Update Lint and Leaderboard (#1458 ) * [Feature] Update Lint and Leaderboard * Update * Update	2024-08-28 22:36:42 +08:00
Linchen Xiao	245664f4c0	[Feature] Fullbench v0.1 language update (#1463 ) * update * update * update * update	2024-08-28 14:01:05 +08:00
CHEN PENGAN	463231c651	[Feature] Add icl_sliding_k_retriever.py and update __init__.py (#1305 ) * Add icl_sliding_k_retriever.py and update __init__.py * Fix flake8, isort, and yapf issues for Sliding Window Retriever	2024-08-23 17:18:31 +08:00
Linchen Xiao	94b6bd65fc	[Fix] Fix cli evaluation for multiple models (#1454 ) * update * update	2024-08-23 17:15:36 +08:00
Linchen Xiao	2295a33a18	[Doc] Update readme (#1453 )	2024-08-23 14:11:01 +08:00
Songyang Zhang	5485207fbe	[Bump] Bump version to 0.3.1 (#1450 ) * [Bump] Bump version 0.3.1 * Update	2024-08-23 10:47:57 +08:00
Songyang Zhang	7c2d25b557	[Fix] Update SciCode and Gemma model (#1449 ) * [Fix] Update SciCode and Gemma model * Update * Update	2024-08-23 10:42:27 +08:00
Xu Song	ad3931aa32	Update openicl_infer.py (#1308 )	2024-08-23 10:39:22 +08:00
zhulinJulia24	fb69ba5eb8	[CI] add commond testcase into daily testcase (#1447 ) * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update * update --------- Co-authored-by: zhulin1 <zhulin1@pjlab.org.cn>	2024-08-23 01:49:17 +08:00
liushz	9fdbc744dc	[Fix] Update option postprocess & mathbench language summarizer (#1413 ) * Update option postprocess & mathbench language summarizer * Update option postprocess & mathbench language summarizer --------- Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>	2024-08-22 14:49:07 +08:00
Linchen Xiao	0fe9756c5d	[Doc] Update Readme (#1439 ) * update * update * update * update * update * update * update * update * update * update * update * update	2024-08-22 14:48:45 +08:00
Hari Seldon	14b4b735cb	[Feature] Add support for SciCode (#1417 ) * add SciCode * add SciCode * add SciCode * add SciCode * add SciCode * add SciCode * add SciCode * add SciCode w/ bg * add scicode * Update README.md * Update README.md * Delete configs/eval_SciCode.py * rename * 1 * rename * Update README.md * Update scicode.py * Update scicode.py * fix some bugs * Update * Update --------- Co-authored-by: root <HariSeldon0> Co-authored-by: tonysy <sy.zhangbuaa@gmail.com>	2024-08-22 13:42:25 +08:00
liushz	d3963bceae	[Bug] Add model support for 'huggingface_above_v4_33' when using '-a' (#1430 ) Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>	2024-08-22 13:40:24 +08:00
seetimee	ac093fce53	[Update] Update openai_api.py (#1438 ) Most models' token limits are above 32k. This fixes a long-context dataset test bug that skipped some data.	2024-08-21 18:57:49 +08:00
liushz	e076dc5acf	[Fix] Fix openai api tiktoken bug for api server (#1433 ) * Fix openai api tiktoken * Fix openai api tiktoken --------- Co-authored-by: liushz <liuhongwei@pjlab.rog.cn>	2024-08-20 22:02:14 +08:00
Linchen Xiao	a4b54048ae	[Feature] Add Ruler datasets (#1310 ) * [Feature] Add Ruler datasets * pre-commit fixed * Add model specific tokenizer to dataset * pre-commit modified * remove unused import * fix linting * add trust_remote to tokenizer load * lint fix * comments resolved * fix lint * Add readme * Fix lint * ruler refactorize * fix lint * lint fix * updated * lint fix * fix wonderwords import issue * prompt modified * update * readme updated * update * ruler dataset added * Update --------- Co-authored-by: tonysy <sy.zhangbuaa@gmail.com>	2024-08-20 11:40:11 +08:00
Xu Song	99b5122ed5	[Feature] Add abbr for rolebench dataset (#1431 ) * Add abbr for rolebench dataset * add	2024-08-20 11:22:48 +08:00
Linchen Xiao	ecf9bb3e4c	[Bug] Commonsenseqa dataset fix (#1425 ) * longbench dataset load fix * update * Update * Update * Update * update * update --------- Co-authored-by: tonysy <sy.zhangbuaa@gmail.com>	2024-08-16 15:54:07 +08:00
Songyang Zhang	9b3613f10b	[Update] Support auto-download of FOFO/MT-Bench-101 (#1423 ) * [Update] Support auto-download of FOFO/MT-Bench-101 * Update wildbench	2024-08-16 11:57:41 +08:00
bittersweet1999	ce7f4853ce	[Fix] Sub summarizer order fix (#1426 ) * fix pip version * fix pip version * fix sub summarizer order * fix order	2024-08-15 21:08:18 +08:00
Linchen Xiao	2596f226f4	[Fix] longbench dataset load fix (#1422 )	2024-08-15 11:30:30 +08:00
Linchen Xiao	8e55c9c6ee	[Update] Compassbench v1.3 (#1396 ) * stash files * compassbench subjective evaluation added * evaluation update * fix lint * update docs * Update lint * changes saved * changes saved * CompassBench subjective summarizer added (#1349) * subjective summarizer added * fix lint [Fix] Fix MathBench (#1351) Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn> [Update] Update model support list (#1353) * fix pip version * fix pip version * update model support subjective summarizer updated knowledge, math objective done (data need update) remove secrets objective changes saved knowledge data added * secrets removed * changed added * summarizer modified * summarizer modified * compassbench coding added * fix lint * objective summarizer updated * compass_bench_v1.3 updated * update files in config folder * remove unused model * lcbench modified * removed model evaluation configs * remove duplicated sdk implementation --------- Co-authored-by: zhangsongyang <zhangsongyang@pjlab.org.cn>	2024-08-12 19:09:19 +08:00
changyeyu	59586a8b4a	[Feature] Enable Truncation of Mid-Section for Long Prompts in `huggingface_above_v4_33.py` (#1373 ) * Retain the first and last halves of the tokens from the prompt, discarding the middle, to avoid exceeding the model's maximum length. * Add default parameter: mode * Modified a comment. * Modified variable names. * fix yapf lint	2024-08-09 11:36:30 +08:00
Songyang Zhang	88eb91219b	[Doc] Update README (#1404 ) * [Doc] Update README * Update	2024-08-08 16:18:33 +08:00
yaoyingyy	decb621ff6	[Fix] the issue where scores are negative in the Lawbench dataset evaluation(#1402 ) (#1403 )	2024-08-08 16:08:26 +08:00
Yunlin Mao	818d72a650	[Fix] modelscope dataset load problem (#1406 ) * fix modelscope dataset load * fix lint	2024-08-08 14:01:06 +08:00
Songyang Zhang	264fd23129	[Bump] Bump version for v0.3.0 (#1398 )	2024-08-07 01:25:24 +08:00
Songyang Zhang	fed1a4998b	[Fix] Fix CaLM import (#1395 )	2024-08-06 12:17:45 +08:00
Songyang Zhang	c81329b548	[Fix] Fix Slurm ENV (#1392 ) 1. Support Slurm Cluster 2. Support automatic data download 3. Update InternLM2.5-1.8B/20B-Chat	2024-08-06 01:35:20 +08:00
Songyang Zhang	c09fc79ba8	[Feature] Support OpenAI ChatCompletion (#1389 ) * [Feature] Support import configs/models/summarizers from whl * Update * Update openai sdk * Update * Update gemma	2024-08-01 19:10:13 +08:00
Peng Bo	07c96ac659	Calm dataset (#1385 ) * Add CALM Dataset	2024-08-01 10:03:21 +08:00
Songyang Zhang	46cc7894e1	[Feature] Support import configs/models/summarizers from whl (#1376 ) * [Feature] Support import configs/models/summarizers from whl * Update LCBench configs * Update * Update * Update * Update * update * Update * Update * Update * Update * Update	2024-08-01 00:42:48 +08:00
Mo Li	b83396f57c	add 1m config (#1383 )	2024-07-31 14:53:51 +08:00
klein	52eccc4f0e	[Fix] Fix version mismatch of CIBench (#1380 ) * update crb * update crbbench * update crbbench * update crbbench * minor update wildbench * [Fix] Update doc of wildbench, and merge wildbench into subjective * [Fix] Update doc of wildbench, and merge wildbench into subjective, fix crbbench * Update crb.md * Update crb_pair_judge.py * Update crb_single_judge.py * Update subjective_evaluation.md * Update openai_api.py * [Update] update wildbench readme * [Update] update wildbench readme * [Update] update wildbench readme, remove crb * Delete configs/eval_subjective_wildbench_pair.py * Delete configs/eval_subjective_wildbench_single.py * Update __init__.py * [Fix] fix version mismatch for CIBench * [Fix] fix version mismatch for CIBench, local runner * [Fix] fix version mismatch for CIBench, local runner, remove oracle mode --------- Co-authored-by: bittersweet1999 <148421775+bittersweet1999@users.noreply.github.com>	2024-07-30 17:51:24 +08:00

667 Commits