OpenCompass

mirror of https://github.com/open-compass/opencompass.git synced 2025-05-30 16:03:24 +08:00

Author	SHA1	Message	Date
Songyang Zhang	aadcfa625f	[Feat] Update owners for issues (#1293 ) * [Feat] Update owners for issues * update owners --------- Co-authored-by: zhangsongyang <zhangsongyang@pjlab.org.cn> Co-authored-by: Leymore <zfz-960727@163.com>	2024-07-05 18:27:30 +08:00
Songyang Zhang	409a042d93	[Feature] Add InternLM2.5 (#1286 ) * [Feature] Add InternLM2.5 * Update * update readme --------- Co-authored-by: zhangsongyang <zhangsongyang@pjlab.org.cn> Co-authored-by: Leymore <zfz-960727@163.com>	2024-07-04 20:10:31 +08:00
zhulinJulia24	167cfdcca3	[ci] update daily testcase (#1285 ) * Update daily-run-test.yml * Create eval_regression_chat.py * Delete .github/scripts/.github/scripts/eval_regression_chat.py * Create eval_regression_chat.py * Update pr-run-test.yml * Update daily-run-test.yml * Update daily-run-test.yml * Update daily-run-test.yml * Update oc_score_baseline.yaml * Update oc_score_assert.py * Update daily-run-test.yml * Update daily-run-test.yml * Update oc_score_baseline.yaml * Update oc_score_assert.py * Update oc_score_assert.py * fix lint * update * update * update * update * update * update * update * update * update * Update daily-run-test.yml * update --------- Co-authored-by: zhulin1 <zhulin1@pjlab.org.cn>	2024-07-03 18:56:09 +08:00
baymax591	28eba6fe34	NPU adaptation (#1250 ) * NPU adaptation * Add suport for Ascend NPU * format --------- Co-authored-by: baymax591 <14428251+baymax591@user.noreply.gitee.com> Co-authored-by: Leymore <zfz-960727@163.com>	2024-07-03 18:55:19 +08:00
liushz	fc2c9dea8c	Update MathBench summarizer & fix cot setting (#1282 ) * Update MathBench * Update MathBench * Update MathBench --------- Co-authored-by: liushz <liuhongwei@pjlab.rog.cn>	2024-07-01 21:51:17 +08:00
Fengzhe Zhou	a32f21a356	[Sync] Sync with internal codes 2024.06.28 (#1279 )	2024-06-28 14:16:34 +08:00
Xingyuan Bu	842fb1cd70	Update mtbench101.py (#1276 ) fix wrong-used import from torch.utils.data import DataLoader, Dataset	2024-06-26 00:40:22 +08:00
zhulinJulia24	26d077b080	flash attn installation in daily testcase (#1272 ) * Update daily-run-test.yml * Update daily-run-test.yml * Update oc_score_baseline.yaml	2024-06-24 18:22:46 +08:00
liushz	e5ee1647fb	Add doc for accelerator function (#1252 ) * Add Math Evaluation with Judge Model Evaluator * Add Math Evaluation with Judge Model Evaluator * Add Math Evaluation with Judge Model Evaluator * Add Math Evaluation with Judge Model Evaluator * Fix Llama-3 meta template * Fix MATH with JudgeLM Evaluation * Fix MATH with JudgeLM Evaluation * Fix MATH with JudgeLM Evaluation * Fix MATH with JudgeLM Evaluation * Update acclerator * Update MathBench * Update accelerator * Add Doc for accelerator * Add Doc for accelerator * Add Doc for accelerator * Add Doc for accelerator --------- Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>	2024-06-24 14:53:51 +08:00
klein	1fa62c4a42	Support wildbench (#1266 ) Co-authored-by: Leymore <zfz-960727@163.com>	2024-06-24 13:16:27 +08:00
LIU Xiao	83b9fd9eaa	add ",<2.0.0" to "numpy>=1.23.4" in requirements/runtime.txt, as pandas<2.0.0 doesn't compatible with numpy>=2.0.0 (#1267 )	2024-06-24 11:03:42 +08:00
bittersweet1999	e0d7808b4e	[Fix] fix pip version (#1228 ) * fix pip version * fix pip version	2024-06-06 11:48:07 +08:00
bittersweet1999	982e024540	[Feature] add dataset Fofo (#1224 ) * add fofo dataset * add dataset fofo	2024-06-06 11:40:48 +08:00
Xingyuan Bu	02a0a4e857	MT-Bench-101 (#1215 ) * add mt-bench-101 * add readme and requirements * add mt-bench-101 data * Update readme_mtbench101.md * update readme * update leaderboard * fix typo * Update readme_mtbench101.md * fit newest opencompass * update readme.md * mtbench101 to opencompass * mtbench101 to opencompass * for code review * for code review * for code review * hook * hook --------- Co-authored-by: liujie <ljie@buaa.edu.cn>	2024-06-03 14:52:12 +08:00
mqy004	b272803d8a	Fix opencompass.cli.main failing to import after installing the release version (#1221 ) * Create __init__.py * Create __init__.py * Create __init__.py * Create __init__.py * Create __init__.py * Create __init__.py * format --------- Co-authored-by: Leymore <zfz-960727@163.com>	2024-05-31 13:23:33 +08:00
bittersweet1999	7c381e5be8	[Fix] fix summarizer (#1217 ) * fix summarizer * fix summarizer	2024-05-31 11:40:47 +08:00
Fengzhe Zhou	a77b8a5cec	[Sync] format (#1214 )	2024-05-30 00:21:58 +08:00
Fengzhe Zhou	d59189b87f	[Doc] Update running command in README (#1206 )	2024-05-30 00:06:39 +08:00
Fengzhe Zhou	0b50112dc1	[Fix] Rollback opt model configs (#1213 )	2024-05-30 00:03:22 +08:00
Fengzhe Zhou	d656e818f8	[Docs] Remove --no-batch-padding and Use --hf-num-gpus (#1205 ) * [Docs] Remove --no-batch-padding and Use -hf-num-gpus * update	2024-05-29 16:30:10 +08:00
Xu Song	808582d952	Fix VLLM argument error (#1207 )	2024-05-29 10:14:08 +08:00
Fengzhe Zhou	2954913d9b	[Sync] bump version (#1204 )	2024-05-28 23:09:59 +08:00
liushz	ba620c4afe	Update accelerator (#1195 ) * Add Math Evaluation with Judge Model Evaluator * Add Math Evaluation with Judge Model Evaluator * Add Math Evaluation with Judge Model Evaluator * Add Math Evaluation with Judge Model Evaluator * Fix Llama-3 meta template * Fix MATH with JudgeLM Evaluation * Fix MATH with JudgeLM Evaluation * Fix MATH with JudgeLM Evaluation * Fix MATH with JudgeLM Evaluation * Update acclerator * Update MathBench * Update accelerator --------- Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>	2024-05-28 17:17:54 +08:00
Fengzhe Zhou	9fa80b0f93	[Feat] Update charm summary (#1194 )	2024-05-27 16:17:01 +08:00
jxd	608ff5810d	support CHARM (https://github.com/opendatalab/CHARM ) reasoning tasks (#1190 ) * support CHARM (https://github.com/opendatalab/CHARM) reasoning tasks * fix lint error * add dataset card for CHARM * minor refactor * add txt --------- Co-authored-by: wujiang <wujiang@pjlab.org.cn> Co-authored-by: Leymore <zfz-960727@163.com>	2024-05-27 13:48:22 +08:00
bittersweet1999	07a6dacf33	fix length (#1180 )	2024-05-24 23:30:01 +08:00
bittersweet1999	88c14d3d04	add support for lmdeploy api judge (#1193 )	2024-05-24 23:28:56 +08:00
yaoyingyy	749e4cea71	[Fix] temporary files using tempfile (#1186 ) Co-authored-by: yaoying <yaoying@kingsoft.com>	2024-05-24 23:27:37 +08:00
klein	5eb8f14d97	[Fix] Fix drop_gen.py (#1191 ) Fix the bug in drop_gen: wrong import	2024-05-24 23:17:50 +08:00
bittersweet1999	31afe87026	fix yi-chat template (#1178 )	2024-05-21 18:14:12 +08:00
liushz	1448be00e2	Update MathBench (#1176 ) * Add Math Evaluation with Judge Model Evaluator * Add Math Evaluation with Judge Model Evaluator * Add Math Evaluation with Judge Model Evaluator * Add Math Evaluation with Judge Model Evaluator * Fix Llama-3 meta template * Fix MATH with JudgeLM Evaluation * Fix MATH with JudgeLM Evaluation * Fix MATH with JudgeLM Evaluation * Fix MATH with JudgeLM Evaluation * Update acclerator * Update MathBench --------- Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>	2024-05-21 14:45:43 +08:00
Fengzhe Zhou	2b3d4150f3	[Sync] update evaluator (#1175 )	2024-05-21 14:22:46 +08:00
zhulinJulia24	296ea59931	Update daily-run-test.yml (#1173 )	2024-05-20 14:04:58 +08:00
Fengzhe Zhou	5de85406ce	[Sync] add OC16 entry (#1171 )	2024-05-17 16:50:58 +08:00
zhulinJulia24	94eb90569f	update test workflow (#1167 ) * Update pr-run-test.yml * Update daily-run-test.yml * Update daily-run-test.yml * Update pr-run-test.yml * Update daily-run-test.yml * Update daily-run-test.yml * Update daily-run-test.yml * Update daily-run-test.yml * Update oc_score_baseline.yaml * Update daily-run-test.yml * Update oc_score_assert.py --------- Co-authored-by: zhulin1 <zhulin1@pjlab.org.cn>	2024-05-16 15:32:57 +08:00
Fengzhe Zhou	8ea2c404d7	[Feat] enable HuggingFacewithChatTemplate with --accelerator via cli (#1163 ) * enable HuggingFacewithChatTemplate with --accelerator via cli * rm vllm_internlm2_chat_7b	2024-05-15 21:51:07 +08:00
liushz	e3c0448bbc	Update accelerator (#1152 ) * Update acclerator * update run --------- Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn> Co-authored-by: Fengzhe Zhou <zfz-960727@163.com>	2024-05-15 14:31:47 +08:00
Fengzhe Zhou	f10dd48f9c	[Fix] Update stop_words in huggingface_above_v4_33 (#1160 )	2024-05-15 14:10:33 +08:00
Fengzhe Zhou	80f831b425	[Fix] use ProcessPoolExecutor during mbpp eval (#1159 )	2024-05-15 13:48:29 +08:00
bittersweet1999	8a8987be0b	fix arenahard summarizer (#1154 ) Co-authored-by: Leymore <zfz-960727@163.com>	2024-05-15 13:31:29 +08:00
Fengzhe Zhou	62dbf04708	[Sync] update github workflow (#1156 )	2024-05-14 22:42:23 +08:00
Fengzhe Zhou	aa2dd2b58c	[Format] Add config lints (#892 )	2024-05-14 15:35:58 +08:00
Xu Song	3dbba11945	[Feat] Support dataset_suffix check for mixed configs (#973 ) * [Feat] Support dataset_suffix check for mixed configs * update mixed suffix * update suffix --------- Co-authored-by: Leymore <zfz-960727@163.com>	2024-05-14 15:03:28 +08:00
Fengzhe Zhou	7505b3cadf	[Feature] Add huggingface apply_chat_template (#1098 ) * add TheoremQA with 5-shot * add huggingface_above_v4_33 classes * use num_worker partitioner in cli * update theoremqa * update TheoremQA * add TheoremQA * rename theoremqa -> TheoremQA * update TheoremQA output path * rewrite many model configs * update huggingface * further update * refine configs * update configs * update configs * add configs/eval_llama3_instruct.py * add summarizer multi faceted * update bbh datasets * update configs/models/hf_llama/lmdeploy_llama3_8b_instruct.py * rename class * update readme * update hf above v4.33	2024-05-14 14:50:16 +08:00
Mo Li	6c711cb262	[Fix] Fix Needlebench Summarizer (#1143 ) * update few-shot example * add 128k	2024-05-13 15:59:34 +08:00
bittersweet1999	5432dfc1ff	fix multiround (#1146 )	2024-05-13 15:58:39 +08:00
bittersweet1999	833a35140b	[Fix] fix alpacaeval while add caching path (#1139 ) * fix alpacaeval * fix alpacaeval	2024-05-11 14:02:26 +08:00
Fengzhe Zhou	19d7e630d6	[Sync] Update accelerator (#1122 ) (cherry picked from commit 4beb6d9ab655d8a626971841b7acfd9fae9d438f) Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>	2024-05-09 14:32:31 +08:00
Alexander Lam	a71122ee18	[Feature] Add Qwen1.5 MoE 7b and Mixtral 8x22b model configs (#1123 ) * added qwen moe and mixtral 8x22 model configs * updated README files news section	2024-05-09 11:04:26 +08:00
Mo Li	cb080fa7de	[Fix] Fix NeedleBench Summarizer Typo (#1125 ) * update needleinahaystack eval docs * update needlebench summarizer * fix english docs typo	2024-05-08 20:00:15 +08:00

579 Commits