OpenCompass

mirror of https://github.com/open-compass/opencompass.git synced 2025-05-30 16:03:24 +08:00

Author	SHA1	Message	Date
Yunlin Mao	818d72a650	[Fix] modelscope dataset load problem (#1406 ) * fix modelscope dataset load * fix lint	2024-08-08 14:01:06 +08:00
Songyang Zhang	264fd23129	[Bump] Bump version for v0.3.0 (#1398 )	2024-08-07 01:25:24 +08:00
Songyang Zhang	fed1a4998b	[Fix] Fix CaLM import (#1395 )	2024-08-06 12:17:45 +08:00
Songyang Zhang	c81329b548	[Fix] Fix Slurm ENV (#1392 ) 1. Support Slurm Cluster 2. Support automatic data download 3. Update InternLM2.5-1.8B/20B-Chat	2024-08-06 01:35:20 +08:00
Songyang Zhang	c09fc79ba8	[Feature] Support OpenAI ChatCompletion (#1389 ) * [Feature] Support import configs/models/summarizers from whl * Update * Update openai sdk * Update * Update gemma	2024-08-01 19:10:13 +08:00
Peng Bo	07c96ac659	Calm dataset (#1385 ) * Add CALM Dataset	2024-08-01 10:03:21 +08:00
Songyang Zhang	46cc7894e1	[Feature] Support import configs/models/summarizers from whl (#1376 ) * [Feature] Support import configs/models/summarizers from whl * Update LCBench configs * Update * Update * Update * Update * update * Update * Update * Update * Update * Update	2024-08-01 00:42:48 +08:00
Songyang Zhang	33ceaa0eb8	[Bug] Fix bug in turbomind (#1377 )	2024-07-30 09:37:50 +08:00
Songyang Zhang	eee5a5be23	[Fix] Update get_data_path for LCBench and HumanEval (#1375 )	2024-07-29 19:28:09 +08:00
Songyang Zhang	704853e5e7	[Feature] Update pip install (#1324 ) * [Feature] Update pip install * Update Configuration * Update * Update * Update * Update Internal Config * Update collect env	2024-07-29 18:32:50 +08:00
Xingjun.Wang	edab1c07ba	[Feature] Support ModelScope datasets (#1289 ) * add ceval, gsm8k modelscope support * update race, mmlu, arc, cmmlu, commonsenseqa, humaneval and unittest * update bbh, flores, obqa, siqa, storycloze, summedits, winogrande, xsum datasets * format file * format file * update dataset format * support ms_dataset * update dataset for modelscope support * merge myl_dev and update test_ms_dataset * update dataset for modelscope support * update readme * update eval_api_zhipu_v2 * remove unused code * add get_data_path function * update readme * remove tydiqa japanese subset * add ceval, gsm8k modelscope support * update race, mmlu, arc, cmmlu, commonsenseqa, humaneval and unittest * update bbh, flores, obqa, siqa, storycloze, summedits, winogrande, xsum datasets * format file * format file * update dataset format * support ms_dataset * update dataset for modelscope support * merge myl_dev and update test_ms_dataset * update readme * update dataset for modelscope support * update eval_api_zhipu_v2 * remove unused code * add get_data_path function * remove tydiqa japanese subset * update util * remove .DS_Store * fix md format * move util into package * update docs/get_started.md * restore eval_api_zhipu_v2.py, add environment setting * Update dataset * Update * Update * Update * Update --------- Co-authored-by: Yun lin <yunlin@U-Q9X2K4QV-1904.local> Co-authored-by: Yunnglin <mao.looper@qq.com> Co-authored-by: Yun lin <yunlin@laptop.local> Co-authored-by: Yunnglin <maoyl@smail.nju.edu.cn> Co-authored-by: zhangsongyang <zhangsongyang@pjlab.org.cn>	2024-07-29 13:48:32 +08:00
jxd	12b84aeb3b	[Feature] Update CHARM Memorization (#1230 ) * update gemini api and add gemini models * add openai models * update CHARM evaluation * add CHARM memorization tasks * add CharmMemSummarizer (output eval details for memorization-independent reasoning analysis) * update CHARM readme --------- Co-authored-by: wujiang <wujiang@pjlab.org.cn>	2024-07-26 18:42:30 +08:00
bittersweet1999	d3782c1d47	Revert "Calm dataset (#1287 )" (#1366 ) This reverts commit `edd0ffdf70`.	2024-07-26 18:27:29 +08:00
Peng Bo	edd0ffdf70	Calm dataset (#1287 ) * add calm dataset * modify config max_out_len * update README * Modify README * update README * update README * update README * update README * update README * add summarizer and modify readme * delete summarizer config comment * update summarizer * modify same response to all questions * update README	2024-07-26 11:48:16 +08:00
mqy004	a08931f214	[Fix] origin_prompt should be None in llm-compression task (#1225 ) Co-authored-by: Qinyang Mou <qinyang_mou@intsig.net>	2024-07-26 11:46:02 +08:00
LeavittLang	8ee7fecb68	Adding support for Doubao API (#1218 ) * Adding support for Doubao API * Update doubao_api.py Fixed the bug that the connection would be retried even if it was normal. * Update doubao_api.py --------- Co-authored-by: bittersweet1999 <148421775+bittersweet1999@users.noreply.github.com>	2024-07-26 11:44:51 +08:00
klein	65fad8e2ac	[Fix] minor update wildbench (#1335 ) * update crb * update crbbench * update crbbench * update crbbench * minor update wildbench * [Fix] Update doc of wildbench, and merge wildbench into subjective * [Fix] Update doc of wildbench, and merge wildbench into subjective, fix crbbench * Update crb.md * Update crb_pair_judge.py * Update crb_single_judge.py * Update subjective_evaluation.md * Update openai_api.py * [Update] update wildbench readme * [Update] update wildbench readme * [Update] update wildbench readme, remove crb * Delete configs/eval_subjective_wildbench_pair.py * Delete configs/eval_subjective_wildbench_single.py * Update __init__.py --------- Co-authored-by: bittersweet1999 <148421775+bittersweet1999@users.noreply.github.com>	2024-07-26 11:19:04 +08:00
baymax591	51a94aee01	[Bug] fix bug: delete & (#1365 ) Co-authored-by: 白超 <baichao19@huawei.com>	2024-07-26 11:03:55 +08:00
Mo Li	69aa2f2d57	[Feature] Make NeedleBench available on HF (#1364 ) * update_lint * update_huggingface format * fix bug * update docs	2024-07-25 19:01:56 +08:00
Fengzhe Zhou	c3c02c2960	update docs (#1318 ) * update docs * Efficient Evaluation -> Data Sharding * update * update * Update faq.md --------- Co-authored-by: bittersweet1999 <148421775+bittersweet1999@users.noreply.github.com>	2024-07-25 18:44:25 +08:00
heya5	73aa55af6d	[Fix] Support HF models deployed with an OpenAI-compatible API. (#1352 ) * Support HF models deployed with an OpenAI-compatible API. * resolve lint issue * add extra_body arguments There are many other arguments when using an OpenAI-compatible API like this: https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html#extra-parameters-for-chat-api * fix linting issue * fix yapf linting issue	2024-07-25 18:38:23 +08:00
WANG WENJIN	0aad8199c7	Fix the summary error in subjective.py (#1363 )	2024-07-25 18:36:13 +08:00
Linchen Xiao	8127fc3518	CompassBench subjective summarizer added (#1349 ) * subjective summarizer added * fix lint	2024-07-23 12:29:57 +08:00
Que Haoran	a244453d9e	[Feature] Support inference ppl datasets (#1315 ) * commit inference ppl datasets * revised format * revise * revise * revise * revise * revise * revise	2024-07-22 17:59:30 +08:00
liushz	98c58f8a6c	[Feature] Add compassbench knowledge&math part (#1342 ) * Add Math Evaluation with Judge Model Evaluator * Add Math Evaluation with Judge Model Evaluator * Add Math Evaluation with Judge Model Evaluator * Add Math Evaluation with Judge Model Evaluator * Fix Llama-3 meta template * Fix MATH with JudgeLM Evaluation * Fix MATH with JudgeLM Evaluation * Fix MATH with JudgeLM Evaluation * Fix MATH with JudgeLM Evaluation * Update accelerator * Update MathBench * Update accelerator * Add Doc for accelerator * Add Doc for accelerator * Add Doc for accelerator * Add Doc for accelerator * Update compassbench august wiki&math * Update compassbench august wiki&math * Update compassbench august wiki&math * Update compassbench_aug_gen_068af0.py * Update compassbench_aug_gen_068af0.py * Update --------- Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn> Co-authored-by: zhangsongyang <zhangsongyang@pjlab.org.cn>	2024-07-19 22:54:46 +08:00
bittersweet1999	1f9f728f22	[Feature] support compassbench Checklist evaluation (#1339 ) * fix pip version * fix pip version * support checklist eval * init * add lan * fix typo	2024-07-19 16:40:44 +08:00
Mo Li	f40add2596	[Fix] Fix lint (#1334 ) * update needlebench docs * update model_name_mapping dict * update README * fix_lint	2024-07-18 17:15:06 +08:00
Xu Song	1bfb4217ff	Fix typing and typo (#1331 )	2024-07-18 13:41:24 +08:00
Mo Li	104bddf647	[Doc] Update NeedleBench Docs (#1330 ) * update needlebench docs * update model_name_mapping dict * update README * Update README_zh-CN.md --------- Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>	2024-07-18 13:16:19 +08:00
bittersweet1999	8e7ad2e981	[Fix] add bc for alignbench summarizer (#1306 ) * fix pip version * fix pip version * fix alignbench * fix import error	2024-07-12 11:06:20 +08:00
Fengzhe Zhou	62f55987f1	force register (#1311 )	2024-07-11 19:59:35 +08:00
Fengzhe Zhou	a62c613d3e	[Sync] bump version 0.2.6+local (#1294 )	2024-07-06 00:44:06 +08:00
Fengzhe Zhou	1d3a26c732	[Doc] quick start swap tabs (#1263 ) * [doc] quick start swap tabs * update docs * update * update * update * update * update * update * update	2024-07-05 23:51:42 +08:00
bittersweet1999	68ca48496b	[Refactor] Reorganize subjective eval (#1284 ) * fix pip version * fix pip version * reorganize subjective eval * reorg sub * reorg subeval * reorg subeval * update subjective doc * reorg subeval * reorg subeval	2024-07-05 22:11:37 +08:00
baymax591	28eba6fe34	NPU adaptation (#1250 ) * NPU adaptation * Add support for Ascend NPU * format --------- Co-authored-by: baymax591 <14428251+baymax591@user.noreply.gitee.com> Co-authored-by: Leymore <zfz-960727@163.com>	2024-07-03 18:55:19 +08:00
Fengzhe Zhou	a32f21a356	[Sync] Sync with internal codes 2024.06.28 (#1279 )	2024-06-28 14:16:34 +08:00
Xingyuan Bu	842fb1cd70	Update mtbench101.py (#1276 ) fix incorrectly used import: from torch.utils.data import DataLoader, Dataset	2024-06-26 00:40:22 +08:00
klein	1fa62c4a42	Support wildbench (#1266 ) Co-authored-by: Leymore <zfz-960727@163.com>	2024-06-24 13:16:27 +08:00
bittersweet1999	982e024540	[Feature] add dataset Fofo (#1224 ) * add fofo dataset * add dataset fofo	2024-06-06 11:40:48 +08:00
Xingyuan Bu	02a0a4e857	MT-Bench-101 (#1215 ) * add mt-bench-101 * add readme and requirements * add mt-bench-101 data * Update readme_mtbench101.md * update readme * update leaderboard * fix typo * Update readme_mtbench101.md * fit newest opencompass * update readme.md * mtbench101 to opencompass * mtbench101 to opencompass * for code review * for code review * for code review * hook * hook --------- Co-authored-by: liujie <ljie@buaa.edu.cn>	2024-06-03 14:52:12 +08:00
mqy004	b272803d8a	Fix inability to import opencompass.cli.main after installing the release version (#1221 ) * Create __init__.py * Create __init__.py * Create __init__.py * Create __init__.py * Create __init__.py * Create __init__.py * format --------- Co-authored-by: Leymore <zfz-960727@163.com>	2024-05-31 13:23:33 +08:00
bittersweet1999	7c381e5be8	[Fix] fix summarizer (#1217 ) * fix summarizer * fix summarizer	2024-05-31 11:40:47 +08:00
Fengzhe Zhou	a77b8a5cec	[Sync] format (#1214 )	2024-05-30 00:21:58 +08:00
Fengzhe Zhou	d656e818f8	[Docs] Remove --no-batch-padding and Use --hf-num-gpus (#1205 ) * [Docs] Remove --no-batch-padding and Use -hf-num-gpus * update	2024-05-29 16:30:10 +08:00
Fengzhe Zhou	2954913d9b	[Sync] bump version (#1204 )	2024-05-28 23:09:59 +08:00
liushz	ba620c4afe	Update accelerator (#1195 ) * Add Math Evaluation with Judge Model Evaluator * Add Math Evaluation with Judge Model Evaluator * Add Math Evaluation with Judge Model Evaluator * Add Math Evaluation with Judge Model Evaluator * Fix Llama-3 meta template * Fix MATH with JudgeLM Evaluation * Fix MATH with JudgeLM Evaluation * Fix MATH with JudgeLM Evaluation * Fix MATH with JudgeLM Evaluation * Update accelerator * Update MathBench * Update accelerator --------- Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>	2024-05-28 17:17:54 +08:00
jxd	608ff5810d	support CHARM (https://github.com/opendatalab/CHARM ) reasoning tasks (#1190 ) * support CHARM (https://github.com/opendatalab/CHARM) reasoning tasks * fix lint error * add dataset card for CHARM * minor refactor * add txt --------- Co-authored-by: wujiang <wujiang@pjlab.org.cn> Co-authored-by: Leymore <zfz-960727@163.com>	2024-05-27 13:48:22 +08:00
bittersweet1999	88c14d3d04	add support for lmdeploy api judge (#1193 )	2024-05-24 23:28:56 +08:00
yaoyingyy	749e4cea71	[Fix] temporary files using tempfile (#1186 ) Co-authored-by: yaoying <yaoying@kingsoft.com>	2024-05-24 23:27:37 +08:00
Fengzhe Zhou	2b3d4150f3	[Sync] update evaluator (#1175 )	2024-05-21 14:22:46 +08:00


401 Commits