Songyang Zhang
bfe4aa2af5
[Fix] Update alignmentbench ( #704 )
...
* update alignmentbench
* update alignmentbench
* update alignmentbench
2023-12-14 18:24:21 +08:00
bittersweet1999
1fe152b3e8
[Feature] Support AlignmentBench infer and judge ( #697 )
...
* alignmentbench infer and judge
* alignmentbench
* alignmentbench done
* alignment all done
* alignment all done
2023-12-13 19:59:30 +08:00
Hubert
a94598d921
[Feat] update python action and slurm ( #694 )
2023-12-13 10:41:10 +08:00
bittersweet1999
6130394165
[Feature] Add double order of subjective evaluation and remove duplicated responses between two models ( #692 )
...
* add features
* add doc string
* add doc string
2023-12-12 20:58:17 +08:00
Hubert
4780b39eda
[Sync] format ( #690 )
...
Co-authored-by: Leymore <zfz-960727@163.com>
2023-12-12 14:03:45 +08:00
bittersweet1999
3e77175720
[Fix] Hotfix for Subjective Evaluation ( #686 )
2023-12-12 09:22:08 +08:00
bittersweet1999
465308e430
[Feature] Add Subjective Evaluation ( #680 )
...
* new version of subject
* fixed draw
* fixed draw
* fixed draw
* done
* done
* done
* done
* fixed lint
2023-12-11 22:22:11 +08:00
Hubert
4f0b373a0a
[Fix] fix docstring ( #684 )
2023-12-11 19:12:01 +08:00
Hubert
e78857ac36
[Sync] minor test ( #683 )
2023-12-11 17:42:53 +08:00
Jingming
dd4318f6ab
[Feature] enhance the ability of humaneval_postprocess ( #676 )
...
* [Feature] enhance the ability of humaneval_postprocess
* refactor
* [Feature] Keep the old version of the function and realize the new function in humaneval_postprocess_v2.
* Update opencompass/datasets/humaneval.py
---------
Co-authored-by: Leymore <zfz-960727@163.com>
Co-authored-by: Hubert <42952108+yingfhu@users.noreply.github.com>
2023-12-11 14:39:56 +08:00
Songyang Zhang
e25c5f9525
[Enhancement] Update API Interface and Mixtral ( #681 )
...
* [Enhancement] Update API interface
* [Enhancement] Update API interface
* Update mixtral
* Update readme
2023-12-10 13:29:26 +08:00
Xiaoming Shi
1bf85949ef
[Feature] Add medbench ( #678 )
...
* update medbench
* medbench update
* format medbench
* format
---------
Co-authored-by: 施晓明 <PJLAB\shixiaoming@pjnl104220118l.pjlab.org>
Co-authored-by: Leymore <zfz-960727@163.com>
2023-12-09 16:05:46 +08:00
Jingming
7cb53a95fa
[Fix] fix bug on standart_deviation summarizer ( #675 )
2023-12-08 13:38:07 +08:00
liyucheng09
05bbce8b08
[Feature] Add Data Contamination Analysis ( #639 )
...
* add contamination analysis to ceval
* fix bugs
* add contamination docs
* to pass CI check
* update
---------
Co-authored-by: zhangyifan1 <zhangyifan1@pjlab.org.cn>
Co-authored-by: Leymore <zfz-960727@163.com>
2023-12-08 10:00:11 +08:00
bittersweet1999
1c95790fdd
New subjective judgement ( #660 )
...
* TabMWP
* TabMWP
* fixed
* fixed
* fixed
* done
* done
* done
* add new subjective judgement
* add new subjective judgement
* add new subjective judgement
* add new subjective judgement
* add new subjective judgement
* modified to a more general way
* modified to a more general way
* final
* final
* add summarizer
* add new summarize
* fixed
* fixed
* fixed
---------
Co-authored-by: caomaosong <caomaosong@pjlab.org.cn>
2023-12-06 13:28:33 +08:00
rolellm
e10f1c9139
added rolebench dataset. ( #633 )
...
* added rolebench
* Renamed poorly chosen variable names
* Changed the variable names in the comments
2023-12-01 22:54:42 +08:00
Hubert
9eb5cadcac
[Feat] update gsm8k and math agent config ( #652 )
...
* [Feat] update gsm8k and math agent config
* minor fix
2023-12-01 15:08:38 +08:00
liushz
a331c9abfd
[Feature] Add wikibench dataset ( #655 )
...
* Add WikiBench
* Add WikiBench
* format
---------
Co-authored-by: Leymore <zfz-960727@163.com>
2023-12-01 14:56:54 +08:00
liushz
e019c831fe
[Feature] Add Chinese version: commonsenseqa, crowspairs and nq ( #144 )
...
* add Chinese version: csqa crowspairs nq
* Update cn_data
* Update cn_data
* update format
---------
Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>
Co-authored-by: Leymore <zfz-960727@163.com>
2023-11-30 15:33:02 +08:00
Ma Zerun
6aaf3b91ec
[Feature] Support chat style inferencer. ( #643 )
...
* [Feature] Support chat style inferencer.
* [Fix] use new prompt
* [Fix] use new prompt
---------
Co-authored-by: yingfhu <yingfhu@gmail.com>
2023-11-30 14:00:06 +08:00
Fengzhe Zhou
e20d654c18
[Sync] Bump version to 0.1.9 ( #644 )
2023-11-28 11:42:43 +08:00
Hubert
d4af31bab4
[Feat] support zhipu post process ( #642 )
...
* [Feat] support zhipu post
* [Feat] support zhipu post
* [Feat] support zhipu post
2023-11-27 19:57:36 +08:00
liushz
6d0d78986c
[Feature] Add GSM_Hard dataset ( #619 )
...
* Add SVAMP dataset
* Add SVAMP dataset
* Add SVAMP dataset
* Add gsm_hard dataset
* Add gsm_hard dataset
* format
---------
Co-authored-by: Leymore <zfz-960727@163.com>
2023-11-27 17:40:34 +08:00
Fengzhe Zhou
9083dea683
[Sync] some renaming ( #641 )
2023-11-27 16:06:49 +08:00
Yang Yong
522241a8c9
[Fix] Fix lightllmapi list bug ( #635 )
2023-11-24 14:24:13 +08:00
Hubert
1884912674
[Bug] fix icl eval with nested list ( #632 )
2023-11-24 13:43:26 +08:00
Fengzhe Zhou
79f6449d85
[Doc] Update FAQ ( #628 )
...
* update faq
* Update docs/zh_cn/get_started/faq.md
* Update docs/en/get_started/faq.md
* Update docs/zh_cn/get_started/faq.md
---------
Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
2023-11-23 18:19:17 +08:00
Fengzhe Zhou
d949e3c003
[Feature] Add circular eval ( #610 )
...
* refactor default, add circular summarizer
* add circular
* update impl
* update doc
* minor update
* no more to be added
2023-11-23 16:45:47 +08:00
Songyang Zhang
5202456b4c
[API] Update API ( #624 )
...
* update api
* update generation_kwargs impl
* update api
* refactor
---------
Co-authored-by: Leymore <zfz-960727@163.com>
2023-11-23 15:06:20 +08:00
Fengzhe Zhou
d4d1330a5a
[Sync] Fix cmnli, fix vicuna meta template, fix longbench postprocess and other minor fixes ( #625 )
2023-11-23 14:05:59 +08:00
Kevin Wang
c0785e53d8
[Feature] support download from modelscope ( #534 )
...
* [Feature] download from modelscope
* [Feature] download from modelscope
* minor fix
---------
Co-authored-by: yingfhu <yingfhu@gmail.com>
2023-11-22 15:32:21 +08:00
liushz
048775192b
[Feature] Add SVAMP dataset ( #604 )
...
* Add SVAMP dataset
* Add SVAMP dataset
* Add SVAMP dataset
2023-11-22 14:54:39 +08:00
Fengzhe Zhou
fb30b7c7a2
[Fix] Fix gen inferencer ( #615 )
2023-11-22 12:04:31 +08:00
Songyang Zhang
721a45c68f
[Bug] Update api with generation_kwargs ( #614 )
...
* update api
* update generation_kwargs impl
---------
Co-authored-by: Leymore <zfz-960727@163.com>
2023-11-22 10:02:57 +08:00
Lyu Han
eb56fd6d16
Integrate turbomind python api ( #484 )
...
* integrate turbomind python api
* update
* update user guide
* update
* fix according to reviewer's comments
* fix error
* fix linting
* update user guide
* remove debug log
---------
Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
2023-11-21 22:34:46 +08:00
Songyang Zhang
d925748266
[Feature] Support 360API and FixKRetriever for CSQA dataset ( #601 )
...
* [Feature] Support 360API and FixKRetriever for CSQA dataset
* Update API
* Update API
* [Feature] Support 360API and FixKRetriever for CSQA dataset
* Update API
* Update API
* rm mathbench
* fix_lint
* Update opencompass/models/bytedance_api.py
Co-authored-by: Hubert <42952108+yingfhu@users.noreply.github.com>
* update
* update
* update
---------
Co-authored-by: Hubert <42952108+yingfhu@users.noreply.github.com>
2023-11-21 20:25:47 +08:00
Yang Yong
d3b0d5c4ce
[Feature] Support Lightllm API ( #613 )
...
* [Feature] Support Lightllm api
* formatting & renaming
---------
Co-authored-by: Leymore <zfz-960727@163.com>
2023-11-21 19:18:40 +08:00
Yuan Feng
7199acc25d
Add support for DataCanvas Alaya LM ( #612 )
...
* Support for Alaya
* Remove useless requirements
2023-11-21 17:51:30 +08:00
liushz
dbacd36379
Add arith to mathbench ( #607 )
2023-11-20 19:40:41 +08:00
liushz
c9c5c5d92e
Mathbench update postprocess ( #600 )
...
* Update mathbench
* Update mathbench
2023-11-20 16:48:55 +08:00
Jingming
5e75e29711
[Feature] Add multi-prompt generation demo ( #568 )
...
* [Feature] Add multi-prompt generation demo
* [Fix] change form in winogrande_gen_XXX.py
* [Fix] make multi prompt demo more direct
* [Fix] fix bug
* [Fix] minor fix
---------
Co-authored-by: yingfhu <yingfhu@gmail.com>
2023-11-20 16:16:37 +08:00
Hubert
91fba2c2e9
[Feat] support humaneval and mbpp pass@k ( #598 )
...
* [Feat] support pass@ k
* [Feat] support pass@k
* [Feat] support pass@k
* [Feat] support pass@k
* [Feat] support pass@k
* [Feat] support pass@k docs
* update naming
---------
Co-authored-by: Leymore <zfz-960727@163.com>
2023-11-16 21:22:06 +08:00
Raymond Zhang
c0acd06b05
[Feature] Add FinanceIQ dataset ( #596 )
2023-11-16 17:47:57 +08:00
Hubert
fcab30f82e
[Fix] change save_every defaults to 1 ( #592 )
2023-11-15 13:00:25 +08:00
Fengzhe Zhou
19ad7f9613
fix cmb dataset ( #587 )
2023-11-14 16:13:39 +08:00
Wei Jueqi
14e6fe6f13
Fix bugs in subjective evaluation ( #589 )
...
* rename
* fix sub bugs and update docs
* update
* update
2023-11-14 16:11:55 +08:00
Fengzhe Zhou
1ea88d5822
[Sync] Bump version to 0.1.8 ( #576 )
2023-11-13 16:00:38 +08:00
Fengzhe Zhou
d3de5c41fb
[Sync] update model configs ( #574 )
2023-11-13 15:15:34 +08:00
Fengzhe Zhou
689ffe5b63
[Feature] Use dataset in local path ( #570 )
...
* update commonsenseqa
* update drop
* update flores_first100
* update gsm8k
* update humaneval
* update lambda
* update obqa
* update piqa
* update race
* update siqa
* update story_cloze
* update strategyqa
* update tydiqa
* update winogrande
* update doc
* update hellaswag
* fix obqa
* update collections
* update .zip name
2023-11-13 13:00:37 +08:00
Fengzhe Zhou
d6aaac22e7
[Feature] Update cmb ( #571 )
2023-11-13 00:09:05 +08:00
Songyang Zhang
9e42cb163b
[Feature] Update xunfei api ( #572 )
...
* update xunfei api
* fix lint
* avoid warning
2023-11-10 22:46:06 +08:00
jingmingzhuo
b3cbef3226
[Feature] Add py150 and maxmin ( #562 )
...
* [feat] add clozeTest_maxmin dataset
* [feat] add py150 datasets
* [feat] change __init__.py in opencompass/datasets
* [fix] pre-commit check
* [fix] rename py150 and maxmin datasets in configs
* [feat] add gen.py of py150 and maxmin in configs/datasets
2023-11-09 22:05:25 +08:00
Hubert
889a6b26ae
[Fix] fix log re-direct ( #564 )
2023-11-09 19:34:19 +08:00
Hubert
cf5a6d1ab7
[Fix] fix unnecessary import and update requirements ( #555 )
2023-11-08 17:58:49 +08:00
Hubert
9f8a721313
[Fix] fix registry error with internal ( #551 )
...
* [Fix] fix conflict with internal
* [Fix] fix conflict with internal
2023-11-07 20:01:23 +08:00
Hubert
bb2ecf416e
[Feat] Support cibench ( #538 )
...
* [Feat] support cidataset
* [Feat] support cidataset
* [Feat] support cidataset
* [Feat] support cidataset
* minor fix
* minor fix
* minor fix
* minor fix
* minor fix
* minor fix
* rename cibench
* rename cibench
* rename cibench
* rename cibench
* minor fix
* minor fix
* minor fix
2023-11-07 19:11:44 +08:00
Songyang Zhang
239c2a346e
[Feature] Add support for MiniMax API ( #548 )
...
* update requirement
* update requirement
* update with minimax
* update api model
* Update readme
* fix error
---------
Co-authored-by: zhangsongyang <zhangsongyang@pjlab.org.cn>
2023-11-06 21:57:32 +08:00
Hubert
1ccdfaa623
[Feat] support xunfei api ( #547 )
2023-11-06 19:29:26 +08:00
Yuan Liu
6e31520128
[Feature]: To be compatible with the latest version of MiniGPT-4 ( #539 )
...
* [Feature]: To be compatible with the latest version of MiniGPT-4
* [Feature]: Use try and except
Co-authored-by: Fengzhe Zhou <zfz-960727@163.com>
* [Fix]: Fix lint
---------
Co-authored-by: bensenliu <bensenliu@tencent.com>
Co-authored-by: Fengzhe Zhou <zfz-960727@163.com>
2023-11-04 09:50:36 +08:00
bittersweet1999
f25a980043
[Feat] Add an open-source dataset TabMWP ( #505 )
...
* TabMWP
* TabMWP
* fixed
* fixed
* fixed
* done
* done
* done
---------
Co-authored-by: caomaosong <caomaosong@pjlab.org.cn>
2023-11-03 11:15:46 +08:00
Hubert
b9270c3a60
[Fix] Fix local debug mode not restricting resources ( #522 )
...
* [Fix] fix local debug mode not restricting resources
* minor fix
2023-10-30 18:13:43 +08:00
Qing
e2355a2ede
[Feature] Add multi model viz ( #509 )
...
* add viz_multi_model.py tool
* Modify the viz_multi_model.py script according to the review
* highlight multiple optimal scores
---------
Co-authored-by: wq.chu <wq.chu@tianrang-inc.com>
Co-authored-by: Leymore <zfz-960727@163.com>
2023-10-30 12:11:33 +08:00
Fengzhe Zhou
6a398d171c
Bump version to 0.1.7 ( #518 )
2023-10-27 20:32:27 +08:00
Fengzhe Zhou
dbb20b8270
[Sync] update ( #517 )
2023-10-27 20:31:22 +08:00
Hubert
6f07af3039
[Feat] Support local runner for windows ( #515 )
2023-10-27 17:16:22 +08:00
Fengzhe Zhou
df07391ed8
[Fix] Enforce do_sample=False in HF model ( #506 )
...
* update hf model wrapper
* patch llama
---------
Co-authored-by: bot <bot@bot.com>
2023-10-27 16:54:19 +08:00
Wei Jueqi
b62842335d
[Doc] Update Subjective docs ( #510 )
...
* rename
* add en subdoc
* fix name
* fix writing
* update
---------
Co-authored-by: Leymore <zfz-960727@163.com>
2023-10-27 16:27:24 +08:00
Fengzhe Zhou
e3d4901bed
[Feat] Add _set_model_kwargs_torch_dtype for HF model ( #507 )
...
* add _set_model_kwargs_torch_dtype for hf models
* add logger
2023-10-27 11:45:41 +08:00
Fengzhe Zhou
6405cd2db5
use example summarizer by default ( #508 )
2023-10-27 11:45:29 +08:00
Hubert
b3f5d9e421
[Feat] support math/gsm8k agent config ( #494 )
...
* support math agent
* support gsm8k agent
* support gsm8k agent
* minor fix
* minor fix
* minor fix
* Update configs/eval_codeagent.py
2023-10-25 23:05:15 +08:00
Hubert
ac3a2c4501
[Feat] local api speed up with fixed concurrent users ( #497 )
...
* [Feat] local api speed up
* fix lint
* fix lint
* minor fix
* add example api
2023-10-25 21:12:20 +08:00
Leymore
4dd9a3fc10
[Sync] sync with internal codes 20231019 ( #488 )
2023-10-18 23:37:35 -05:00
liushz
2737249f31
[Feature] Add mathbench dataset and circular evaluator ( #408 )
...
* add_mathbench
* update mathbench
* support non circular eval dataset
---------
Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>
Co-authored-by: yingfhu <yingfhu@gmail.com>
2023-10-18 04:08:31 -05:00
Leymore
fccfcb6f5b
fix summary default ( #483 )
2023-10-17 11:32:38 +08:00
Leymore
6317da08b3
Bump version to 0.1.6 ( #478 )
2023-10-13 06:54:51 -05:00
Leymore
7d9e386821
[Fix] Split if and only if complete eos string shows up ( #477 )
2023-10-13 06:52:20 -05:00
Leymore
861942ab1b
[Feature] Add lawbench ( #460 )
...
* add lawbench
* update requirements
* update
2023-10-13 06:51:36 -05:00
Leymore
fbf5089c40
[Sync] update github token ( #475 )
2023-10-13 06:50:54 -05:00
Leymore
362c33dff4
fix jieba rouge ( #467 )
2023-10-12 10:25:19 +08:00
Leymore
d7ff933a73
[Fix] Use jieba rouge in lcsts ( #459 )
...
* use jieba rouge in lcsts
* use rouge_chinese
2023-10-09 10:10:33 +08:00
Tong Gao
119bfd1569
[Refactor] Move fix_id_list to Retriever ( #442 )
...
* [Refactor] Move fix_id_list to Retriever
* update
* move to base
* fix
2023-10-07 12:53:41 +08:00
Lyu Han
6738247142
Integrate turbomind inference via its RPC API instead of its python API ( #414 )
...
* support tis
* integrate turbomind inference via its RPC API instead of its python API
* update guide
* update ip address spec
* update according to reviewer's comments
2023-10-07 10:27:48 +08:00
Leymore
9db5652638
[Feature] re-implement ceval load dataset ( #446 )
2023-09-27 21:18:48 +08:00
Hubert
d9f3e88dfe
[Fix] fix clp potential error and support bs>1 ( #439 )
...
* [Fix] fix clp potential error and support bs>1
* [Fix] fix clp potential error and support bs>1
* minor fix
* minor fix
2023-09-27 16:32:57 +08:00
philipwangOvO
3bb3d330eb
[Sync] Update LongEval ( #443 )
2023-09-27 16:32:40 +08:00
Tong Gao
9b21613d17
Bump version to 0.1.5 ( #432 )
2023-09-22 19:17:23 +08:00
chenbohua3
b2926eac8f
[Feature] support customize config path ( #423 )
...
* support customize config path
* support customize config path
* support customize config path
2023-09-22 19:12:02 +08:00
liushz
c5224c2a91
[Feature] Add kaoshi dataset ( #392 )
...
* Add ToT method
* Update ToT
* Update ToT
* Update ToT
* Update ToT
* Update ToT
* Add Kaoshi
* Update Kaoshi
* Update Kaoshi
* Update kaoshi
* Update kaoshi
* Update Kaoshi
* Update Kaoshi
* Update Kaoshi
* Update Kaoshi
* update Kaoshi
* update
* update
* fix
---------
Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2023-09-22 18:46:33 +08:00
TTTTTiam
2a62bea1a4
add evaluation of scibench ( #393 )
...
* add evaluation of scibench
* add evaluation of scibench
* update scibench
* remove scibench evaluator
---------
Co-authored-by: Leymore <zfz-960727@163.com>
2023-09-22 17:42:08 +08:00
Tong Gao
07574fddbb
[Fix] keep keys ( #431 )
2023-09-22 17:30:54 +08:00
Tong Gao
a1ea3c094a
[Sync] Initial support of subjective evaluation ( #421 )
...
Co-authored-by: Leymore <zfz-960727@163.com>
2023-09-22 15:42:31 +08:00
Ma Zerun
0f2c388280
Support GSM8k evaluation with tools by Lagent and LangChain ( #277 )
...
* Support GSM8k evaluation with tools by Lagent and LangChain
* Avoid using new MMEngine features
* update document
---------
Co-authored-by: Leymore <zfz-960727@163.com>
2023-09-22 15:28:22 +08:00
Tong Gao
681d3013de
[Feature] Log gold answer in prediction output ( #419 )
...
* [Feature] Log gold answer in prediction output
* support clp golden ans
* minor fix
---------
Co-authored-by: yingfhu <yingfhu@gmail.com>
2023-09-22 12:44:40 +08:00
Yike Yuan
97fdc51102
[Fix] Fix performance issue of visualglm. ( #424 )
...
* [Fix] Visualglm performance fixed.
* [Fix] Hide ckpt path.
2023-09-21 19:54:23 +08:00
Hubert
8803f7f7a6
[Feat] support anthropics evals dataset ( #422 )
...
* [Feat] support anthropics ai risk dataset
* [Feat] support anthropics evals dataset
* [Feat] support anthropics evals dataset
2023-09-20 18:36:44 +08:00
Leymore
ae0cd8752f
[Feature] Use local accuracy from hf implements ( #416 )
...
* use local accuracy from hf implements
* add load from hf fallback
2023-09-20 16:35:22 +08:00
Zequn Liu
ff2c15a09f
[fix] summarizer debug logger ( #417 )
2023-09-20 15:29:26 +08:00
Yike Yuan
bd50bad8b5
[Feat] Support mm models on public dataset and fix several issues. ( #412 )
...
* [Feat] Add public dataset support for visualglm, qwenvl, and flamingo
* [Fix] MMBench related changes.
* [Fix] Openflamingo inference.
* [Fix] Hide ckpt path.
* [Fix] Pre-commit.
---------
Co-authored-by: Haodong Duan <dhd.efz@gmail.com>
2023-09-19 19:08:44 +08:00
Yuanhan Zhang
7c2726c23b
[Model] Yhzhang/add mlugowl llamaadapter ( #405 )
...
* refine gitignore
* [Feature]: Add minigpt-4
* [Feature]: Add mm local runner
* [Feature]: Add instructblip
* add otter and llama-adapter
* add owl
* add llama2-adapter and owl
* lint
* [Feature]: Add minigpt-4
* [Feature]: Add instructblip
* add otter and llama-adapter
* add owl
* add llama2-adapter and owl
* lint
* lint
* update
* lint
* lint
* add __init__.py
* update
* update
* update
* update
* [Feature]: Add minigpt-4
* [Feature]: Add mm local runner
* [Feature]: Add instructblip
* add otter and llama-adapter
* add owl
* add llama2-adapter and owl
* lint
* [Feature]: Add minigpt-4
* [Feature]: Add instructblip
* add otter and llama-adapter
* add owl
* add llama2-adapter and owl
* lint
* lint
* update
* lint
* lint
* add __init__.py
* update
* update
* update
* update
* optimize mmbench dataset args
* update
* update
* run commit hook
---------
Co-authored-by: liuyuan <3463423099@qq.com>
Co-authored-by: kennymckormick <dhd@pku.edu.cn>
Co-authored-by: kennymckormick <dhd.efz@gmail.com>
2023-09-19 14:21:26 +08:00
so2liu
267401bded
[Feat] add custom summarizer argument in CLI run mode ( #411 )
...
* feat: add custom summarizer in CLI run mode
* feat: search local config by match_cfg_file
2023-09-18 18:11:22 +08:00