Fengzhe Zhou
3a68083ecc
[Sync] update configs ( #734 )
2023-12-25 21:59:16 +08:00
AllentDan
336d8d76ff
add turbomind restful api support ( #693 )
...
* add turbomind restful api support
* config
* top_p 0.8
* top_k = 1
2023-12-24 01:40:00 +08:00
bittersweet1999
e985100cd1
[Fix] Fix subjective alignbench ( #730 )
2023-12-23 20:06:53 +08:00
Mo Li
0e24f4213e
[Feature] Add NeedleInAHaystack Test Support ( #714 )
...
* Add NeedleInAHaystack Test
* Apply pre-commit formatting
* Update configs/eval_hf_internlm_chat_20b_cdme.py
Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
* add needle in haystack test
* update needle in haystack test
---------
Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
2023-12-23 12:00:51 +08:00
RunningLeon
e34c552282
[Feature] Update configs for evaluating chat models like qwen, baichuan, llama2 using turbomind backend ( #721 )
...
* add llama2 test
* fix
* test qwen chat-7b
* test w4
* add baichuan2
* update
* update
* update configs and docs
* update
2023-12-21 18:22:17 +08:00
bittersweet1999
fbb912ddf3
[Feature] Add abbr for judgemodel in subjective evaluation ( #724 )
...
* add_judgemodel_abbr
* add judgemodel abbr
2023-12-21 15:58:20 +08:00
Skyfall-xzz
b35d991786
[Feature] Add ReasonBench(Internal) dataset ( #577 )
...
* [Feature] Add reasonbench dataset
* add configs for supporting generative inference & merge datasets in the same category
* modify config filename to prompt version
* fix codes to meet pre-commit requirements
* lint the code to meet pre-commit requirements
* Align Load_data Sourcecode Briefly
* fix bugs
* reduce code redundancy
2023-12-20 17:57:42 +08:00
Jingming
76a95e9e81
[Feature] Support the use of humaneval_plus. ( #720 )
...
* [Feature] Support the use of humaneval_plus.
* [Feature] Add humaneval_plus_gen.py
* minor check
* [Fix] Fix bug
---------
Co-authored-by: yingfhu <yingfhu@gmail.com>
2023-12-20 17:25:17 +08:00
bittersweet1999
97c2068bd9
[Feature] Add JudgeLLMs ( #710 )
...
* add judgellms
* add judgellms
* add sub_size_partition
* add docs
* add ref
2023-12-19 18:40:25 +08:00
Hubert
eda72e756e
[Fix] minor fix openai ( #711 )
2023-12-18 15:45:31 +08:00
Songyang Zhang
637628a70f
[Doc] Update Doc for Alignbench ( #707 )
...
* update alignmentbench
* update alignmentbench
* update doc
* update
* update
2023-12-15 15:07:25 +08:00
DseidLi
db2920326a
[Fix] remove redundant in gsm8k.py ( #700 )
...
Removed redundant code in GSM8KDataset.load method.
2023-12-14 19:55:58 +08:00
Songyang Zhang
bfe4aa2af5
[Fix] Update alignmentbench ( #704 )
...
* update alignmentbench
* update alignmentbench
* update alignmentbench
2023-12-14 18:24:21 +08:00
bittersweet1999
1fe152b3e8
[Feature] Support AlignmentBench infer and judge ( #697 )
...
* alignmentbench infer and judge
* alignmentbench
* alignmentbench done
* alignment all done
* alignment all done
2023-12-13 19:59:30 +08:00
Hubert
a94598d921
[Feat] update python action and slurm ( #694 )
2023-12-13 10:41:10 +08:00
bittersweet1999
6130394165
[Feature] Add double order of subjective evaluation and removing duplicated response among two models ( #692 )
...
* add features
* add doc string
* add doc string
2023-12-12 20:58:17 +08:00
Hubert
4780b39eda
[Sync] format ( #690 )
...
Co-authored-by: Leymore <zfz-960727@163.com>
2023-12-12 14:03:45 +08:00
bittersweet1999
3e77175720
[Fix] Hotfix for Subjective Evaluation ( #686 )
2023-12-12 09:22:08 +08:00
bittersweet1999
465308e430
[Feature] Add Subjective Evaluation ( #680 )
...
* new version of subject
* fixed draw
* fixed draw
* fixed draw
* done
* done
* done
* done
* fixed lint
2023-12-11 22:22:11 +08:00
Hubert
4f0b373a0a
[Fix] fix docstring ( #684 )
2023-12-11 19:12:01 +08:00
Hubert
e78857ac36
[Sync] minor test ( #683 )
2023-12-11 17:42:53 +08:00
Jingming
dd4318f6ab
[Feature] enhance the ability of humaneval_postprocess ( #676 )
...
* [Feature] enhance the ability of humaneval_postprocess
* refactor
* [Feature] Keep the old version of the function and realize the new function in humaneval_postprocess_v2.
* Update opencompass/datasets/humaneval.py
---------
Co-authored-by: Leymore <zfz-960727@163.com>
Co-authored-by: Hubert <42952108+yingfhu@users.noreply.github.com>
2023-12-11 14:39:56 +08:00
Songyang Zhang
e25c5f9525
[Enhancement] Update API Interface and Mixtral ( #681 )
...
* [Enhancement] Update API interface
* [Enhancement] Update API interface
* Update mixtral
* Update readme
2023-12-10 13:29:26 +08:00
Xiaoming Shi
1bf85949ef
[Feature] Add medbench ( #678 )
...
* update medbench
* medbench update
* format medbench
* format
---------
Co-authored-by: 施晓明 <PJLAB\shixiaoming@pjnl104220118l.pjlab.org>
Co-authored-by: Leymore <zfz-960727@163.com>
2023-12-09 16:05:46 +08:00
Jingming
7cb53a95fa
[Fix] fix bug on standart_deviation summarizer ( #675 )
2023-12-08 13:38:07 +08:00
liyucheng09
05bbce8b08
[Feature] Add Data Contamination Analysis ( #639 )
...
* add contamination analysis to ceval
* fix bugs
* add contamination docs
* to pass CI check
* update
---------
Co-authored-by: zhangyifan1 <zhangyifan1@pjlab.org.cn>
Co-authored-by: Leymore <zfz-960727@163.com>
2023-12-08 10:00:11 +08:00
bittersweet1999
1c95790fdd
New subjective judgement ( #660 )
...
* TabMWP
* TabMWP
* fixed
* fixed
* fixed
* done
* done
* done
* add new subjective judgement
* add new subjective judgement
* add new subjective judgement
* add new subjective judgement
* add new subjective judgement
* modified to a more general way
* modified to a more general way
* final
* final
* add summarizer
* add new summarize
* fixed
* fixed
* fixed
---------
Co-authored-by: caomaosong <caomaosong@pjlab.org.cn>
2023-12-06 13:28:33 +08:00
rolellm
e10f1c9139
added rolebench dataset. ( #633 )
...
* added rolebench
* fixed unreasonable variable names
* fixed variable names in the comments
2023-12-01 22:54:42 +08:00
Hubert
9eb5cadcac
[Feat] update gsm8k and math agent config ( #652 )
...
* [Feat] update gsm8k and math agent config
* minor fix
2023-12-01 15:08:38 +08:00
liushz
a331c9abfd
[Feature] Add wikibench dataset ( #655 )
...
* Add WikiBench
* Add WikiBench
* format
---------
Co-authored-by: Leymore <zfz-960727@163.com>
2023-12-01 14:56:54 +08:00
liushz
e019c831fe
[Feature] Add Chinese version: commonsenseqa, crowspairs and nq ( #144 )
...
* add Chinese version: csqa crowspairs nq
* Update cn_data
* Update cn_data
* update format
---------
Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>
Co-authored-by: Leymore <zfz-960727@163.com>
2023-11-30 15:33:02 +08:00
Ma Zerun
6aaf3b91ec
[Feature] Support chat style inferencer. ( #643 )
...
* [Feature] Support chat style inferencer.
* [Fix] use new prompt
* [Fix] use new prompt
---------
Co-authored-by: yingfhu <yingfhu@gmail.com>
2023-11-30 14:00:06 +08:00
Fengzhe Zhou
e20d654c18
[Sync] Bump version to 0.1.9 ( #644 )
2023-11-28 11:42:43 +08:00
Hubert
d4af31bab4
[Feat] support zhipu post process ( #642 )
...
* [Feat] support zhipu post
* [Feat] support zhipu post
* [Feat] support zhipu post
2023-11-27 19:57:36 +08:00
liushz
6d0d78986c
[Feature] Add GSM_Hard dataset ( #619 )
...
* Add SVAMP dataset
* Add SVAMP dataset
* Add SVAMP dataset
* Add gsm_hard dataset
* Add gsm_hard dataset
* format
---------
Co-authored-by: Leymore <zfz-960727@163.com>
2023-11-27 17:40:34 +08:00
Fengzhe Zhou
9083dea683
[Sync] some renaming ( #641 )
2023-11-27 16:06:49 +08:00
Yang Yong
522241a8c9
[Fix] Fix lightllmapi list bug ( #635 )
2023-11-24 14:24:13 +08:00
Hubert
1884912674
[Bug] fix icl eval with nested list ( #632 )
2023-11-24 13:43:26 +08:00
Fengzhe Zhou
79f6449d85
[Doc] Update FAQ ( #628 )
...
* update faq
* Update docs/zh_cn/get_started/faq.md
* Update docs/en/get_started/faq.md
* Update docs/zh_cn/get_started/faq.md
---------
Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
2023-11-23 18:19:17 +08:00
Fengzhe Zhou
d949e3c003
[Feature] Add circular eval ( #610 )
...
* refactor default, add circular summarizer
* add circular
* update impl
* update doc
* minor update
* no more to be added
2023-11-23 16:45:47 +08:00
Songyang Zhang
5202456b4c
[API] Update API ( #624 )
...
* update api
* update generation_kwargs impl
* update api
* refactor
---------
Co-authored-by: Leymore <zfz-960727@163.com>
2023-11-23 15:06:20 +08:00
Fengzhe Zhou
d4d1330a5a
[Sync] Fix cmnli, fix vicuna meta template, fix longbench postprocess and other minor fixes ( #625 )
2023-11-23 14:05:59 +08:00
Kevin Wang
c0785e53d8
[Feature] support download from modelscope ( #534 )
...
* [Feature] download from modelscope
* [Feature] download from modelscope
* minor fix
---------
Co-authored-by: yingfhu <yingfhu@gmail.com>
2023-11-22 15:32:21 +08:00
liushz
048775192b
[Feature] Add SVAMP dataset ( #604 )
...
* Add SVAMP dataset
* Add SVAMP dataset
* Add SVAMP dataset
2023-11-22 14:54:39 +08:00
Fengzhe Zhou
fb30b7c7a2
[Fix] Fix gen inferencer ( #615 )
2023-11-22 12:04:31 +08:00
Songyang Zhang
721a45c68f
[Bug] Update api with generation_kargs ( #614 )
...
* update api
* update generation_kwargs impl
---------
Co-authored-by: Leymore <zfz-960727@163.com>
2023-11-22 10:02:57 +08:00
Lyu Han
eb56fd6d16
Integrate turbomind python api ( #484 )
...
* integrate turbomind python api
* update
* update user guide
* update
* fix according to reviewer's comments
* fix error
* fix linting
* update user guide
* remove debug log
---------
Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
2023-11-21 22:34:46 +08:00
Songyang Zhang
d925748266
[Feature] Support 360API and FixKRetriever for CSQA dataset ( #601 )
...
* [Feature] Support 360API and FixKRetriever for CSQA dataset
* Update API
* Update API
* [Feature] Support 360API and FixKRetriever for CSQA dataset
* Update API
* Update API
* rm mathbench
* fix_lint
* Update opencompass/models/bytedance_api.py
Co-authored-by: Hubert <42952108+yingfhu@users.noreply.github.com>
* update
* update
* update
---------
Co-authored-by: Hubert <42952108+yingfhu@users.noreply.github.com>
2023-11-21 20:25:47 +08:00
Yang Yong
d3b0d5c4ce
[Feature] Support Lightllm API ( #613 )
...
* [Feature] Support Lightllm api
* formatting & renaming
---------
Co-authored-by: Leymore <zfz-960727@163.com>
2023-11-21 19:18:40 +08:00
Yuan Feng
7199acc25d
Add support for DataCanvas Alaya LM ( #612 )
...
* Support for Alaya
* Remove useless requirements
2023-11-21 17:51:30 +08:00