737 Commits

Author SHA1 Message Date
bittersweet1999
a2e9bc0c41
[Fix] fix duplicate error in partitioner (#1552)
* fix pip version

* fix pip version

* fix duplicate error in partitioner

* fix duplicate error in partitioner
2024-09-23 19:45:21 +08:00
x54-729
335667183a
[Feature] Add Interntrain model support (#1548)
Co-authored-by: x54-729 <xingshuhao.dispatch@pjlab.org.cn>
2024-09-23 19:10:26 +08:00
klein
24915aeb3f
[BUG] Update CIbench config (#1544)
* BUG: Update cibench.py

* BUG: Update cibench.py
2024-09-23 18:32:27 +08:00
liushz
a0cfd61129
[Feature] Update MathBench & Math base model config (#1550)
* Update MathBench & WikiBench for FullBench

* Update MathBench & WikiBench for FullBench

* Update GPQA & MMLU_Pro

* Update MathBench & WikiBench for FullBench

* Update MathBench & WikiBench for FullBench

* Update MathBench & WikiBench for FullBench

* Update MathBench & Math base config

---------

Co-authored-by: liushz <liuhongwei@pjlab.rog.cn>
2024-09-23 14:03:59 +08:00
Songyang Zhang
ee058e25b2
[Feature] Support verbose for OpenAI API (#1546)
2024-09-20 17:12:52 +08:00
hailsham
a81bbb85bf
[FIX] Added handling for the "begin section" in meta_template to APITemplateParser (#1405)
Co-authored-by: leifei <nuuooo@icloud.com>
2024-09-19 18:12:04 +08:00
Songyang Zhang
5a27c2bd6f
[Model] Support Qwen2.5 Instruct (#1543)
2024-09-19 16:16:07 +08:00
Songyang Zhang
be460fbb21
[Feature] Support OpenAI O1 models (#1539)
* [Feature] Support OpenAI O1 models

* Update README.md

---------

Co-authored-by: liushz <qq1791167085@163.com>
2024-09-18 22:41:17 +08:00
liushz
2e9db77d57
[Feature] Add custom model postprocess function (#1519)
Co-authored-by: liushz <liuhongwei@pjlab.rog.cn>
2024-09-18 14:40:51 +08:00
liushz
c9a7026f59
[Feature] Update MathBench & WikiBench for FullBench (#1521)
* Update MathBench & WikiBench for FullBench

* Update MathBench & WikiBench for FullBench

* Update GPQA & MMLU_Pro

* Update MathBench & WikiBench for FullBench

* Update MathBench & WikiBench for FullBench

* Update MathBench & WikiBench for FullBench

---------

Co-authored-by: liushz <liuhongwei@pjlab.rog.cn>
2024-09-18 14:35:30 +08:00
Songyang Zhang
cfbd308edf
[Doc] Update README (#1528)
* '

* Update
2024-09-14 16:02:17 +08:00
Linchen Xiao
90279b6461
[Feature] Dataset prompts update for ARC, BoolQ, Race (#1527)
2024-09-13 10:30:43 +08:00
Songyang Zhang
6997990c93
[Feature] Update Models (#1518)
* Update Models

* Update

* Update humanevalx

* Update

* Update
2024-09-12 23:35:30 +08:00
zhulinJulia24
3754dc1b67
update (#1522)
Co-authored-by: zhulin1 <zhulin1@pjlab.org.cn>
2024-09-12 15:00:52 +08:00
bittersweet1999
7c7fa36235
[Feature] add support for internal Followbench (#1511)
* fix pip version

* fix pip version

* add internal followbench

* add internal followbench

* fix lint

* fix lint
2024-09-11 13:32:34 +08:00
Linchen Xiao
317763381c
update (#1517)
2024-09-11 13:31:20 +08:00
bittersweet1999
c2bcd8725e
[Fix] Fix wildbench (#1508)
* fix pip version

* fix pip version

* fix_wildbench
2024-09-10 17:35:07 +08:00
Alexander Lam
a31a77c5c1
[Feature] Add SciCode summarizer config (#1514)
* [Feature] added SciCode summarizer config and dataset config for with-background evaluation

* fix lint issues

* removed unnecessary type in summarizer group
2024-09-10 16:06:02 +08:00
Mo Li
5b93592242
[Fix] Fix link-check workflow by adjusting line breaks in URL ignore patterns (#1507)
* update link-check

* update link-check

* update link-check
2024-09-10 10:20:40 +08:00
Linchen Xiao
b5f8afb57b
[Bump] Bump version to 0.3.2.post1
2024-09-06 19:09:30 +08:00
Linchen Xiao
f04f3546bc
[Fix] Import fix (#1500)
2024-09-06 18:29:24 +08:00
Linchen Xiao
ff18545f0e
[Bump] Bump version to 0.3.2 (#1497)
2024-09-06 16:10:45 +08:00
Linchen Xiao
87ffa71d68
[Feature] Longbench dataset update
2024-09-06 15:50:12 +08:00
Albert Yan
928d0cfc3a
[Feature] Add support for Rendu API (#1468)
* Add support for Rendu API

* fix lint issue

* fix lint issue

* fix lint issue

* Update

---------

Co-authored-by: 13190 <zeyu.yan@transn.com>
Co-authored-by: tonysy <sy.zhangbuaa@gmail.com>
2024-09-06 01:00:43 +08:00
Hari Seldon
faf5260155
[Feature] Optimize Evaluation Speed of SciCode (#1489)
* update scicode

* update comments

* remove redundant variable

* Update

---------

Co-authored-by: tonysy <sy.zhangbuaa@gmail.com>
2024-09-06 00:59:41 +08:00
liushz
00fc8da5be
[Feature] Add model postprocess function (#1484)
* Add model postprocess function

* Add model postprocess function

* Add model postprocess function

* Add model postprocess function

* Add model postprocess function

* Add model postprocess function

* Add model postprocess function

* Add model postprocess function

---------

Co-authored-by: liushz <liuhongwei@pjlab.rog.cn>
2024-09-05 21:10:29 +08:00
Maxime SHE
45efdc994d
[Feature] Add an attribute api_key into TurboMindAPIModel default None (#1475)
Co-authored-by: Maxime <maximeshe@163.com>
Add an api_key attribute (default None) to TurboMindAPIModel so that the api_key can be set when using lmdeploy to deploy the LLM.
2024-09-05 17:51:16 +08:00
Linchen Xiao
6c9cd9a260
[Feature] Needlebench auto-download update (#1480)
* update

* update

* update
2024-09-05 17:22:42 +08:00
zhulinJulia24
716d46e1f5
[ci] fix badcase and add env info (#1491)
* update

* update

---------

Co-authored-by: zhulin1 <zhulin1@pjlab.org.cn>
2024-09-05 16:43:45 +08:00
zhulinJulia24
fb6a0df652
[ci] fix test env for vllm and add vllm baselines (#1481)
* update

---------

Co-authored-by: zhulin1 <zhulin1@pjlab.org.cn>
2024-09-04 19:24:09 +08:00
Linchen Xiao
da74cbfa39
[Fix] Model configs update
2024-09-04 18:57:10 +08:00
Linchen Xiao
95aad6c282
[Fix] Requirements update
2024-09-03 18:50:40 +08:00
Linchen Xiao
9693be46b7
[Feature] Mmlu-pro auto-download (#1464)
* update

* update

* update

* update

* update
2024-08-30 10:03:40 +08:00
zhulinJulia24
f34209766d
[ci] fix test env (#1470)
* Update daily-run-test.yml

* Update daily-run-test.yml

* Update pr-run-test.yml

* Update daily-run-test.yml

* update

* update

---------

Co-authored-by: zhulin1 <zhulin1@pjlab.org.cn>
2024-08-29 14:48:17 +08:00
Alexander Lam
8b39225259
[Feature] Added extra_body support for OpenAISDK; Added support for proxy URL when connecting to OpenAI's API. (#1467)
* fix lint issues

* fix lint issues
2024-08-29 00:43:43 +08:00
Guoli Yin
a488b9b4f5
[Feature] Make OPENAI_API_BASE compatible with openai default env (#1461)
* Make OPENAI_API_BASE compatible with openai default env

* Make OPENAI_API_BASE compatible with openai default env

---------

Co-authored-by: Guoli Yin <gyin@icloud.com>
2024-08-28 23:14:41 +08:00
Songyang Zhang
e5a8eb2283
[Feature] Update Lint and Leaderboard (#1458)
* [Feature] Update Lint and Leaderboard

* Update

* Update
2024-08-28 22:36:42 +08:00
Linchen Xiao
245664f4c0
[Feature] Fullbench v0.1 language update (#1463)
* update

* update

* update

* update
2024-08-28 14:01:05 +08:00
CHEN PENGAN
463231c651
[Feature] Add icl_sliding_k_retriever.py and update __init__.py (#1305)
* Add icl_sliding_k_retriever.py and update __init__.py

* Fix flake8, isort, and yapf issues for Sliding Window Retriever
2024-08-23 17:18:31 +08:00
Linchen Xiao
94b6bd65fc
[Fix] Fix cli evaluation for multiple models (#1454)
* update

* update
2024-08-23 17:15:36 +08:00
Linchen Xiao
2295a33a18
[Doc] Update readme (#1453)
2024-08-23 14:11:01 +08:00
Songyang Zhang
5485207fbe
[Bump] Bump version to 0.3.1 (#1450)
* [Bump] Bump version 0.3.1

* Update
2024-08-23 10:47:57 +08:00
Songyang Zhang
7c2d25b557
[Fix] Update SciCode and Gemma model (#1449)
* [Fix] Update SciCode and Gemma model

* Update

* Update
2024-08-23 10:42:27 +08:00
Xu Song
ad3931aa32
Update openicl_infer.py (#1308)
2024-08-23 10:39:22 +08:00
zhulinJulia24
fb69ba5eb8
[CI] add command testcase into daily testcase (#1447)
* update

---------

Co-authored-by: zhulin1 <zhulin1@pjlab.org.cn>
2024-08-23 01:49:17 +08:00
liushz
9fdbc744dc
[Fix] Update option postprocess & mathbench language summarizer (#1413)
* Update option postprocess & mathbench language summarizer

* Update option postprocess & mathbench language summarizer

---------

Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>
2024-08-22 14:49:07 +08:00
Linchen Xiao
0fe9756c5d
[Doc] Update Readme (#1439)
* update
2024-08-22 14:48:45 +08:00
Hari Seldon
14b4b735cb
[Feature] Add support for SciCode (#1417)
* add SciCode

* add SciCode

* add SciCode

* add SciCode

* add SciCode

* add SciCode

* add SciCode

* add SciCode w/ bg

* add scicode

* Update README.md

* Update README.md

* Delete configs/eval_SciCode.py

* rename

* 1

* rename

* Update README.md

* Update scicode.py

* Update scicode.py

* fix some bugs

* Update

* Update

---------

Co-authored-by: root <HariSeldon0>
Co-authored-by: tonysy <sy.zhangbuaa@gmail.com>
2024-08-22 13:42:25 +08:00
liushz
d3963bceae
[Bug] Add model support for 'huggingface_above_v4_33' when using '-a' (#1430)
Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>
2024-08-22 13:40:24 +08:00
seetimee
ac093fce53
[Update] Update openai_api.py (#1438)
Most models' token limits are above 32k. This fixes a bug where long-context dataset tests skipped some data.
2024-08-21 18:57:49 +08:00