546 Commits

Author SHA1 Message Date
bittersweet1999
001e77fea2
[Feature] add support for gemini (#931)
* add gemini

* add gemini

* add gemini
2024-02-28 19:38:34 +08:00
Fengzhe Zhou
9afbfa3639
[Sync] Fix TEvalEvaluator (#929) 2024-02-28 16:05:30 +08:00
Fengzhe Zhou
ba7cd58da3
[Update] Rename dataset pack (#922) 2024-02-28 10:54:04 +08:00
Fengzhe Zhou
5ce8e0450e
[Fix] Fix type hint in IFEval (#915) 2024-02-28 10:53:40 +08:00
Jingming
53fe788d27
[Fix] fix ifeval (#909) 2024-02-23 16:52:03 +08:00
bittersweet1999
45c606bcd0
[Fix] Fix IFEval (#906)
* fix ifeval

* fix ifeval

* fix ifeval

* fix ifeval
2024-02-22 16:51:34 +08:00
RunningLeon
32ba0b074e
Support lmdeploy pytorch engine (#875)
* add lmdeploy pytorch model

* fix

* speed up encoding and decoding

* fix

* change tokenizer
2024-02-22 03:46:07 -03:00
Xu Song
6d04decab4
[Fix] Fix moss template config (#897) 2024-02-21 11:19:24 +08:00
Fengzhe Zhou
2b7d376e3d
[Fix] Fix chatglm2 config (#893) 2024-02-19 14:55:53 +08:00
Fengzhe Zhou
9119e2ac39
[Fix] rename qwen2-beta -> qwen1.5 (#894) 2024-02-19 14:55:35 +08:00
Yang Yong
b6e21ece38
Support LightllmApi input_format (#888) 2024-02-19 10:02:59 +08:00
Fengzhe Zhou
08133e060a
[Sync] Bump version to 0.2.2 (#880) 2024-02-07 10:45:48 +08:00
hailsham
e257254b00
[Feature] add global retriever config (#842)
* add global retriever config

* give zero shot overwrite example

* give zero shot overwrite example

---------

Co-authored-by: Lei Fei <SENSETIME\leifei1@cn3114002087l.domain.sensetime.com>
Co-authored-by: Leymore <zfz-960727@163.com>
2024-02-07 00:30:20 +08:00
hailsham
dd444685bb
fix bug of gsm8k_postprocess (#863)
* fix bug of gsm8k_postprocess

* update postprocess

---------

Co-authored-by: Lei Fei <SENSETIME\leifei1@cn3114002087l.domain.sensetime.com>
Co-authored-by: Leymore <zfz-960727@163.com>
2024-02-06 23:52:47 +08:00
Connor-Shen
444d8d9507
[feat] support multipl-e (#846)
* [feat] support humaneval_multipl-e

* format

---------

Co-authored-by: Leymore <zfz-960727@163.com>
2024-02-06 23:30:28 +08:00
Yggdrasill7D6
a6c49f15ce
fix lawbench 2-1 f0.5 score calculation bug (#795)
* fix lawbench 2-1 f0.5 score calculation bug

* use path in overall datasets folder

---------

Co-authored-by: Leymore <zfz-960727@163.com>
2024-02-06 22:20:11 +08:00
bittersweet1999
1c8e193de8
[Fix] hotfix for mtbench (#877)
* hotfix for mtbench

* hotfix
2024-02-06 21:26:47 +08:00
Fengzhe Zhou
d34ba11106
[Sync] Merge branch 'dev' into zfz/update-keyset-demo (#876) 2024-02-05 23:29:10 +08:00
bittersweet1999
32b5948f4e
[Fix] add do sample demo for subjective dataset (#873)
* add do sample demo for subjective dataset

* fix strings

* format

---------

Co-authored-by: Leymore <zfz-960727@163.com>
2024-02-05 15:55:58 +08:00
Skyfall-xzz
7ad1168062
Support NPHardEval (#835)
* support NPHardEval

* add .md file and fix minor bugs

* refactor and minor fix

---------

Co-authored-by: Leymore <zfz-960727@163.com>
2024-02-05 15:52:28 +08:00
zhulinJulia24
b4a9acd7be
Update daily test (#871)
* add daily test case

* Update pr-run-test.yml

* Update daily-run-test.yml

* Update daily-run-test.yml

* Update pr-run-test.yml

* Update daily-run-test.yml

* Update oc_score_assert.py

* Update daily-run-test.yml

* Update daily-run-test.yml

* Update daily-run-test.yml

* update testcase baseline

* fix test case name

* add more models into daily test

---------

Co-authored-by: zhulin1 <zhulin1@pjlab.org.cn>
Co-authored-by: Leymore <zfz-960727@163.com>
2024-02-05 15:52:00 +08:00
Fengzhe Zhou
fc84aff963
[CI] Update github workflow cuda image (#874)
* update workflow

* another trial

* another trial

* another trial
2024-02-05 15:22:59 +08:00
Yuchen Yan
fed7d800c6
[Fix] Fix error in gsm8k evaluator (#782)
Co-authored-by: jiangjin1999 <1261842974@qq.com>
2024-02-04 22:55:11 +08:00
bittersweet1999
7806cd0f64
[Feature] support alpacaeval (#809)
* support alpacaeval_v1

* Update opencompass/summarizers/subjective/__init__.py

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>

* Update opencompass/summarizers/subjective/alpacaeval_v1.py

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>

* fix conflict

* support alpacaeval v2

* support alpacav2

---------

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
2024-02-04 14:18:36 +08:00
zhulinJulia24
0919b08ec8
[Feature] Add daily test case (#864)
* add daily test case

* Update pr-run-test.yml

* Update daily-run-test.yml

* Update daily-run-test.yml

* Update pr-run-test.yml

---------

Co-authored-by: zhulin1 <zhulin1@pjlab.org.cn>
2024-02-02 12:03:05 +08:00
RunningLeon
4c87e777d8
[Feature] Add end_str for turbomind (#859)
* fix

* update

* fix internlm1

* fix docs

* remove sys
2024-02-01 22:31:14 +08:00
bittersweet1999
5c6dc908cd
fix compass arena (#854) 2024-01-30 16:34:38 +08:00
Guo Qipeng
4f78388c71
Update runtime.txt to fix rouge_chinese bugs. (#803)
* Update runtime.txt to fix rouge_chinese bugs.

the wheel file of rouge_chinese will overwrite the rouge package, causing bugs. Replacing it to the github code, which is the correct version.

* fix PEP format issues

* fix PEP format issues

* enable pip install

---------

Co-authored-by: 郭琦鹏 <guoqipeng@pjlab.org.cn>
Co-authored-by: Leymore <zfz-960727@163.com>
2024-01-29 19:18:22 +08:00
del-zhenwu
e8067ac456
Create link-check.yml (#853)
* Create link-check.yml

* Update link-check.yml
2024-01-29 19:16:52 +08:00
Songyang Zhang
cdca59ff49
[Fix] Update Zhipu API and Fix issue min_out_len issue of API models (#847)
* Update zhipu api and fix min_out_len issue of API class

* Update example

* Update example
2024-01-28 14:52:43 +08:00
Jingming
2801883351
[Fix] Fix acc of IFEval (#849)
* [Feature] Add IFEval

* [Fix] Changing the Score Rule.
2024-01-27 22:27:07 +08:00
Xiaoming Shi
35aace776a
[Fix] Update MedBench (#845) 2024-01-26 17:56:13 +08:00
Songyang Zhang
8ed022b4c4
Update Sensetime API (#844) 2024-01-26 16:40:49 +08:00
Hubert
4aa74565e2
[Feat] minor update agent related (#839)
* [Feat] update cibench

* [Feat] Support CIBench

* [Feat] Support CIBench

* [Feat] Support CIBench

* [Feat] Support CIBench
2024-01-26 14:15:51 +08:00
bittersweet1999
77be07dbb5
[Fix] fix corev2 (#838)
* fix corev2

* fix corev2
2024-01-24 18:15:29 +08:00
Fengzhe Zhou
0991dd33a0
[Sync] Updata dataset cfg for internMath (#837)
Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>
2024-01-24 16:30:32 +08:00
zhulinJulia24
f7d7837ac0
add fail notify (#836) 2024-01-24 14:26:30 +08:00
Fengzhe Zhou
f367551668
update doc (#830) 2024-01-24 13:39:28 +08:00
Songyang Zhang
793e32c9cc
[Feature] Update API implementation (#834) 2024-01-24 13:35:21 +08:00
bittersweet1999
2ee8e8a1a1
[Feature] add mtbench (#829)
* add mtbench

* add mtbench

* Update configs/datasets/subjective/multiround/mtbench_judgeby_gpt4.py

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>

* Update configs/datasets/subjective/multiround/mtbench_judgeby_gpt4.py

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>

* Update opencompass/datasets/subjective/__init__.py

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>

* Update opencompass/datasets/subjective/mtbench.py

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>

* fix mtbench

---------

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
2024-01-24 12:11:47 +08:00
Jingming
e059a5c2bf
[Feature] Add IFEval (#813)
* [Feature] Add IFEval

* [Doc] add introduction of IFEval
2024-01-23 20:07:49 +08:00
bittersweet1999
3d9bb4aed7
[Fix] fix strings (#833)
* add compass arena

* add compass_arena

* add compass arena

* Update opencompass/summarizers/subjective/compass_arena.py

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>

* Update opencompass/summarizers/subjective/__init__.py

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>

* Update opencompass/datasets/subjective/compass_arena.py

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>

* Update opencompass/datasets/subjective/__init__.py

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>

* Update configs/eval_subjective_compassarena.py

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>

* Update configs/datasets/subjective/compassarena/compassarena_compare.py

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>

* Update configs/eval_subjective_compassarena.py

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>

* Update configs/datasets/subjective/compassarena/compassarena_compare.py

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>

* fix check position bias

* fix string

---------

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
2024-01-23 10:57:26 +00:00
bittersweet1999
2d4da8dd02
[Feature] Add CompassArena (#828)
* add compass arena

* add compass_arena

* add compass arena

* Update opencompass/summarizers/subjective/compass_arena.py

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>

* Update opencompass/summarizers/subjective/__init__.py

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>

* Update opencompass/datasets/subjective/compass_arena.py

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>

* Update opencompass/datasets/subjective/__init__.py

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>

* Update configs/eval_subjective_compassarena.py

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>

* Update configs/datasets/subjective/compassarena/compassarena_compare.py

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>

* Update configs/eval_subjective_compassarena.py

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>

* Update configs/datasets/subjective/compassarena/compassarena_compare.py

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>

* fix check position bias

---------

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
2024-01-23 15:12:46 +08:00
RangiLyu
40a2441deb
Update hf_internlm2_chat template (#823)
* Update hf_internlm2_chat template

* Update 20B
2024-01-19 18:21:47 +08:00
Guo Qipeng
e975a96fa1
Update cdme config and evaluator (#812)
* update cdme config and evaluator

* fix cdme prompt

* move CDME trim post-processor as a separate evaluator

---------

Co-authored-by: 郭琦鹏 <guoqipeng@pjlab.org.cn>
2024-01-19 11:29:27 +08:00
Yang Yong
f09a2ff418
Add LightllmApi KeyError log & Update doc (#816)
* Add LightllmApi KeyError log

* Update LightllmApi doc
2024-01-18 22:23:38 +08:00
zhulinJulia24
8b5c467cc5
Test runner update - split step, change schedule time and disable hf cache (#814)
* Update pr-run-test.yml

* Update pr-run-test.yml

* Update pr-run-test.yml

* split step and change order, change schedule time and disable hf cache
2024-01-18 21:04:41 +08:00
Mo Li
dcc32ed856
[Fix] Update yi 200k config (#815) 2024-01-18 20:54:24 +08:00
RunningLeon
61fe873c89
[Fix] Fix turbomind and update docs (#808)
* update

* update docs

* add engine_config and gen_config in eval_config

* update

* fix

* fix

* fix

* fix docstr

* fix url
2024-01-18 14:41:35 +08:00
Fengzhe Zhou
9e5746d3d8
[Doc] Update News (#810) 2024-01-17 18:22:12 +08:00