Haoran Que
|
4fe251729b
|
Upload HelloBench (#1607)
* upload hellobench
* update hellobench
* update readme.md
* update eval_hellobench.py
* update latest
---------
Co-authored-by: bittersweet1999 <148421775+bittersweet1999@users.noreply.github.com>
|
2024-10-15 17:11:37 +08:00 |
|
bittersweet1999
|
fa54aa62f6
|
[Feature] Add Judgerbench and reorg subeval (#1593)
* fix pip version
* fix pip version
* update (#1522)
Co-authored-by: zhulin1 <zhulin1@pjlab.org.cn>
* [Feature] Update Models (#1518)
* Update Models
* Update
* Update humanevalx
* Update
* Update
* [Feature] Dataset prompts update for ARC, BoolQ, Race (#1527)
* add judgerbench and reorg subeval
---------
Co-authored-by: zhulinJulia24 <145004780+zhulinJulia24@users.noreply.github.com>
Co-authored-by: zhulin1 <zhulin1@pjlab.org.cn>
Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
Co-authored-by: Linchen Xiao <xxllcc1993@gmail.com>
|
2024-10-15 16:36:05 +08:00 |
|
bittersweet1999
|
3f7a3730d7
|
[Fix] fix Flames (#1599)
* fix pip version
* fix pip version
* fix flames
* fix flames
|
2024-10-12 14:34:59 +08:00 |
|
bittersweet1999
|
7c7fa36235
|
[Feature] add support for internal Followbench (#1511)
* fix pip version
* fix pip version
* add internal followbench
* add internal followbench
* fix lint
* fix lint
|
2024-09-11 13:32:34 +08:00 |
|
bittersweet1999
|
1f9f728f22
|
[Feature] support compassbench Checklist evaluation (#1339)
* fix pip version
* fix pip version
* support checklist eval
* init
* add lan
* fix typo
|
2024-07-19 16:40:44 +08:00 |
|
Fengzhe Zhou
|
a32f21a356
|
[Sync] Sync with internal codes 2024.06.28 (#1279)
|
2024-06-28 14:16:34 +08:00 |
|
klein
|
1fa62c4a42
|
Support wildbench (#1266)
Co-authored-by: Leymore <zfz-960727@163.com>
|
2024-06-24 13:16:27 +08:00 |
|
bittersweet1999
|
982e024540
|
[Feature] add dataset Fofo (#1224)
* add fofo dataset
* add dataset fofo
|
2024-06-06 11:40:48 +08:00 |
|
Xingyuan Bu
|
02a0a4e857
|
MT-Bench-101 (#1215)
* add mt-bench-101
* add readme and requirements
* add mt-bench-101 data
* Update readme_mtbench101.md
* update readme
* update leaderboard
* fix typo
* Update readme_mtbench101.md
* fit newest opencompass
* update readme.md
* mtbench101 to opencompass
* mtbench101 to opencompass
* for code review
* for code review
* for code review
* hook
* hook
---------
Co-authored-by: liujie <ljie@buaa.edu.cn>
|
2024-06-03 14:52:12 +08:00 |
|
Fengzhe Zhou
|
a77b8a5cec
|
[Sync] format (#1214)
|
2024-05-30 00:21:58 +08:00 |
|
bittersweet1999
|
e404b72c52
|
[Feature] support arenahard evaluation (#1096)
* support arenahard
* support arenahard
* support arenahard
|
2024-04-26 15:42:00 +08:00 |
|
bittersweet1999
|
2ee8e8a1a1
|
[Feature] add mtbench (#829)
* add mtbench
* add mtbench
* Update configs/datasets/subjective/multiround/mtbench_judgeby_gpt4.py
Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
* Update configs/datasets/subjective/multiround/mtbench_judgeby_gpt4.py
Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
* Update opencompass/datasets/subjective/__init__.py
Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
* Update opencompass/datasets/subjective/mtbench.py
Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
* fix mtbench
---------
Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
|
2024-01-24 12:11:47 +08:00 |
|
bittersweet1999
|
2d4da8dd02
|
[Feature] Add CompassArena (#828)
* add compass arena
* add compass_arena
* add compass arena
* Update opencompass/summarizers/subjective/compass_arena.py
Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
* Update opencompass/summarizers/subjective/__init__.py
Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
* Update opencompass/datasets/subjective/compass_arena.py
Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
* Update opencompass/datasets/subjective/__init__.py
Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
* Update configs/eval_subjective_compassarena.py
Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
* Update configs/datasets/subjective/compassarena/compassarena_compare.py
Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
* Update configs/eval_subjective_compassarena.py
Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
* Update configs/datasets/subjective/compassarena/compassarena_compare.py
Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
* fix check position bias
---------
Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
|
2024-01-23 15:12:46 +08:00 |
|
bittersweet1999
|
814b3f73bd
|
reorganize subject files (#801)
|
2024-01-16 18:03:11 +08:00 |
|