bittersweet1999
|
f0d436496e
|
[Update] update docs and add compassarena (#1614)
* fix pip version
* fix pip version
* update docs and add compassarena
* update docs
|
2024-10-17 14:39:06 +08:00 |
|
klein
|
65fad8e2ac
|
[Fix] minor update wildbench (#1335)
* update crb
* update crbbench
* update crbbench
* update crbbench
* minor update wildbench
* [Fix] Update doc of wildbench, and merge wildbench into subjective
* [Fix] Update doc of wildbench, and merge wildbench into subjective, fix crbbench
* Update crb.md
* Update crb_pair_judge.py
* Update crb_single_judge.py
* Update subjective_evaluation.md
* Update openai_api.py
* [Update] update wildbench readme
* [Update] update wildbench readme
* [Update] update wildbench readme, remove crb
* Delete configs/eval_subjective_wildbench_pair.py
* Delete configs/eval_subjective_wildbench_single.py
* Update __init__.py
---------
Co-authored-by: bittersweet1999 <148421775+bittersweet1999@users.noreply.github.com>
|
2024-07-26 11:19:04 +08:00 |
|
bittersweet1999
|
68ca48496b
|
[Refactor] Reorganize subjective eval (#1284)
* fix pip version
* fix pip version
* reorganize subjective eval
* reorg sub
* reorg subeval
* reorg subeval
* update subjective doc
* reorg subeval
* reorg subeval
|
2024-07-05 22:11:37 +08:00 |
|
bittersweet1999
|
e404b72c52
|
[Feature] support arenahard evaluation (#1096)
* support arenahard
* support arenahard
* support arenahard
|
2024-04-26 15:42:00 +08:00 |
|
bittersweet1999
|
02e7eec911
|
[Feature] Support AlpacaEval_V2 (#1006)
* support alpacaeval_v2
* support alpacaeval
* update docs
* update docs
|
2024-03-28 16:49:04 +08:00 |
|
bittersweet1999
|
848e7c8a76
|
[fix] add different temp for different question in mtbench (#954)
* add temp for mtbench
* add document for mtbench
* add document for mtbench
|
2024-03-11 17:24:39 +08:00 |
|
bittersweet1999
|
97c2068bd9
|
[Feature] Add JudgeLLMs (#710)
* add judgellms
* add judgellms
* add sub_size_partition
* add docs
* add ref
|
2023-12-19 18:40:25 +08:00 |
|
Songyang Zhang
|
637628a70f
|
[Doc] Update Doc for Alignbench (#707)
* update alignmentbench
* update alignmentbench
* update doc
* update
* update
|
2023-12-15 15:07:25 +08:00 |
|
bittersweet1999
|
6130394165
|
[Feature] Add double order of subjective evaluation and removing duplicated response among two models (#692)
* add features
* add doc string
* add doc string
|
2023-12-12 20:58:17 +08:00 |
|
bittersweet1999
|
465308e430
|
[Feature] Add Subjective Evaluation (#680)
* new version of subject
* fixed draw
* fixed draw
* fixed draw
* done
* done
* done
* done
* fixed lint
|
2023-12-11 22:22:11 +08:00 |
|
Wei Jueqi
|
b62842335d
|
[Doc] Update Subjective docs (#510)
* rename
* add en subdoc
* fix name
* fix writing
* update
---------
Co-authored-by: Leymore <zfz-960727@163.com>
|
2023-10-27 16:27:24 +08:00 |
|