bittersweet1999
|
68ca48496b
|
[Refactor] Reorganize subjective eval (#1284)
* fix pip version
* reorganize subjective eval
* reorg sub
* reorg subeval
* update subjective doc
* reorg subeval
|
2024-07-05 22:11:37 +08:00 |
|
Fengzhe Zhou
|
a32f21a356
|
[Sync] Sync with internal codes 2024.06.28 (#1279)
|
2024-06-28 14:16:34 +08:00 |
|
bittersweet1999
|
6ba1c4937d
|
[Feature] Support Math evaluation via judgemodel (#1094)
* support openai math evaluation
* support math llm judge
|
2024-04-26 14:56:23 +08:00 |
|
bittersweet1999
|
2d4e559763
|
[Feature] Add multi-model judge and fix some problems (#1016)
* support multi-model judge and moe judge
* test_moe
* test
* add moe judge
* support multi-judge-model
|
2024-04-02 11:52:06 +08:00 |
|
bittersweet1999
|
02e7eec911
|
[Feature] Support AlpacaEval_V2 (#1006)
* support alpacaeval_v2
* support alpacaeval
* update docs
|
2024-03-28 16:49:04 +08:00 |
|
bittersweet1999
|
c78a4df923
|
add support for set prediction path (#984)
|
2024-03-19 14:32:15 +08:00 |
|
bittersweet1999
|
2ee8e8a1a1
|
[Feature] add mtbench (#829)
* add mtbench
* Update configs/datasets/subjective/multiround/mtbench_judgeby_gpt4.py
Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
* Update configs/datasets/subjective/multiround/mtbench_judgeby_gpt4.py
Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
* Update opencompass/datasets/subjective/__init__.py
Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
* Update opencompass/datasets/subjective/mtbench.py
Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
* fix mtbench
---------
Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
|
2024-01-24 12:11:47 +08:00 |
|
bittersweet1999
|
3c606cb712
|
quick fix for postprocess pred extraction (#771)
|
2024-01-05 21:10:18 +08:00 |
|
bittersweet1999
|
db919f0191
|
[Fix] SubSizePartition fix (#746)
* fix subjective_eval
* subject_eval partition situation fixed
|
2023-12-28 11:46:46 +08:00 |
|
bittersweet1999
|
dfd9ac0fd9
|
[Feature] Add other judgelm prompts for Alignbench (#731)
* add judgellm prompts
* add judgelm prompts
* update import info
* fix situation that no abbr in config
* add summarizer for other judgellm
* change config name
* add maxlen
* dict assert
* fix strings
|
2023-12-27 17:54:53 +08:00 |
|
bittersweet1999
|
e985100cd1
|
[Fix] Fix subjective alignbench (#730)
|
2023-12-23 20:06:53 +08:00 |
|
bittersweet1999
|
fbb912ddf3
|
[Feature] Add abbr for judgemodel in subjective evaluation (#724)
* add_judgemodel_abbr
* add judgemodel abbr
|
2023-12-21 15:58:20 +08:00 |
|
bittersweet1999
|
465308e430
|
[Feature] Add Subjective Evaluation (#680)
* new version of subject
* fixed draw
* done
* fixed lint
|
2023-12-11 22:22:11 +08:00 |
|
Leymore
|
fbf5089c40
|
[Sync] update github token (#475)
|
2023-10-13 06:50:54 -05:00 |
|