Fengzhe Zhou
|
a32f21a356
|
[Sync] Sync with internal codes 2024.06.28 (#1279)
|
2024-06-28 14:16:34 +08:00 |
|
bittersweet1999
|
982e024540
|
[Feature] add dataset Fofo (#1224)
* add fofo dataset
* add dataset fofo
|
2024-06-06 11:40:48 +08:00 |
|
bittersweet1999
|
6ba1c4937d
|
[Feature] Support Math evaluation via judgemodel (#1094)
* support openai math evaluation
* support openai math evaluation
* support openai math evaluation
* support math llm judge
* support math llm judge
|
2024-04-26 14:56:23 +08:00 |
|
bittersweet1999
|
6f98c8d9ab
|
[Fix] Fix MultiRound Subjective Evaluation (#1043)
* fix multiround
* fix
|
2024-04-22 12:06:03 +08:00 |
|
Fengzhe Zhou
|
b39f501563
|
[Sync] update taco (#1030)
|
2024-04-09 17:50:23 +08:00 |
|
bittersweet1999
|
2d4e559763
|
[Feature] Add multi-model judge and fix some problems (#1016)
* support multi-model judge and moe judge
* test_moe
* test_moe
* test
* add moe judge
* support multi-judge-model
|
2024-04-02 11:52:06 +08:00 |
|
bittersweet1999
|
2ee8e8a1a1
|
[Feature] add mtbench (#829)
* add mtbench
* add mtbench
* Update configs/datasets/subjective/multiround/mtbench_judgeby_gpt4.py
Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
* Update configs/datasets/subjective/multiround/mtbench_judgeby_gpt4.py
Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
* Update opencompass/datasets/subjective/__init__.py
Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
* Update opencompass/datasets/subjective/mtbench.py
Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
* fix mtbench
---------
Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
|
2024-01-24 12:11:47 +08:00 |
|
bittersweet1999
|
1fe152b3e8
|
[Feature] Support AlignmentBench infer and judge (#697)
* alignmentbench infer and judge
* alignmentbench
* alignmentbench done
* alignment all done
* alignment all done
|
2023-12-13 19:59:30 +08:00 |
|
bittersweet1999
|
6130394165
|
[Feature] Add double order of subjective evaluation and removing duplicated response among two models (#692)
* add features
* add doc string
* add doc string
|
2023-12-12 20:58:17 +08:00 |
|
bittersweet1999
|
465308e430
|
[Feature] Add Subjective Evaluation (#680)
* new version of subject
* fixed draw
* fixed draw
* fixed draw
* done
* done
* done
* done
* fixed lint
|
2023-12-11 22:22:11 +08:00 |
|
Leymore
|
fbf5089c40
|
[Sync] update github token (#475)
|
2023-10-13 06:50:54 -05:00 |
|
Tong Gao
|
a1ea3c094a
|
[Sync] Initial support of subjective evaluation (#421)
Co-authored-by: Leymore <zfz-960727@163.com>
|
2023-09-22 15:42:31 +08:00 |
|