OpenCompass/opencompass/summarizers/subjective
Linchen Xiao 8e55c9c6ee
[Update] Compassbench v1.3 (#1396)
* stash files

* compassbench subjective evaluation added

* evaluation update

* fix lint

* update docs

* Update lint

* changes saved

* changes saved

* CompassBench subjective summarizer added (#1349)

* subjective summarizer added

* fix lint

[Fix] Fix MathBench (#1351)

Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>

[Update] Update model support list (#1353)

* fix pip version

* fix pip version

* update model support

subjective summarizer updated

knowledge, math objective done (data need update)

remove secrets

objective changes saved

knowledge data added

* secrets removed

* changes added

* summarizer modified

* summarizer modified

* compassbench coding added

* fix lint

* objective summarizer updated

* compass_bench_v1.3 updated

* update files in config folder

* remove unused model

* lcbench modified

* removed model evaluation configs

* remove duplicated sdk implementation

---------

Co-authored-by: zhangsongyang <zhangsongyang@pjlab.org.cn>
2024-08-12 19:09:19 +08:00
__init__.py [Feature] Update CHARM Memorization (#1230) 2024-07-26 18:42:30 +08:00
alignmentbench.py [Fix] add bc for alignbench summarizer (#1306) 2024-07-12 11:06:20 +08:00
all_obj.py [Fix] Fix Math Evaluation with Judge Model Evaluator & Add README (#1103) 2024-04-28 21:58:58 +08:00
alpacaeval.py [Refactor] Reorganize subjective eval (#1284) 2024-07-05 22:11:37 +08:00
arenahard.py [Refactor] Reorganize subjective eval (#1284) 2024-07-05 22:11:37 +08:00
charm.py [Feature] Update CHARM Memorization (#1230) 2024-07-26 18:42:30 +08:00
compass_arena.py [Refactor] Reorganize subjective eval (#1284) 2024-07-05 22:11:37 +08:00
compassbench_v13.py [Update] Compassbench v1.3 (#1396) 2024-08-12 19:09:19 +08:00
compassbench.py [Refactor] Reorganize subjective eval (#1284) 2024-07-05 22:11:37 +08:00
corev2.py reorganize subject files (#801) 2024-01-16 18:03:11 +08:00
creationbench.py reorganize subject files (#801) 2024-01-16 18:03:11 +08:00
flames.py [Feature] add support for Flames datasets (#1093) 2024-04-28 18:56:24 +08:00
fofo.py [Refactor] Reorganize subjective eval (#1284) 2024-07-05 22:11:37 +08:00
mtbench101.py [Refactor] Reorganize subjective eval (#1284) 2024-07-05 22:11:37 +08:00
mtbench.py [Refactor] Reorganize subjective eval (#1284) 2024-07-05 22:11:37 +08:00
multiround.py [Fix] Fix MultiRound Subjective Evaluation(#1043) 2024-04-22 12:06:03 +08:00
subjective_post_process.py reorganize subject files (#801) 2024-01-16 18:03:11 +08:00
subjective.py Fix the summary error in subjective.py (#1363) 2024-07-25 18:36:13 +08:00
utils.py [Refactor] Reorganize subjective eval (#1284) 2024-07-05 22:11:37 +08:00
wildbench.py Support wildbench (#1266) 2024-06-24 13:16:27 +08:00