OpenCompass/opencompass/datasets/subjective/__init__.py

# flake8: noqa: F401, F403
from .alignbench import AlignmentBenchDataset  # noqa: F401, F403
from .alignbench import alignbench_postprocess  # noqa: F401, F403
from .alpacaeval import AlpacaEvalDataset  # noqa: F401, F403
from .alpacaeval import alpacaeval_postprocess  # noqa: F401, F403
from .arena_hard import ArenaHardDataset  # noqa: F401, F403
from .arena_hard import arenahard_postprocess  # noqa: F401, F403
from .compass_arena import CompassArenaDataset, compassarena_postprocess
from .compassbench import CompassBenchDataset  # noqa: F401, F403
from .compassbench_checklist import \
    CompassBenchCheklistDataset  # noqa: F401, F403
from .compassbench_control_length_bias import \
    CompassBenchControlLengthBiasDataset  # noqa: F401, F403
from .corev2 import Corev2Dataset  # noqa: F401, F403
from .creationbench import CreationBenchDataset  # noqa: F401, F403
from .flames import FlamesDataset  # noqa: F401, F403
from .fofo import FofoDataset, fofo_postprocess  # noqa: F401, F403
from .followbench import FollowBenchDataset  # noqa: F401, F403
from .followbench import followbench_postprocess
from .hellobench import *  # noqa: F401, F403
from .judgerbench import JudgerBenchDataset  # noqa: F401, F403
from .judgerbench import JudgerBenchEvaluator  # noqa: F401, F403
from .mtbench import MTBenchDataset, mtbench_postprocess  # noqa: F401, F403
from .mtbench101 import MTBench101Dataset  # noqa: F401, F403
from .mtbench101 import mtbench101_postprocess
from .multiround import MultiroundDataset  # noqa: F401, F403
from .subjective_cmp import SubjectiveCmpDataset  # noqa: F401, F403
from .wildbench import WildBenchDataset  # noqa: F401, F403
from .wildbench import wildbench_postprocess  # noqa: F401, F403