OpenCompass/opencompass/openicl/icl_evaluator
bittersweet1999 2ee8e8a1a1
[Feature] add mtbench (#829)
* add mtbench

* Update configs/datasets/subjective/multiround/mtbench_judgeby_gpt4.py

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>

* Update opencompass/datasets/subjective/__init__.py

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>

* Update opencompass/datasets/subjective/mtbench.py

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>

* fix mtbench

---------

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
2024-01-24 12:11:47 +08:00
Name                          Last commit                                                             Date
hf_metrics                    [Feature] Use local accuracy from hf implements (#416)                  2023-09-20 16:35:22 +08:00
__init__.py                   [Sync] Sync with internal codes 2023.01.08 (#777)                       2024-01-08 14:07:24 +00:00
icl_agent_evaluator.py        Support GSM8k evaluation with tools by Lagent and LangChain (#277)      2023-09-22 15:28:22 +08:00
icl_aucroc_evaluator.py       [Enhancement] Test linting in CI and fix existing linting errors (#69)  2023-07-17 15:59:10 +08:00
icl_base_evaluator.py         [Enhancement] Test linting in CI and fix existing linting errors (#69)  2023-07-17 15:59:10 +08:00
icl_circular_evaluator.py     [Sync] update configs (#734)                                            2023-12-25 21:59:16 +08:00
icl_em_evaluator.py           [Sync] update (#517)                                                    2023-10-27 20:31:22 +08:00
icl_hf_evaluator.py           [Sync] Sync with internal codes 2023.01.08 (#777)                       2024-01-08 14:07:24 +00:00
icl_jieba_rouge_evaluator.py  fix jieba rouge (#467)                                                  2023-10-12 10:25:19 +08:00
icl_misc_evaluator.py         [Sync] Sync with internal codes 2023.01.08 (#777)                       2024-01-08 14:07:24 +00:00
icl_toxic_evaluator.py        [Fix] Fix CI (#70)                                                      2023-07-17 19:10:59 +08:00
lm_evaluator.py               [Feature] add mtbench (#829)                                            2024-01-24 12:11:47 +08:00