[Feature] Add huggingface apply_chat_template (#1098)

* add TheoremQA with 5-shot
* add huggingface_above_v4_33 classes
* use num_worker partitioner in cli
* update theoremqa
* update TheoremQA
* add TheoremQA
* rename theoremqa -> TheoremQA
* update TheoremQA output path
* rewrite many model configs
* update huggingface
* further update
* refine configs
* update configs
* update configs
* add configs/eval_llama3_instruct.py
* add summarizer multi faceted
* update bbh datasets
* update configs/models/hf_llama/lmdeploy_llama3_8b_instruct.py
* rename class
* update readme
* update hf above v4.33
2025-05-30 16:03:24 +08:00 · 2024-05-14 14:50:16 +08:00
commit 7505b3cadf
parent 6c711cb262
186 changed files with 1947 additions and 2910 deletions
--- a/README.md
+++ b/README.md
@@ -162,20 +162,11 @@ python tools/list_configs.py llama mmlu
 You can also evaluate other HuggingFace models via command line. Taking LLaMA-7b as an example:

 ```bash
-python run.py --datasets ceval_ppl mmlu_ppl \
--hf-path huggyllama/llama-7b \  # HuggingFace model path
--model-kwargs device_map='auto' \  # Arguments for model construction
--tokenizer-kwargs padding_side='left' truncation='left' use_fast=False \  # Arguments for tokenizer construction
--max-out-len 100 \  # Maximum number of tokens generated
--max-seq-len 2048 \  # Maximum sequence length the model can accept
--batch-size 8 \  # Batch size
--no-batch-padding \  # Don't enable batch padding, infer through for loop to avoid performance loss
--num-gpus 1  # Number of minimum required GPUs
+python run.py --datasets ceval_ppl mmlu_ppl --hf-type base --hf-path huggyllama/llama-7b
 ```

 > \[!TIP\]
 >
-> To run the command above, you will need to remove the comments starting from `# ` first.
 > configuration with `_ppl` is designed for base model typically.
 > configuration with `_gen` can be used for both base model and chat model.

--- a/README_zh-CN.md
+++ b/README_zh-CN.md
@@ -163,20 +163,9 @@ python tools/list_configs.py llama mmlu
 You can also evaluate other HuggingFace models from the command line. Again taking LLaMA-7b as an example:

 ```bash
-python run.py --datasets ceval_ppl mmlu_ppl \
--hf-path huggyllama/llama-7b \  # HuggingFace model path
--model-kwargs device_map='auto' \  # Arguments for constructing the model
--tokenizer-kwargs padding_side='left' truncation='left' use_fast=False \  # Arguments for constructing the tokenizer
--max-out-len 100 \  # Maximum number of generated tokens
--max-seq-len 2048 \  # Maximum sequence length the model can accept
--batch-size 8 \  # Batch size
--no-batch-padding \  # Disable batch padding and infer in a for loop to avoid accuracy loss
--num-gpus 1  # Minimum number of GPUs required to run the model
+python run.py --datasets ceval_ppl mmlu_ppl --hf-type base --hf-path huggyllama/llama-7b
 ```

-> **Note**<br />
-> To run the command above, you need to remove all comments starting with `# `.
-
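 Through the command line or configuration files, OpenCompass also supports evaluating APIs or custom models, as well as more diversified evaluation strategies. Please read the [Quick Start](https://opencompass.readthedocs.io/zh_CN/latest/get_started/quick_start.html) to learn how to run an evaluation task.

 For more tutorials, please check our [Documentation](https://opencompass.readthedocs.io/zh_CN/latest/index.html).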
--- a/configs/dataset_collections/chat_OC15.py
+++ b/configs/dataset_collections/chat_OC15.py
@@ -0,0 +1,22 @@
+from mmengine.config import read_base
+
+with read_base():
+    from ..datasets.mmlu.mmlu_gen_4d595a import mmlu_datasets
+    from ..datasets.cmmlu.cmmlu_gen_c13365 import cmmlu_datasets
+    from ..datasets.ceval.ceval_gen_5f30c7 import ceval_datasets
+    from ..datasets.GaokaoBench.GaokaoBench_no_subjective_gen_4c31db import GaokaoBench_datasets
+    from ..datasets.triviaqa.triviaqa_wiki_1shot_gen_eaf81e import triviaqa_datasets
+    from ..datasets.nq.nq_open_1shot_gen_01cf41 import nq_datasets
+    from ..datasets.race.race_gen_69ee4f import race_datasets
+    from ..datasets.winogrande.winogrande_5shot_gen_b36770 import winogrande_datasets
+    from ..datasets.hellaswag.hellaswag_10shot_gen_e42710 import hellaswag_datasets
+    from ..datasets.bbh.bbh_gen_2879b0 import bbh_datasets
+    from ..datasets.gsm8k.gsm8k_gen_1d7fe4 import gsm8k_datasets
+    from ..datasets.math.math_0shot_gen_393424 import math_datasets
+    from ..datasets.TheoremQA.TheoremQA_5shot_gen_6f0af8 import TheoremQA_datasets
+    from ..datasets.humaneval.humaneval_gen_8e312c import humaneval_datasets
+    from ..datasets.mbpp.sanitized_mbpp_gen_830460 import sanitized_mbpp_datasets
+    from ..datasets.gpqa.gpqa_gen_4baadb import gpqa_datasets
+    from ..datasets.IFEval.IFEval_gen_3321a3 import ifeval_datasets
+
+datasets = sum((v for k, v in locals().items() if k.endswith("_datasets")), [])
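
Review note: the last line of `chat_OC15.py` flattens every imported `*_datasets` list into a single `datasets` list. A minimal sketch of the same idiom outside a config file, using a plain dict in place of `locals()`. The dataset entries and abbreviations below are made up for illustration and are not taken from the repository.

```python
# Stand-in for the module namespace that `locals()` returns inside the config.
# All entries are hypothetical placeholders, not real dataset configs.
namespace = {
    "mmlu_datasets": [dict(abbr="mmlu-a"), dict(abbr="mmlu-b")],
    "gsm8k_datasets": [dict(abbr="gsm8k")],
    "read_base": object(),  # unrelated names are skipped by the suffix filter
}

# sum(..., []) concatenates the selected lists, starting from an empty list.
datasets = sum((v for k, v in namespace.items() if k.endswith("_datasets")), [])

print([d["abbr"] for d in datasets])
```

Because the filter relies only on the `_datasets` suffix, any other variable in the config with that suffix would also be merged, so helper names are best kept free of it.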
--- a/configs/datasets/TheoremQA/TheoremQA_5shot_gen_a4f581.py
+++ b/configs/datasets/TheoremQA/TheoremQA_5shot_gen_a4f581.py
@@ -1,46 +0,0 @@
-from opencompass.openicl.icl_prompt_template import PromptTemplate
-from opencompass.openicl.icl_retriever import ZeroRetriever
-from opencompass.openicl.icl_inferencer import GenInferencer
-from opencompass.datasets import HFDataset, TheoremQA_postprocess_v3, TheoremQAEvaluatorV3
-
-TheoremQA_reader_cfg = dict(input_columns=["Question", "Answer_type"], output_column="Answer", train_split="test", test_split="test")
-
-TheoremQA_infer_cfg = dict(
-    prompt_template=dict(
-        type=PromptTemplate,
-        template=dict(
-            round=[
-                dict(role='HUMAN', prompt='You are supposed to provide a solution to a given problem.\n\n\nProblem:\nIn a 10 Gigabit Ethernet network, the average size of a frame is 1500 bytes. If a burst of noise lasting 1ms interrupts the network, how many frames are lost?'),
-                dict(role='BOT', prompt='Solution:\nFirst, calculate the data rate in bytes/s:\n\n10 Gigabit/s * (1 Byte / 8 bits) = 1.25 * 10^9 Bytes/s\n\nNext, calculate the data loss in bytes due to the noise:\n\n1 ms * 1.25 * 10^9 Bytes/s = 1.25 * 10^6 Bytes\n\nFinally, divide the data loss by the average frame size to get the number of frames lost:\n\n1.25 * 10^6 Bytes / 1500 Bytes/frame ≈ 833.33 frames\nThe answer is 833.33'),
-                dict(role='HUMAN', prompt='\nProblem:\nGiven x = 0.157, what is the value of x \\times \\frac{\\prod_{n=1}^\\infty (1 - \\frac{x^2}{n^2 \\pi^2})}{\\sin(x)}?'),
-                dict(role='BOT', prompt="Solution:\nTo evaluate the expression $x \\times \\frac{\\prod_{n=1}^{\\infty} (1 - \\frac{x^2}{n^2 \\pi^2})}{\\sin(x)}$ given x = 0.157, we first recognize that the product in the numerator is related to the sine function through the Euler's reflection formula for the sine function, which can be expressed as:\n\n$$\\sin(x) = x \\prod_{n=1}^{\\infty} \\left(1 - \\frac{x^2}{n^2 \\pi^2}\\right)$$\n\nTherefore, the given expression simplifies to: $x \\times \\frac{\\sin(x)}{\\sin(x)}$\n\nBecause sin(x) in the numerator and denominator cancels out, the expression simplifies further to just x.\n\nSo, given x = 0.157, the value of the expression is 0.157. This result is derived from the properties of the sine function and does not require computational evaluation.\nThe answer is 0.157"),
-                dict(role='HUMAN', prompt='\nProblem:\nConsider the basis C of \\mathbb{R}^2 consisting of vectors u_1 = [2, 4] and u_2 = [1, -1]. If y = [8, 12], find the C-coordinate vector of y.'),
-                dict(role='BOT', prompt="Solution:\nThe goal is to express y as a linear comPbination of the basis vectors of C, i.e., $y = a\\cdot u_1 + b\\cdot u_2$, where a and b are the scalar coefficients that we want to find. These coefficients will form the C-coordinate vector of y, which we'll denote as $[a, b]_C$.\n\nGiven:\n- $u_1 = [2, 4]$,\n- $u_2 = [1, -1]$,\n- $y = [8, 12]$.\n\nWe need to solve the system of linear equations:\n2a + 1b = 8\n4a - 1b = 12\n\nLet's solve this system of equations to find a and b.\n\nThe solution to the system of equations is $a = \\frac{10}{3} and b = \\frac{4}{3}$. Therefore, the C-coordinate vector of y in the basis consisting of vectors u_1 = [2, 4] and u_2 = [1, -1] is $\\left[\\frac{10}{3}, \\frac{4}{3}\\right]_C$. \nLet's calculate the numerical value of $\\left[\x0crac{10}{3}, \x0crac{4}{3}\right]_C$ as [3.33, 1.33].\nThe answer is [3.33, 1.33]"),
-                dict(role='HUMAN', prompt='\nProblem:\nOne can draw a simple, connected planar graph with 200 vertices and 397 edges. Is this statement Trur or False?'),
-                dict(role='BOT', prompt="Solution:\nTo determine the answer, we can use Euler's formula for planar graphs, which states that for any finite, connected, planar graph, $V - E + F = 2$, where V is the number of vertices, E is the number of edges, and F is the number of faces.\n\nGiven the modified question, we have V = 200 vertices and E = 397 edges. We want to find if we can have a graph that satisfies these conditions, adhering to Euler's formula.\n\nFirst, let's rearrange Euler's formula to solve for F:  F = E - V + 2\n\nSubstituting the given values: F = 397 - 200 + 2,  F = 199\n\nThis means a graph with 200 vertices and 397 edges would have 199 faces. However, to determine the truth of this possibility, we should check if this graph doesn't violate any other planar graph constraints, particularly regarding the number of edges.\n\nFor a simple, connected planar graph, there's also a relationship between vertices, edges, and faces given by the inequality: $E \\leq 3V - 6$\n\nSubstituting V = 200 gives: $E \\leq 3*200 - 6 = 594$\n\nWith E = 397, the condition $E \\leq 594$ is satisfied, meaning it's theoretically possible in terms of the edge condition for a planar graph.\n\nTherefore, one can draw a simple, connected planar graph with 200 vertices and 397 edges, resulting in 199 faces, without violating the conditions for it to be planar according to both Euler's formula and the constraint on the maximum number of edges.\nThe answer is True"),
-                dict(role='HUMAN', prompt='\nProblem:\nGiven a finite group G, and a collection of permutations H on a set. Then (a) there always exists H such that G is isomorphic to H; (b) for any H, G is isomorphic to H; (c) G can never be isomorphic to H; (d) none of the above. Which option is correct?'),
-                dict(role='BOT', prompt="Solution:\nThis is based on Cayley's theorem, which states that every group G is isomorphic to a subgroup of the symmetric group acting on G. \nIn other words, for every finite group G, there exists a collection of permutations H (which in this context, can be thought of as the set of permutations representing the action of G on itself) such that G is isomorphic to H.\n\nTherefore, there always exists H such that G is isomorphic to H.\nThe answer is (a)"),
-                dict(role='HUMAN', prompt='\nProblem:\n{Question}'),
-                dict(role='BOT', prompt='Solution:\n{Answer}'),
-            ]
-        ),
-    ),
-    retriever=dict(type=ZeroRetriever),
-    inferencer=dict(type=GenInferencer, max_out_len=1024, stopping_criteria=["USER:", "ASSISTANT:",  "### Instruction:", "Response:", "<start_of_turn>", "[INST]", "Problem:"]),
-)
-
-TheoremQA_eval_cfg = dict(
-    evaluator=dict(type=TheoremQAEvaluatorV3),
-    pred_postprocessor=dict(type=TheoremQA_postprocess_v3)
-)
-
-TheoremQA_datasets = [
-    dict(
-        abbr="TheoremQA",
-        type=HFDataset,
-        path="TIGER-Lab/TheoremQA",
-        reader_cfg=TheoremQA_reader_cfg,
-        infer_cfg=TheoremQA_infer_cfg,
-        eval_cfg=TheoremQA_eval_cfg,
-    )
-]
--- a/configs/datasets/bbh/bbh_gen_2879b0.py
+++ b/configs/datasets/bbh/bbh_gen_2879b0.py
@@ -0,0 +1,56 @@
+import os
+from mmengine.config import read_base
+from opencompass.openicl.icl_prompt_template import PromptTemplate
+from opencompass.openicl.icl_retriever import ZeroRetriever
+from opencompass.openicl.icl_inferencer import GenInferencer
+from opencompass.datasets import BBHDataset, bbh_mcq_postprocess, BBHEvaluator, BBHEvaluator_mcq
+
+with read_base():
+    from .bbh_subset_settings import settings
+
+bbh_datasets = []
+for name, test_type in settings:
+    with open(os.path.join(os.path.dirname(__file__), 'lib_prompt', f'{name}.txt'), 'r') as f:
+        hint = f.read()
+
+    task_prompt, body = hint.split('\n\nQ:', 1)
+    sections = ('Q:' + body).split('\n\n')
+    prompt_rounds = []
+    for index, section in enumerate(sections):
+        question, answer = section.split('\nA:')
+        answer = 'A:' + answer
+        if index == 0:
+            desc = task_prompt.strip() + '\n'
+        else:
+            desc = ''
+        prompt_rounds.append(dict(role="HUMAN", prompt=f"{desc}{question.strip()}"))
+        prompt_rounds.append(dict(role="BOT", prompt=answer.strip()))
+    prompt_rounds.append(dict(role="HUMAN", prompt="Q: {input}"))
+
+    bbh_reader_cfg = dict(input_columns=["input"], output_column="target")
+
+    bbh_infer_cfg = dict(
+        prompt_template=dict(type=PromptTemplate, template=dict(round=prompt_rounds)),
+        retriever=dict(type=ZeroRetriever),
+        inferencer=dict(type=GenInferencer, max_out_len=512))
+
+    if test_type == 'mcq':
+        bbh_eval_cfg = dict(
+            evaluator=dict(type=BBHEvaluator_mcq),
+            pred_role="BOT",
+            pred_postprocessor=dict(type=bbh_mcq_postprocess),
+            dataset_postprocessor=dict(type=bbh_mcq_postprocess))
+    else:
+        bbh_eval_cfg = dict(
+            evaluator=dict(type=BBHEvaluator),
+            pred_role="BOT")
+
+    bbh_datasets.append(
+        dict(
+            type=BBHDataset,
+            path="./data/BBH/data",
+            name=name,
+            abbr='bbh-' + name,
+            reader_cfg=bbh_reader_cfg.copy(),
+            infer_cfg=bbh_infer_cfg.copy(),
+            eval_cfg=bbh_eval_cfg.copy()))
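
Review note: the loop in `bbh_gen_2879b0.py` turns each `lib_prompt/*.txt` hint into alternating HUMAN/BOT rounds. The sketch below replays the same string splitting on an inline hint, so it runs without the repository files. The hint text is invented for illustration and only assumes the "task description, blank line, then `Q:`/`A:` pairs separated by blank lines" layout that the config's splitting implies.

```python
# Hypothetical few-shot hint; the wording is illustrative, not a real BBH prompt file.
hint = (
    "Evaluate the boolean expression.\n\n"
    "Q: not True is\nA: Let's think step by step. The answer is False.\n\n"
    "Q: True and False is\nA: Let's think step by step. The answer is False."
)

# Split the task description from the first question, then split the examples.
task_prompt, body = hint.split("\n\nQ:", 1)
sections = ("Q:" + body).split("\n\n")

prompt_rounds = []
for index, section in enumerate(sections):
    question, answer = section.split("\nA:")
    answer = "A:" + answer
    # Only the first HUMAN turn carries the task description.
    desc = task_prompt.strip() + "\n" if index == 0 else ""
    prompt_rounds.append(dict(role="HUMAN", prompt=f"{desc}{question.strip()}"))
    prompt_rounds.append(dict(role="BOT", prompt=answer.strip()))
# The final HUMAN turn holds the placeholder for the actual test question.
prompt_rounds.append(dict(role="HUMAN", prompt="Q: {input}"))

for r in prompt_rounds:
    print(r["role"], repr(r["prompt"]))
```

One consequence of this parsing: an example answer that itself contains a blank line would produce a section without `\nA:`, and the two-value unpacking would raise `ValueError`.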
--- a/configs/datasets/bbh/bbh_subset_settings.py
+++ b/configs/datasets/bbh/bbh_subset_settings.py
@@ -0,0 +1,29 @@
+settings = [
+    ('temporal_sequences', 'mcq'),
+    ('disambiguation_qa', 'mcq'),
+    ('date_understanding', 'mcq'),
+    ('tracking_shuffled_objects_three_objects', 'mcq'),
+    ('penguins_in_a_table', 'mcq'),
+    ('geometric_shapes', 'mcq'),
+    ('snarks', 'mcq'),
+    ('ruin_names', 'mcq'),
+    ('tracking_shuffled_objects_seven_objects', 'mcq'),
+    ('tracking_shuffled_objects_five_objects', 'mcq'),
+    ('logical_deduction_three_objects', 'mcq'),
+    ('hyperbaton', 'mcq'),
+    ('logical_deduction_five_objects', 'mcq'),
+    ('logical_deduction_seven_objects', 'mcq'),
+    ('movie_recommendation', 'mcq'),
+    ('salient_translation_error_detection', 'mcq'),
+    ('reasoning_about_colored_objects', 'mcq'),
+    ('multistep_arithmetic_two', 'free_form'),
+    ('navigate', 'free_form'),
+    ('dyck_languages', 'free_form'),
+    ('word_sorting', 'free_form'),
+    ('sports_understanding', 'free_form'),
+    ('boolean_expressions', 'free_form'),
+    ('object_counting', 'free_form'),
+    ('formal_fallacies', 'free_form'),
+    ('causal_judgement', 'free_form'),
+    ('web_of_lies', 'free_form'),
+]
--- a/configs/datasets/collections/chat_medium.py
+++ b/configs/datasets/collections/chat_medium.py
@@ -47,7 +47,7 @@ with read_base():
    from ..piqa.piqa_gen_1194eb import piqa_datasets
    from ..siqa.siqa_gen_e78df3 import siqa_datasets
    from ..strategyqa.strategyqa_gen_1180a7 import strategyqa_datasets
-    from ..winogrande.winogrande_gen_a9ede5 import winogrande_datasets
+    from ..winogrande.deprecated_winogrande_gen_a9ede5 import winogrande_datasets
    from ..obqa.obqa_gen_9069e4 import obqa_datasets
    from ..nq.nq_gen_c788f6 import nq_datasets
    from ..triviaqa.triviaqa_gen_2121ce import triviaqa_datasets
--- a/configs/datasets/collections/chat_small.py
+++ b/configs/datasets/collections/chat_small.py
@@ -31,7 +31,7 @@ with read_base():
    from ..summedits.summedits_gen_315438 import summedits_datasets
    from ..hellaswag.hellaswag_gen_6faab5 import hellaswag_datasets
    from ..piqa.piqa_gen_1194eb import piqa_datasets
-    from ..winogrande.winogrande_gen_a9ede5 import winogrande_datasets
+    from ..winogrande.deprecated_winogrande_gen_a9ede5 import winogrande_datasets
    from ..obqa.obqa_gen_9069e4 import obqa_datasets
    from ..nq.nq_gen_c788f6 import nq_datasets
    from ..triviaqa.triviaqa_gen_2121ce import triviaqa_datasets
--- a/configs/datasets/winogrande/deprecated_winogrande_gen_a9ede5.py
+++ b/configs/datasets/winogrande/deprecated_winogrande_gen_a9ede5.py
--- a/configs/datasets/winogrande/winogrande_5shot_gen_b36770.py
+++ b/configs/datasets/winogrande/winogrande_5shot_gen_b36770.py
@@ -0,0 +1,46 @@
+from opencompass.openicl.icl_prompt_template import PromptTemplate
+from opencompass.openicl.icl_retriever import FixKRetriever
+from opencompass.openicl.icl_inferencer import GenInferencer
+from opencompass.openicl.icl_evaluator import AccEvaluator
+from opencompass.datasets import winograndeDataset_V3
+from opencompass.utils.text_postprocessors import first_option_postprocess
+
+winogrande_reader_cfg = dict(
+    input_columns=["prompt", "only_option1", "only_option2"],
+    output_column="answer",
+    train_split="train_xs",
+    test_split="dev",
+)
+
+winogrande_infer_cfg = dict(
+    ice_template=dict(
+        type=PromptTemplate,
+        template=dict(
+            begin="</E>",
+            round=[
+                dict(role="HUMAN", prompt="Question: {prompt}\nA. {only_option1}\nB. {only_option2}\nAnswer:"),
+                dict(role="BOT", prompt="{answer}"),
+            ]
+        ),
+        ice_token="</E>",
+    ),
+    retriever=dict(type=FixKRetriever, fix_id_list=[0, 2, 4, 6, 8]),
+    inferencer=dict(type=GenInferencer),
+)
+
+winogrande_eval_cfg = dict(
+    evaluator=dict(type=AccEvaluator),
+    pred_role="BOT",
+    pred_postprocessor=dict(type=first_option_postprocess, options="AB"),
+)
+
+winogrande_datasets = [
+    dict(
+        abbr="winogrande",
+        type=winograndeDataset_V3,
+        path="./data/winogrande",
+        reader_cfg=winogrande_reader_cfg,
+        infer_cfg=winogrande_infer_cfg,
+        eval_cfg=winogrande_eval_cfg,
+    )
+]
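
Review note: the 5-shot config pairs an `ice_template` with `FixKRetriever, fix_id_list=[0, 2, 4, 6, 8]`. The sketch below shows one way to picture the resulting dialogue, assuming the retriever simply picks the training examples at those fixed indices for every test item. It is a plain-Python illustration, not OpenCompass code; `train` and `build_rounds` are hypothetical names and the example rows are invented.

```python
# Hypothetical stand-in for the `train_xs` split; field names mirror reader_cfg.
train = [
    dict(
        prompt=f"sentence {i} with _",
        only_option1=f"x{i}",
        only_option2=f"y{i}",
        answer="A" if i % 2 == 0 else "B",
    )
    for i in range(10)
]

fix_id_list = [0, 2, 4, 6, 8]
template = "Question: {prompt}\nA. {only_option1}\nB. {only_option2}\nAnswer:"


def build_rounds(test_item):
    """Assemble five fixed in-context examples followed by the test question."""
    rounds = []
    for idx in fix_id_list:
        example = train[idx]
        # str.format ignores the unused `answer` key in the example dict.
        rounds.append(dict(role="HUMAN", prompt=template.format(**example)))
        rounds.append(dict(role="BOT", prompt=example["answer"]))
    rounds.append(dict(role="HUMAN", prompt=template.format(**test_item)))
    return rounds


test_item = dict(prompt="the test sentence with _", only_option1="p", only_option2="q")
rounds = build_rounds(test_item)
print(len(rounds))
```

Using the same five examples for every test item keeps the few-shot context identical across the whole dev split, which makes scores comparable between runs.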
--- a/configs/datasets/winogrande/winogrande_gen.py
+++ b/configs/datasets/winogrande/winogrande_gen.py
@@ -1,4 +1,4 @@
 from mmengine.config import read_base

 with read_base():
-    from .winogrande_gen_a9ede5 import winogrande_datasets  # noqa: F401, F403
+    from .winogrande_gen_458220 import winogrande_datasets  # noqa: F401, F403
--- a/configs/datasets/winogrande/winogrande_gen_458220.py
+++ b/configs/datasets/winogrande/winogrande_gen_458220.py
@@ -0,0 +1,41 @@
+from opencompass.openicl.icl_prompt_template import PromptTemplate
+from opencompass.openicl.icl_retriever import ZeroRetriever
+from opencompass.openicl.icl_inferencer import GenInferencer
+from opencompass.openicl.icl_evaluator import AccEvaluator
+from opencompass.datasets import winograndeDataset_V2
+from opencompass.utils.text_postprocessors import first_option_postprocess
+
+winogrande_reader_cfg = dict(
+    input_columns=["prompt", "only_option1", "only_option2"],
+    output_column="answer",
+)
+
+winogrande_infer_cfg = dict(
+    prompt_template=dict(
+        type=PromptTemplate,
+        template=dict(
+            round=[
+                dict(role="HUMAN", prompt="Question: {prompt}\nA. {only_option1}\nB. {only_option2}\nAnswer:"),
+            ]
+        ),
+    ),
+    retriever=dict(type=ZeroRetriever),
+    inferencer=dict(type=GenInferencer),
+)
+
+winogrande_eval_cfg = dict(
+    evaluator=dict(type=AccEvaluator),
+    pred_role="BOT",
+    pred_postprocessor=dict(type=first_option_postprocess, options='AB'),
+)
+
+winogrande_datasets = [
+    dict(
+        abbr="winogrande",
+        type=winograndeDataset_V2,
+        path='./data/winogrande',
+        reader_cfg=winogrande_reader_cfg,
+        infer_cfg=winogrande_infer_cfg,
+        eval_cfg=winogrande_eval_cfg,
+    )
+]
--- a/configs/eval_llama3_instruct.py
+++ b/configs/eval_llama3_instruct.py
@@ -0,0 +1,52 @@
+from mmengine.config import read_base
+
+with read_base():
+    from .dataset_collections.chat_OC15 import datasets
+
+    from .models.hf_llama.hf_llama3_8b_instruct import models as hf_llama3_8b_instruct_model
+
+    from .summarizers.chat_OC15 import summarizer
+
+
+work_dir = 'outputs/debug/llama3-instruct'
+
+models = sum([v for k, v in locals().items() if k.endswith("_model")], [])
+
+# dataset               version    metric                        mode    llama-3-8b-instruct-hf
+# --------------------  ---------  ----------------------------  ------  ------------------------
+# average               -          naive_average                 gen     55.64
+# mmlu                  -          naive_average                 gen     68.30
+# cmmlu                 -          naive_average                 gen     53.29
+# ceval                 -          naive_average                 gen     52.32
+# GaokaoBench           -          weighted_average              gen     45.91
+# triviaqa_wiki_1shot   eaf81e     score                         gen     79.01
+# nq_open_1shot         01cf41     score                         gen     30.25
+# race-high             9a54b6     accuracy                      gen     81.22
+# winogrande            b36770     accuracy                      gen     66.46
+# hellaswag             e42710     accuracy                      gen     74.33
+# bbh                   -          naive_average                 gen     67.25
+# gsm8k                 1d7fe4     accuracy                      gen     79.08
+# math                  393424     accuracy                      gen     27.78
+# TheoremQA             6f0af8     score                         gen     19.50
+# openai_humaneval      8e312c     humaneval_pass@1              gen     55.49
+# sanitized_mbpp        830460     score                         gen     66.54
+# GPQA_diamond          4baadb     accuracy                      gen     25.76
+# IFEval                3321a3     Prompt-level-strict-accuracy  gen     67.84
+#                       -          -                             -       -
+# mmlu                  -          naive_average                 gen     68.30
+# mmlu-stem             -          naive_average                 gen     57.92
+# mmlu-social-science   -          naive_average                 gen     77.83
+# mmlu-humanities       -          naive_average                 gen     71.20
+# mmlu-other            -          naive_average                 gen     71.79
+# cmmlu                 -          naive_average                 gen     53.29
+# cmmlu-stem            -          naive_average                 gen     45.40
+# cmmlu-social-science  -          naive_average                 gen     54.63
+# cmmlu-humanities      -          naive_average                 gen     54.14
+# cmmlu-other           -          naive_average                 gen     59.52
+# cmmlu-china-specific  -          naive_average                 gen     49.33
+# ceval                 -          naive_average                 gen     52.32
+# ceval-stem            -          naive_average                 gen     48.16
+# ceval-social-science  -          naive_average                 gen     57.50
+# ceval-humanities      -          naive_average                 gen     53.26
+# ceval-other           -          naive_average                 gen     54.26
+# ceval-hard            -          naive_average                 gen     35.59
--- a/configs/models/aquila/hf_aquila2_34b.py
+++ b/configs/models/aquila/hf_aquila2_34b.py
@@ -1,24 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
+from opencompass.models import HuggingFaceBaseModel

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFaceBaseModel,
        abbr='aquila2-34b-hf',
-        path="BAAI/Aquila2-34B",
-        tokenizer_path='BAAI/Aquila2-34B',
-        model_kwargs=dict(
-            device_map='auto',
-            trust_remote_code=True,
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-            use_fast=False,
-        ),
-        max_out_len=100,
-        max_seq_len=2048,
+        path='BAAI/Aquila2-34B',
+        max_out_len=1024,
        batch_size=8,
-        run_cfg=dict(num_gpus=2, num_procs=1),
+        run_cfg=dict(num_gpus=2),
    )
 ]
--- a/configs/models/aquila/hf_aquila2_7b.py
+++ b/configs/models/aquila/hf_aquila2_7b.py
@@ -1,24 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
+from opencompass.models import HuggingFaceBaseModel

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFaceBaseModel,
        abbr='aquila2-7b-hf',
-        path="BAAI/Aquila2-7B",
-        tokenizer_path='BAAI/Aquila2-7B',
-        model_kwargs=dict(
-            device_map='auto',
-            trust_remote_code=True,
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-            use_fast=False,
-        ),
-        max_out_len=100,
-        max_seq_len=2048,
+        path='BAAI/Aquila2-7B',
+        max_out_len=1024,
        batch_size=8,
-        run_cfg=dict(num_gpus=1, num_procs=1),
+        run_cfg=dict(num_gpus=1),
    )
 ]
--- a/configs/models/aquila/hf_aquilachat2_34b.py
+++ b/configs/models/aquila/hf_aquilachat2_34b.py
@@ -5,7 +5,6 @@ _meta_template = dict(
        dict(role='HUMAN', begin='### Human: ', end='\n'),
        dict(role='BOT', begin='### Assistant: ', end='</s>', generate=True),
    ],
-    eos_token_id=100007,
 )

 models = [
--- a/configs/models/aquila/hf_aquilachat2_34b_16k.py
+++ b/configs/models/aquila/hf_aquilachat2_34b_16k.py
@@ -6,7 +6,6 @@ _meta_template = dict(
        dict(role='HUMAN', begin='Human: ', end='###'),
        dict(role='BOT', begin='Assistant: ', end='</s>', generate=True),
    ],
-    eos_token_id=100007,
 )

 models = [
--- a/configs/models/aquila/hf_aquilachat2_7b.py
+++ b/configs/models/aquila/hf_aquilachat2_7b.py
@@ -5,7 +5,6 @@ _meta_template = dict(
        dict(role='HUMAN', begin='<|startofpiece|>', end=''),
        dict(role='BOT', begin='<|endofpiece|>', end='</s>', generate=True),
    ],
-    eos_token_id=2,
 )

 models = [
--- a/configs/models/aquila/hf_aquilachat2_7b_16k.py
+++ b/configs/models/aquila/hf_aquilachat2_7b_16k.py
@@ -6,7 +6,6 @@ _meta_template = dict(
        dict(role='HUMAN', begin='Human: ', end='###'),
        dict(role='BOT', begin='Assistant: ', end='</s>', generate=True),
    ],
-    eos_token_id=100007,
 )

 models = [
--- a/configs/models/baichuan/hf_baichuan2_13b_base.py
+++ b/configs/models/baichuan/hf_baichuan2_13b_base.py
@@ -1,21 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
+from opencompass.models import HuggingFaceBaseModel

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFaceBaseModel,
        abbr='baichuan2-13b-base-hf',
-        path="baichuan-inc/Baichuan2-13B-Base",
-        tokenizer_path='baichuan-inc/Baichuan2-13B-Base',
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-            use_fast=False,
-        ),
-        max_out_len=100,
-        max_seq_len=2048,
+        path='baichuan-inc/Baichuan2-13B-Base',
+        max_out_len=1024,
        batch_size=8,
-        model_kwargs=dict(device_map='auto', trust_remote_code=True),
-        run_cfg=dict(num_gpus=2, num_procs=1),
+        run_cfg=dict(num_gpus=1),
    )
 ]
--- a/configs/models/baichuan/hf_baichuan2_7b_base.py
+++ b/configs/models/baichuan/hf_baichuan2_7b_base.py
@@ -1,21 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
+from opencompass.models import HuggingFaceBaseModel

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFaceBaseModel,
        abbr='baichuan2-7b-base-hf',
-        path="baichuan-inc/Baichuan2-7B-Base",
-        tokenizer_path='baichuan-inc/Baichuan2-7B-Base',
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-            use_fast=False,
-        ),
-        max_out_len=100,
-        max_seq_len=2048,
+        path='baichuan-inc/Baichuan2-7B-Base',
+        max_out_len=1024,
        batch_size=8,
-        model_kwargs=dict(device_map='auto', trust_remote_code=True),
-        run_cfg=dict(num_gpus=1, num_procs=1),
+        run_cfg=dict(num_gpus=1),
    )
 ]
--- a/configs/models/bluelm/hf_bluelm_7b_base.py
+++ b/configs/models/bluelm/hf_bluelm_7b_base.py
@@ -1,24 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
+from opencompass.models import HuggingFaceBaseModel

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFaceBaseModel,
        abbr='bluelm-7b-base-hf',
-        path="vivo-ai/BlueLM-7B-Base",
-        tokenizer_path='vivo-ai/BlueLM-7B-Base',
-        model_kwargs=dict(
-            device_map='auto',
-            trust_remote_code=True,
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-            use_fast=False,
-        ),
-        max_out_len=100,
-        max_seq_len=2048,
+        path='vivo-ai/BlueLM-7B-Base',
+        max_out_len=1024,
        batch_size=8,
-        run_cfg=dict(num_gpus=1, num_procs=1),
+        run_cfg=dict(num_gpus=1),
    )
 ]
--- a/configs/models/bluelm/hf_bluelm_7b_base_32k.py
+++ b/configs/models/bluelm/hf_bluelm_7b_base_32k.py
@@ -1,24 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
+from opencompass.models import HuggingFaceBaseModel

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFaceBaseModel,
        abbr='bluelm-7b-base-32k-hf',
-        path="vivo-ai/BlueLM-7B-Base-32K",
-        tokenizer_path='vivo-ai/BlueLM-7B-Base-32K',
-        model_kwargs=dict(
-            device_map='auto',
-            trust_remote_code=True,
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-            use_fast=False,
-        ),
-        max_out_len=100,
-        max_seq_len=4096,
+        path='vivo-ai/BlueLM-7B-Base-32K',
+        max_out_len=1024,
        batch_size=8,
-        run_cfg=dict(num_gpus=1, num_procs=1),
+        run_cfg=dict(num_gpus=1),
    )
 ]
--- a/configs/models/chatglm/hf_chatglm3_6b.py
+++ b/configs/models/chatglm/hf_chatglm3_6b.py
@@ -1,31 +1,12 @@
-from opencompass.models import HuggingFaceChatGLM3
-
-api_meta_template = dict(
-    round=[
-        dict(role='HUMAN', api_role='HUMAN'),
-        dict(role='BOT', api_role='BOT', generate=True),
-    ]
-)
+from opencompass.models import HuggingFacewithChatTemplate

 models = [
    dict(
-        type=HuggingFaceChatGLM3,
+        type=HuggingFacewithChatTemplate,
        abbr='chatglm3-6b-hf',
        path='THUDM/chatglm3-6b',
-        tokenizer_path='THUDM/chatglm3-6b',
-        model_kwargs=dict(
-            device_map='auto',
-            trust_remote_code=True,
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-        ),
-        meta_template=api_meta_template,
-        max_out_len=100,
-        max_seq_len=4096,
-        batch_size=1,
-        run_cfg=dict(num_gpus=1, num_procs=1)
+        max_out_len=1024,
+        batch_size=8,
+        run_cfg=dict(num_gpus=1),
    )
 ]
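
Review note: the chat configs now drop their hand-written `meta_template` and rely on `HuggingFacewithChatTemplate`. A minimal sketch of the idea, assuming the class converts OpenCompass-style HUMAN/BOT rounds into the `role`/`content` message list that a tokenizer's chat template consumes. `ROLE_MAP` and `to_chat_messages` are hypothetical names used for illustration, not the class's actual internals.

```python
# Assumed role mapping from OpenCompass prompt roles to chat-template roles.
ROLE_MAP = {"HUMAN": "user", "BOT": "assistant", "SYSTEM": "system"}


def to_chat_messages(rounds):
    """Convert [{'role': 'HUMAN', 'prompt': ...}, ...] into chat messages."""
    return [dict(role=ROLE_MAP[r["role"]], content=r["prompt"]) for r in rounds]


rounds = [
    dict(role="HUMAN", prompt="Q: What is 1 + 1?"),
    dict(role="BOT", prompt="A: 2"),
    dict(role="HUMAN", prompt="Q: What is 2 + 2?"),
]
messages = to_chat_messages(rounds)
print(messages)

# With a real Hugging Face tokenizer (not loaded here), the messages could then
# be rendered into the model-specific prompt string, for example:
#   tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
```

The practical effect for config authors is that role markers such as `### Human:` no longer need to be maintained per model, since the template shipped with the tokenizer supplies them.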
--- a/configs/models/chatglm/hf_chatglm3_6b_32k.py
+++ b/configs/models/chatglm/hf_chatglm3_6b_32k.py
@@ -1,31 +1,12 @@
-from opencompass.models import HuggingFaceChatGLM3
-
-api_meta_template = dict(
-    round=[
-        dict(role='HUMAN', api_role='HUMAN'),
-        dict(role='BOT', api_role='BOT', generate=True),
-    ]
-)
+from opencompass.models import HuggingFacewithChatTemplate

 models = [
    dict(
-        type=HuggingFaceChatGLM3,
+        type=HuggingFacewithChatTemplate,
        abbr='chatglm3-6b-32k-hf',
        path='THUDM/chatglm3-6b-32k',
-        tokenizer_path='THUDM/chatglm3-6b-32k',
-        model_kwargs=dict(
-            device_map='auto',
-            trust_remote_code=True,
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-        ),
-        meta_template=api_meta_template,
-        max_out_len=100,
-        max_seq_len=4096,
-        batch_size=1,
-        run_cfg=dict(num_gpus=1, num_procs=1)
+        max_out_len=1024,
+        batch_size=8,
+        run_cfg=dict(num_gpus=1),
    )
 ]
--- a/configs/models/chatglm/hf_chatglm3_6b_base.py
+++ b/configs/models/chatglm/hf_chatglm3_6b_base.py
@ -1,24 +1,12 @@
-from opencompass.models import HuggingFace
-
+from opencompass.models import HuggingFaceBaseModel

 models = [
    dict(
-        type=HuggingFace,
+        type=HuggingFaceBaseModel,
        abbr='chatglm3-6b-base-hf',
        path='THUDM/chatglm3-6b-base',
-        tokenizer_path='THUDM/chatglm3-6b-base',
-        model_kwargs=dict(
-            trust_remote_code=True,
-            device_map='auto',
-        ),
-        tokenizer_kwargs=dict(
-           padding_side='left',
-           truncation_side='left',
-           trust_remote_code=True,
-        ),
-        max_out_len=100,
-        max_seq_len=4096,
+        max_out_len=1024,
        batch_size=8,
-        run_cfg=dict(num_gpus=1, num_procs=1),
+        run_cfg=dict(num_gpus=1),
    )
 ]
--- a/configs/models/codellama/hf_codellama_13b.py
+++ b/configs/models/codellama/hf_codellama_13b.py
@ -1,21 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
+from opencompass.models import HuggingFaceBaseModel

 models = [
-    # CodeLlama 13B
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFaceBaseModel,
        abbr='CodeLlama-13b',
-        path="codellama/CodeLlama-13b-hf",
-        tokenizer_path='codellama/CodeLlama-13b-hf',
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-        ),
+        path='codellama/CodeLlama-13b-hf',
        max_out_len=1024,
-        max_seq_len=2048,
        batch_size=8,
-        model_kwargs=dict(trust_remote_code=True, device_map='auto'),
-        run_cfg=dict(num_gpus=2, num_procs=1),
-    ),
+        run_cfg=dict(num_gpus=1),
+    )
 ]
--- a/configs/models/codellama/hf_codellama_13b_instruct.py
+++ b/configs/models/codellama/hf_codellama_13b_instruct.py
@ -1,21 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
+from opencompass.models import HuggingFacewithChatTemplate

 models = [
-    # CodeLlama 13B Instruct
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFacewithChatTemplate,
        abbr='CodeLlama-13b-Instruct',
-        path="codellama/CodeLlama-13b-Instruct-hf",
-        tokenizer_path='codellama/CodeLlama-13b-Instruct-hf',
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-        ),
+        path='codellama/CodeLlama-13b-Instruct-hf',
        max_out_len=1024,
-        max_seq_len=2048,
        batch_size=8,
-        model_kwargs=dict(trust_remote_code=True, device_map='auto'),
-        run_cfg=dict(num_gpus=2, num_procs=1),
-    ),
+        run_cfg=dict(num_gpus=1),
+    )
 ]
--- a/configs/models/codellama/hf_codellama_13b_python.py
+++ b/configs/models/codellama/hf_codellama_13b_python.py
@ -1,21 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
+from opencompass.models import HuggingFaceBaseModel

 models = [
-    # CodeLlama 13B Python
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFaceBaseModel,
        abbr='CodeLlama-13b-Python',
-        path="codellama/CodeLlama-13b-Python-hf",
-        tokenizer_path='codellama/CodeLlama-13b-Python-hf',
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-        ),
+        path='codellama/CodeLlama-13b-Python-hf',
        max_out_len=1024,
-        max_seq_len=2048,
        batch_size=8,
-        model_kwargs=dict(trust_remote_code=True, device_map='auto'),
-        run_cfg=dict(num_gpus=2, num_procs=1),
-    ),
+        run_cfg=dict(num_gpus=1),
+    )
 ]
--- a/configs/models/codellama/hf_codellama_34b.py
+++ b/configs/models/codellama/hf_codellama_34b.py
@ -1,21 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
+from opencompass.models import HuggingFaceBaseModel

 models = [
-    # CodeLlama 34B
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFaceBaseModel,
        abbr='CodeLlama-34b',
-        path="codellama/CodeLlama-34b-hf",
-        tokenizer_path='codellama/CodeLlama-34b-hf',
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-        ),
+        path='codellama/CodeLlama-34b-hf',
        max_out_len=1024,
-        max_seq_len=2048,
        batch_size=8,
-        model_kwargs=dict(trust_remote_code=True, device_map='auto'),
-        run_cfg=dict(num_gpus=4, num_procs=1),
-    ),
+        run_cfg=dict(num_gpus=2),
+    )
 ]
--- a/configs/models/codellama/hf_codellama_34b_instruct.py
+++ b/configs/models/codellama/hf_codellama_34b_instruct.py
@ -1,21 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
+from opencompass.models import HuggingFacewithChatTemplate

 models = [
-    # CodeLlama 34B Instruct
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFacewithChatTemplate,
        abbr='CodeLlama-34b-Instruct',
-        path="codellama/CodeLlama-34b-Instruct-hf",
-        tokenizer_path='codellama/CodeLlama-34b-Instruct-hf',
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-        ),
+        path='codellama/CodeLlama-34b-Instruct-hf',
        max_out_len=1024,
-        max_seq_len=2048,
        batch_size=8,
-        model_kwargs=dict(trust_remote_code=True, device_map='auto'),
-        run_cfg=dict(num_gpus=4, num_procs=1),
-    ),
+        run_cfg=dict(num_gpus=2),
+    )
 ]
--- a/configs/models/codellama/hf_codellama_34b_python.py
+++ b/configs/models/codellama/hf_codellama_34b_python.py
@ -1,21 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
+from opencompass.models import HuggingFaceBaseModel

 models = [
-    # CodeLlama 34B Python
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFaceBaseModel,
        abbr='CodeLlama-34b-Python',
-        path="codellama/CodeLlama-34b-Python-hf",
-        tokenizer_path='codellama/CodeLlama-34b-Python-hf',
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-        ),
+        path='codellama/CodeLlama-34b-Python-hf',
        max_out_len=1024,
-        max_seq_len=2048,
        batch_size=8,
-        model_kwargs=dict(trust_remote_code=True, device_map='auto'),
-        run_cfg=dict(num_gpus=4, num_procs=1),
-    ),
+        run_cfg=dict(num_gpus=2),
+    )
 ]
--- a/configs/models/codellama/hf_codellama_70b.py
+++ b/configs/models/codellama/hf_codellama_70b.py
@ -0,0 +1,12 @@
+from opencompass.models import HuggingFaceBaseModel
+
+models = [
+    dict(
+        type=HuggingFaceBaseModel,
+        abbr='CodeLlama-70b',
+        path='codellama/CodeLlama-70b-hf',
+        max_out_len=1024,
+        batch_size=8,
+        run_cfg=dict(num_gpus=4),
+    )
+]
--- a/configs/models/codellama/hf_codellama_70b_instruct.py
+++ b/configs/models/codellama/hf_codellama_70b_instruct.py
@ -0,0 +1,12 @@
+from opencompass.models import HuggingFacewithChatTemplate
+
+models = [
+    dict(
+        type=HuggingFacewithChatTemplate,
+        abbr='CodeLlama-70b-Instruct',
+        path='codellama/CodeLlama-70b-Instruct-hf',
+        max_out_len=1024,
+        batch_size=8,
+        run_cfg=dict(num_gpus=4),
+    )
+]
--- a/configs/models/codellama/hf_codellama_70b_python.py
+++ b/configs/models/codellama/hf_codellama_70b_python.py
@ -0,0 +1,12 @@
+from opencompass.models import HuggingFaceBaseModel
+
+models = [
+    dict(
+        type=HuggingFaceBaseModel,
+        abbr='CodeLlama-70b-Python',
+        path='codellama/CodeLlama-70b-Python-hf',
+        max_out_len=1024,
+        batch_size=8,
+        run_cfg=dict(num_gpus=4),
+    )
+]
--- a/configs/models/codellama/hf_codellama_7b.py
+++ b/configs/models/codellama/hf_codellama_7b.py
@ -1,21 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
+from opencompass.models import HuggingFaceBaseModel

 models = [
-    # CodeLlama 7B
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFaceBaseModel,
        abbr='CodeLlama-7b',
-        path="codellama/CodeLlama-7b-hf",
-        tokenizer_path='codellama/CodeLlama-7b-hf',
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-        ),
+        path='codellama/CodeLlama-7b-hf',
        max_out_len=1024,
-        max_seq_len=2048,
        batch_size=8,
-        model_kwargs=dict(trust_remote_code=True, device_map='auto'),
-        run_cfg=dict(num_gpus=1, num_procs=1),
-    ),
+        run_cfg=dict(num_gpus=1),
+    )
 ]
--- a/configs/models/codellama/hf_codellama_7b_instruct.py
+++ b/configs/models/codellama/hf_codellama_7b_instruct.py
@ -1,21 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
+from opencompass.models import HuggingFacewithChatTemplate

 models = [
-    # CodeLlama 7B Instruct
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFacewithChatTemplate,
        abbr='CodeLlama-7b-Instruct',
-        path="codellama/CodeLlama-7b-Instruct-hf",
-        tokenizer_path='codellama/CodeLlama-7b-Instruct-hf',
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-        ),
+        path='codellama/CodeLlama-7b-Instruct-hf',
        max_out_len=1024,
-        max_seq_len=2048,
        batch_size=8,
-        model_kwargs=dict(trust_remote_code=True, device_map='auto'),
-        run_cfg=dict(num_gpus=1, num_procs=1),
-    ),
+        run_cfg=dict(num_gpus=1),
+    )
 ]
--- a/configs/models/codellama/hf_codellama_7b_python.py
+++ b/configs/models/codellama/hf_codellama_7b_python.py
@ -1,21 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
+from opencompass.models import HuggingFaceBaseModel

 models = [
-    # CodeLlama 7B Python
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFaceBaseModel,
        abbr='CodeLlama-7b-Python',
-        path="codellama/CodeLlama-7b-Python-hf",
-        tokenizer_path='codellama/CodeLlama-7b-Python-hf',
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-        ),
+        path='codellama/CodeLlama-7b-Python-hf',
        max_out_len=1024,
-        max_seq_len=2048,
        batch_size=8,
-        model_kwargs=dict(trust_remote_code=True, device_map='auto'),
-        run_cfg=dict(num_gpus=1, num_procs=1),
-    ),
+        run_cfg=dict(num_gpus=1),
+    )
 ]
--- a/configs/models/deepseek/hf_deepseek_67b_base.py
+++ b/configs/models/deepseek/hf_deepseek_67b_base.py
@ -1,24 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
+from opencompass.models import HuggingFaceBaseModel

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFaceBaseModel,
        abbr='deepseek-67b-base-hf',
-        path="deepseek-ai/deepseek-llm-67b-base",
-        tokenizer_path='deepseek-ai/deepseek-llm-67b-base',
-        model_kwargs=dict(
-            device_map='auto',
-            trust_remote_code=True,
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-            use_fast=False,
-        ),
-        max_out_len=100,
-        max_seq_len=2048,
+        path='deepseek-ai/deepseek-llm-67b-base',
+        max_out_len=1024,
        batch_size=8,
-        run_cfg=dict(num_gpus=4, num_procs=1),
+        run_cfg=dict(num_gpus=4),
    )
 ]
--- a/configs/models/deepseek/hf_deepseek_67b_chat.py
+++ b/configs/models/deepseek/hf_deepseek_67b_chat.py
@ -1,33 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
-
-_meta_template = dict(
-    begin='<｜begin▁of▁sentence｜>',
-    round=[
-        dict(role="HUMAN", begin='User: ', end='\n\n'),
-        dict(role="BOT", begin="Assistant: ", end='<｜end▁of▁sentence｜>', generate=True),
-    ],
-)
+from opencompass.models import HuggingFacewithChatTemplate

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFacewithChatTemplate,
        abbr='deepseek-67b-chat-hf',
-        path="deepseek-ai/deepseek-llm-67b-chat",
-        model_kwargs=dict(
-            device_map='auto',
-            trust_remote_code=True,
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-            use_fast=False,
-        ),
-        meta_template=_meta_template,
-        max_out_len=100,
-        max_seq_len=2048,
+        path='deepseek-ai/deepseek-llm-67b-chat',
+        max_out_len=1024,
        batch_size=8,
-        run_cfg=dict(num_gpus=4, num_procs=1),
-        batch_padding=True,
+        run_cfg=dict(num_gpus=4),
    )
 ]
--- a/configs/models/deepseek/hf_deepseek_7b_base.py
+++ b/configs/models/deepseek/hf_deepseek_7b_base.py
@ -1,24 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
+from opencompass.models import HuggingFaceBaseModel

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFaceBaseModel,
        abbr='deepseek-7b-base-hf',
-        path="deepseek-ai/deepseek-llm-7b-base",
-        tokenizer_path='deepseek-ai/deepseek-llm-7b-base',
-        model_kwargs=dict(
-            device_map='auto',
-            trust_remote_code=True,
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-            use_fast=False,
-        ),
-        max_out_len=100,
-        max_seq_len=2048,
+        path='deepseek-ai/deepseek-llm-7b-base',
+        max_out_len=1024,
        batch_size=8,
-        run_cfg=dict(num_gpus=1, num_procs=1),
+        run_cfg=dict(num_gpus=1),
    )
 ]
--- a/configs/models/deepseek/hf_deepseek_7b_chat.py
+++ b/configs/models/deepseek/hf_deepseek_7b_chat.py
@ -1,33 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
-
-_meta_template = dict(
-    begin='<｜begin▁of▁sentence｜>',
-    round=[
-        dict(role="HUMAN", begin='User: ', end='\n\n'),
-        dict(role="BOT", begin="Assistant: ", end='<｜end▁of▁sentence｜>', generate=True),
-    ],
-)
+from opencompass.models import HuggingFacewithChatTemplate

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFacewithChatTemplate,
        abbr='deepseek-7b-chat-hf',
-        path="deepseek-ai/deepseek-llm-7b-chat",
-        model_kwargs=dict(
-            device_map='auto',
-            trust_remote_code=True,
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-            use_fast=False,
-        ),
-        meta_template=_meta_template,
-        max_out_len=100,
-        max_seq_len=2048,
+        path='deepseek-ai/deepseek-llm-7b-chat',
+        max_out_len=1024,
        batch_size=8,
-        run_cfg=dict(num_gpus=1, num_procs=1),
-        batch_padding=True,
+        run_cfg=dict(num_gpus=1),
    )
 ]
--- a/configs/models/deepseek/hf_deepseek_coder_1_3b_instruct.py
+++ b/configs/models/deepseek/hf_deepseek_coder_1_3b_instruct.py
@ -1,34 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
-
-_meta_template = dict(
-    round=[
-        dict(role="HUMAN", begin='### Instruction:\n', end='\n'),
-        dict(role="BOT", begin="### Response:\n", end='<|EOT|>', generate=True),
-    ],
-    eos_token_id=100001,
-)
+from opencompass.models import HuggingFacewithChatTemplate

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFacewithChatTemplate,
        abbr='deepseek-coder-1.3b-hf',
-        path="deepseek-ai/deepseek-coder-1.3b-instruct",
-        tokenizer_path='deepseek-ai/deepseek-coder-1.3b-instruct',
-        model_kwargs=dict(
-            device_map='auto',
-            trust_remote_code=True,
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-            use_fast=False,
-        ),
-        meta_template=_meta_template,
-        max_out_len=2048,
-        max_seq_len=2048,
+        path='deepseek-ai/deepseek-coder-1.3b-instruct',
+        max_out_len=1024,
        batch_size=8,
-        run_cfg=dict(num_gpus=1, num_procs=1),
-        end_str='<|EOT|>',
+        run_cfg=dict(num_gpus=1),
    )
 ]
--- a/configs/models/deepseek/hf_deepseek_coder_33b_instruct.py
+++ b/configs/models/deepseek/hf_deepseek_coder_33b_instruct.py
@ -1,34 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
-
-_meta_template = dict(
-    round=[
-        dict(role="HUMAN", begin='### Instruction:\n', end='\n'),
-        dict(role="BOT", begin="### Response:\n", end='<|EOT|>', generate=True),
-    ],
-    eos_token_id=100001,
-)
+from opencompass.models import HuggingFacewithChatTemplate

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFacewithChatTemplate,
        abbr='deepseek-coder-33b-hf',
-        path="deepseek-ai/deepseek-coder-33b-instruct",
-        tokenizer_path='deepseek-ai/deepseek-coder-33b-instruct',
-        model_kwargs=dict(
-            device_map='auto',
-            trust_remote_code=True,
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-            use_fast=False,
-        ),
-        meta_template=_meta_template,
-        max_out_len=2048,
-        max_seq_len=2048,
+        path='deepseek-ai/deepseek-coder-33b-instruct',
+        max_out_len=1024,
        batch_size=8,
-        run_cfg=dict(num_gpus=4, num_procs=1),
-        end_str='<|EOT|>',
+        run_cfg=dict(num_gpus=2),
    )
 ]
--- a/configs/models/deepseek/hf_deepseek_coder_6_7b_instruct.py
+++ b/configs/models/deepseek/hf_deepseek_coder_6_7b_instruct.py
@ -1,34 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
-
-_meta_template = dict(
-    round=[
-        dict(role="HUMAN", begin='### Instruction:\n', end='\n'),
-        dict(role="BOT", begin="### Response:\n", end='<|EOT|>', generate=True),
-    ],
-    eos_token_id=100001,
-)
+from opencompass.models import HuggingFacewithChatTemplate

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFacewithChatTemplate,
        abbr='deepseek-coder-6.7b-hf',
-        path="deepseek-ai/deepseek-coder-6.7b-instruct",
-        tokenizer_path='deepseek-ai/deepseek-coder-6.7b-instruct',
-        model_kwargs=dict(
-            device_map='auto',
-            trust_remote_code=True,
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-            use_fast=False,
-        ),
-        meta_template=_meta_template,
-        max_out_len=2048,
-        max_seq_len=2048,
+        path='deepseek-ai/deepseek-coder-6.7b-instruct',
+        max_out_len=1024,
        batch_size=8,
-        run_cfg=dict(num_gpus=1, num_procs=1),
-        end_str='<|EOT|>',
+        run_cfg=dict(num_gpus=1),
    )
 ]
--- a/configs/models/deepseek/hf_deepseek_moe_16b_base.py
+++ b/configs/models/deepseek/hf_deepseek_moe_16b_base.py
@ -1,24 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
+from opencompass.models import HuggingFaceBaseModel

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFaceBaseModel,
        abbr='deepseek-moe-16b-base-hf',
-        path="deepseek-ai/deepseek-moe-16b-base",
-        tokenizer_path='deepseek-ai/deepseek-moe-16b-base',
-        model_kwargs=dict(
-            device_map='auto',
-            trust_remote_code=True,
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-        ),
-        max_out_len=100,
-        min_out_len=3,
-        max_seq_len=2048,
+        path='deepseek-ai/deepseek-moe-16b-base',
+        max_out_len=1024,
        batch_size=8,
-        run_cfg=dict(num_gpus=2, num_procs=1),
+        run_cfg=dict(num_gpus=1),
    )
 ]
--- a/configs/models/deepseek/hf_deepseek_moe_16b_chat.py
+++ b/configs/models/deepseek/hf_deepseek_moe_16b_chat.py
@ -1,32 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
-
-_meta_template = dict(
-    begin='<｜begin▁of▁sentence｜>',
-    round=[
-        dict(role="HUMAN", begin='User: ', end='\n\n'),
-        dict(role="BOT", begin="Assistant: ", end='<｜end▁of▁sentence｜>', generate=True),
-    ],
-)
+from opencompass.models import HuggingFacewithChatTemplate

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFacewithChatTemplate,
        abbr='deepseek-moe-16b-chat-hf',
-        path="deepseek-ai/deepseek-moe-16b-chat",
-        model_kwargs=dict(
-            device_map='auto',
-            trust_remote_code=True,
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-        ),
-        meta_template=_meta_template,
-        max_out_len=100,
-        max_seq_len=2048,
+        path='deepseek-ai/deepseek-moe-16b-chat',
+        max_out_len=1024,
        batch_size=8,
-        run_cfg=dict(num_gpus=1, num_procs=1),
-        batch_padding=True,
+        run_cfg=dict(num_gpus=1),
    )
 ]
--- a/configs/models/falcon/hf_falcon_40b.py
+++ b/configs/models/falcon/hf_falcon_40b.py
@ -1,21 +1,12 @@
-# Only torch >=2.0 is supported for falcon-40b
-from opencompass.models import HuggingFaceCausalLM
+from opencompass.models import HuggingFaceBaseModel

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFaceBaseModel,
        abbr='falcon-40b-hf',
        path='tiiuae/falcon-40b',
-        tokenizer_path='tiiuae/falcon-40b',
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-        ),
-        max_out_len=100,
-        max_seq_len=2048,
+        max_out_len=1024,
        batch_size=8,
-        model_kwargs=dict(trust_remote_code=True, device_map='auto', revision='561820f7eef0cc56a31ea38af15ca1acb07fab5d'),
-        run_cfg=dict(num_gpus=4, num_procs=1),
+        run_cfg=dict(num_gpus=4),
    )
 ]
--- a/configs/models/falcon/hf_falcon_7b.py
+++ b/configs/models/falcon/hf_falcon_7b.py
@ -1,21 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
-
+from opencompass.models import HuggingFaceBaseModel

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFaceBaseModel,
        abbr='falcon-7b-hf',
        path='tiiuae/falcon-7b',
-        tokenizer_path='tiiuae/falcon-7b',
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-        ),
-        max_out_len=100,
-        max_seq_len=2048,
+        max_out_len=1024,
        batch_size=8,
-        model_kwargs=dict(trust_remote_code=True, device_map='auto', revision='2f5c3cd4eace6be6c0f12981f377fb35e5bf6ee5'),
-        run_cfg=dict(num_gpus=1, num_procs=1),
+        run_cfg=dict(num_gpus=1),
    )
 ]
--- a/configs/models/gemma/hf_gemma_2b.py
+++ b/configs/models/gemma/hf_gemma_2b.py
@ -1,23 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
+from opencompass.models import HuggingFaceBaseModel

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFaceBaseModel,
        abbr='gemma-2b-hf',
-        path="google/gemma-2b",
-        model_kwargs=dict(
-            device_map='auto',
-            trust_remote_code=True
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-            use_fast=False,
-        ),
-        max_out_len=100,
-        max_seq_len=2048,
+        path='google/gemma-2b',
+        max_out_len=1024,
        batch_size=8,
-        run_cfg=dict(num_gpus=1, num_procs=1),
+        run_cfg=dict(num_gpus=1),
    )
 ]
--- a/configs/models/gemma/hf_gemma_2b_it.py
+++ b/configs/models/gemma/hf_gemma_2b_it.py
@ -1,33 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
-
-_meta_template = dict(
-    round=[
-        dict(role="HUMAN", begin='<start_of_turn>user\n', end='<end_of_turn>\n'),
-        dict(role="BOT", begin="<start_of_turn>model\n", end='<end_of_turn>\n', generate=True),
-    ],
-)
+from opencompass.models import HuggingFacewithChatTemplate

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFacewithChatTemplate,
        abbr='gemma-2b-it-hf',
-        path="google/gemma-2b-it",
-        model_kwargs=dict(
-            device_map='auto',
-            trust_remote_code=True
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-            use_fast=False,
-        ),
-        meta_template=_meta_template,
-        min_out_len=1,
-        max_out_len=100,
-        max_seq_len=2048,
+        path='google/gemma-2b-it',
+        max_out_len=1024,
        batch_size=8,
-        run_cfg=dict(num_gpus=1, num_procs=1),
-        batch_padding=True,
+        run_cfg=dict(num_gpus=1),
    )
 ]
--- a/configs/models/gemma/hf_gemma_7b.py
+++ b/configs/models/gemma/hf_gemma_7b.py
@ -1,23 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
+from opencompass.models import HuggingFaceBaseModel

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFaceBaseModel,
        abbr='gemma-7b-hf',
-        path="google/gemma-7b",
-        model_kwargs=dict(
-            device_map='auto',
-            trust_remote_code=True
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-            use_fast=False,
-        ),
-        max_out_len=100,
-        max_seq_len=2048,
+        path='google/gemma-7b',
+        max_out_len=1024,
        batch_size=8,
-        run_cfg=dict(num_gpus=1, num_procs=1),
+        run_cfg=dict(num_gpus=1),
    )
 ]
--- a/configs/models/gemma/hf_gemma_7b_it.py
+++ b/configs/models/gemma/hf_gemma_7b_it.py
@ -1,33 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
-
-_meta_template = dict(
-    round=[
-        dict(role="HUMAN", begin='<start_of_turn>user\n', end='<end_of_turn>\n'),
-        dict(role="BOT", begin="<start_of_turn>model\n", end='<end_of_turn>\n', generate=True),
-    ],
-)
+from opencompass.models import HuggingFacewithChatTemplate

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFacewithChatTemplate,
        abbr='gemma-7b-it-hf',
-        path="google/gemma-7b-it",
-        model_kwargs=dict(
-            device_map='auto',
-            trust_remote_code=True
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-            use_fast=False,
-        ),
-        meta_template=_meta_template,
-        min_out_len=1,
-        max_out_len=100,
-        max_seq_len=2048,
+        path='google/gemma-7b-it',
+        max_out_len=1024,
        batch_size=8,
-        run_cfg=dict(num_gpus=1, num_procs=1),
-        batch_padding=True,
+        run_cfg=dict(num_gpus=1),
    )
 ]
--- a/configs/models/hf_internlm/hf_internlm2_1_8b.py
+++ b/configs/models/hf_internlm/hf_internlm2_1_8b.py
@ -1,26 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
-
+from opencompass.models import HuggingFaceBaseModel

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFaceBaseModel,
        abbr='internlm2-1.8b-hf',
        path="internlm/internlm2-1_8b",
-        tokenizer_path='internlm/internlm2-1_8b',
-        model_kwargs=dict(
-            trust_remote_code=True,
-            device_map='auto',
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            use_fast=False,
-            trust_remote_code=True,
-        ),
-        max_out_len=100,
-        min_out_len=1,
-        max_seq_len=2048,
+        max_out_len=1024,
        batch_size=8,
-        run_cfg=dict(num_gpus=1, num_procs=1),
+        run_cfg=dict(num_gpus=1),
    )
 ]
--- a/configs/models/hf_internlm/hf_internlm2_20b.py
+++ b/configs/models/hf_internlm/hf_internlm2_20b.py
@ -1,26 +1,13 @@
-from opencompass.models import HuggingFaceCausalLM
+from opencompass.models import HuggingFaceBaseModel


 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFaceBaseModel,
        abbr='internlm2-20b-hf',
        path="internlm/internlm2-20b",
-        tokenizer_path='internlm/internlm2-20b',
-        model_kwargs=dict(
-            trust_remote_code=True,
-            device_map='auto',
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            use_fast=False,
-            trust_remote_code=True,
-        ),
-        max_out_len=100,
-        min_out_len=1,
-        max_seq_len=2048,
+        max_out_len=1024,
        batch_size=8,
-        run_cfg=dict(num_gpus=2, num_procs=1),
+        run_cfg=dict(num_gpus=2),
    )
 ]
--- a/configs/models/hf_internlm/hf_internlm2_7b.py
+++ b/configs/models/hf_internlm/hf_internlm2_7b.py
@ -1,26 +1,13 @@
-from opencompass.models import HuggingFaceCausalLM
+from opencompass.models import HuggingFaceBaseModel


 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFaceBaseModel,
        abbr='internlm2-7b-hf',
        path="internlm/internlm2-7b",
-        tokenizer_path='internlm/internlm2-7b',
-        model_kwargs=dict(
-            trust_remote_code=True,
-            device_map='auto',
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            use_fast=False,
-            trust_remote_code=True,
-        ),
-        max_out_len=100,
-        min_out_len=1,
-        max_seq_len=2048,
+        max_out_len=1024,
        batch_size=8,
-        run_cfg=dict(num_gpus=1, num_procs=1),
+        run_cfg=dict(num_gpus=1),
    )
 ]
--- a/configs/models/hf_internlm/hf_internlm2_base_20b.py
+++ b/configs/models/hf_internlm/hf_internlm2_base_20b.py
@ -1,26 +1,13 @@
-from opencompass.models import HuggingFaceCausalLM
+from opencompass.models import HuggingFaceBaseModel


 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFaceBaseModel,
        abbr='internlm2-base-20b-hf',
        path="internlm/internlm2-base-20b",
-        tokenizer_path='internlm/internlm2-base-20b',
-        model_kwargs=dict(
-            trust_remote_code=True,
-            device_map='auto',
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            use_fast=False,
-            trust_remote_code=True,
-        ),
-        max_out_len=100,
-        min_out_len=1,
-        max_seq_len=2048,
+        max_out_len=1024,
        batch_size=8,
-        run_cfg=dict(num_gpus=2, num_procs=1),
+        run_cfg=dict(num_gpus=2),
    )
 ]
--- a/configs/models/hf_internlm/hf_internlm2_base_7b.py
+++ b/configs/models/hf_internlm/hf_internlm2_base_7b.py
@ -1,26 +1,13 @@
-from opencompass.models import HuggingFaceCausalLM
+from opencompass.models import HuggingFaceBaseModel


 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFaceBaseModel,
        abbr='internlm2-base-7b-hf',
        path="internlm/internlm2-base-7b",
-        tokenizer_path='internlm/internlm2-base-7b',
-        model_kwargs=dict(
-            trust_remote_code=True,
-            device_map='auto',
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            use_fast=False,
-            trust_remote_code=True,
-        ),
-        max_out_len=100,
-        min_out_len=1,
-        max_seq_len=2048,
+        max_out_len=1024,
        batch_size=8,
-        run_cfg=dict(num_gpus=1, num_procs=1),
+        run_cfg=dict(num_gpus=1),
    )
 ]
--- a/configs/models/hf_internlm/hf_internlm2_chat_1_8b.py
+++ b/configs/models/hf_internlm/hf_internlm2_chat_1_8b.py
@@ -1,36 +1,13 @@
-from opencompass.models import HuggingFaceCausalLM
-
-
-_meta_template = dict(
-    round=[
-        dict(role='HUMAN', begin='<|im_start|>user\n', end='<|im_end|>\n'),
-        dict(role='BOT', begin='<|im_start|>assistant\n', end='<|im_end|>\n', generate=True),
-    ],
-)
+from opencompass.models import HuggingFacewithChatTemplate

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFacewithChatTemplate,
        abbr='internlm2-chat-1.8b-hf',
-        path="internlm/internlm2-chat-1_8b",
-        tokenizer_path='internlm/internlm2-chat-1_8b',
-        model_kwargs=dict(
-            trust_remote_code=True,
-            device_map='auto',
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            use_fast=False,
-            trust_remote_code=True,
-        ),
-        max_out_len=100,
-        max_seq_len=2048,
+        path='internlm/internlm2-chat-1_8b',
+        max_out_len=1024,
        batch_size=8,
-        meta_template=_meta_template,
-        run_cfg=dict(num_gpus=1, num_procs=1),
-        end_str='<|im_end|>',
-        generation_kwargs = {"eos_token_id": [2, 92542]},
-        batch_padding=True,
+        run_cfg=dict(num_gpus=1),
+        stop_words=['</s>', '<|im_end|>'],
    )
 ]
--- a/configs/models/hf_internlm/hf_internlm2_chat_1_8b_sft.py
+++ b/configs/models/hf_internlm/hf_internlm2_chat_1_8b_sft.py
@@ -1,36 +1,13 @@
-from opencompass.models import HuggingFaceCausalLM
-
-
-_meta_template = dict(
-    round=[
-        dict(role='HUMAN', begin='<|im_start|>user\n', end='<|im_end|>\n'),
-        dict(role='BOT', begin='<|im_start|>assistant\n', end='<|im_end|>\n', generate=True),
-    ],
-)
+from opencompass.models import HuggingFacewithChatTemplate

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFacewithChatTemplate,
        abbr='internlm2-chat-1.8b-sft-hf',
-        path="internlm/internlm2-chat-1_8b-sft",
-        tokenizer_path='internlm/internlm2-chat-1_8b-sft',
-        model_kwargs=dict(
-            trust_remote_code=True,
-            device_map='auto',
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            use_fast=False,
-            trust_remote_code=True,
-        ),
-        max_out_len=100,
-        max_seq_len=2048,
+        path='internlm/internlm2-chat-1_8b-sft',
+        max_out_len=1024,
        batch_size=8,
-        meta_template=_meta_template,
-        run_cfg=dict(num_gpus=1, num_procs=1),
-        end_str='<|im_end|>',
-        generation_kwargs = {"eos_token_id": [2, 92542]},
-        batch_padding=True,
+        run_cfg=dict(num_gpus=1),
+        stop_words=['</s>', '<|im_end|>'],
    )
 ]
--- a/configs/models/hf_internlm/hf_internlm2_chat_20b.py
+++ b/configs/models/hf_internlm/hf_internlm2_chat_20b.py
@@ -1,36 +1,13 @@
-from opencompass.models import HuggingFaceCausalLM
-
-
-_meta_template = dict(
-    round=[
-        dict(role='HUMAN', begin='<|im_start|>user\n', end='<|im_end|>\n'),
-        dict(role='BOT', begin='<|im_start|>assistant\n', end='<|im_end|>\n', generate=True),
-    ],
-)
+from opencompass.models import HuggingFacewithChatTemplate

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFacewithChatTemplate,
        abbr='internlm2-chat-20b-hf',
-        path="internlm/internlm2-chat-20b",
-        tokenizer_path='internlm/internlm2-chat-20b',
-        model_kwargs=dict(
-            trust_remote_code=True,
-            device_map='auto',
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            use_fast=False,
-            trust_remote_code=True,
-        ),
-        max_out_len=100,
-        max_seq_len=2048,
+        path='internlm/internlm2-chat-20b',
+        max_out_len=1024,
        batch_size=8,
-        meta_template=_meta_template,
-        run_cfg=dict(num_gpus=2, num_procs=1),
-        end_str='<|im_end|>',
-        generation_kwargs = {"eos_token_id": [2, 92542]},
-        batch_padding=True,
+        run_cfg=dict(num_gpus=2),
+        stop_words=['</s>', '<|im_end|>'],
    )
 ]
--- a/configs/models/hf_internlm/hf_internlm2_chat_20b_sft.py
+++ b/configs/models/hf_internlm/hf_internlm2_chat_20b_sft.py
@@ -1,36 +1,13 @@
-from opencompass.models import HuggingFaceCausalLM
-
-
-_meta_template = dict(
-    round=[
-        dict(role='HUMAN', begin='<|im_start|>user\n', end='<|im_end|>\n'),
-        dict(role='BOT', begin='<|im_start|>assistant\n', end='<|im_end|>\n', generate=True),
-    ],
-)
+from opencompass.models import HuggingFacewithChatTemplate

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFacewithChatTemplate,
        abbr='internlm2-chat-20b-sft-hf',
-        path="internlm/internlm2-chat-20b-sft",
-        tokenizer_path='internlm/internlm2-chat-20b-sft',
-        model_kwargs=dict(
-            trust_remote_code=True,
-            device_map='auto',
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            use_fast=False,
-            trust_remote_code=True,
-        ),
-        max_out_len=100,
-        max_seq_len=2048,
+        path='internlm/internlm2-chat-20b-sft',
+        max_out_len=1024,
        batch_size=8,
-        meta_template=_meta_template,
-        run_cfg=dict(num_gpus=2, num_procs=1),
-        end_str='<|im_end|>',
-        generation_kwargs = {"eos_token_id": [2, 92542]},
-        batch_padding=True,
+        run_cfg=dict(num_gpus=2),
+        stop_words=['</s>', '<|im_end|>'],
    )
 ]
--- a/configs/models/hf_internlm/hf_internlm2_chat_7b.py
+++ b/configs/models/hf_internlm/hf_internlm2_chat_7b.py
@@ -1,36 +1,13 @@
-from opencompass.models import HuggingFaceCausalLM
-
-
-_meta_template = dict(
-    round=[
-        dict(role='HUMAN', begin='<|im_start|>user\n', end='<|im_end|>\n'),
-        dict(role='BOT', begin='<|im_start|>assistant\n', end='<|im_end|>\n', generate=True),
-    ],
-)
+from opencompass.models import HuggingFacewithChatTemplate

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFacewithChatTemplate,
        abbr='internlm2-chat-7b-hf',
-        path="internlm/internlm2-chat-7b",
-        tokenizer_path='internlm/internlm2-chat-7b',
-        model_kwargs=dict(
-            trust_remote_code=True,
-            device_map='auto',
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            use_fast=False,
-            trust_remote_code=True,
-        ),
-        max_out_len=100,
-        max_seq_len=2048,
+        path='internlm/internlm2-chat-7b',
+        max_out_len=1024,
        batch_size=8,
-        meta_template=_meta_template,
-        run_cfg=dict(num_gpus=1, num_procs=1),
-        end_str='<|im_end|>',
-        generation_kwargs = {"eos_token_id": [2, 92542]},
-        batch_padding=True,
+        run_cfg=dict(num_gpus=1),
+        stop_words=['</s>', '<|im_end|>'],
    )
 ]
--- a/configs/models/hf_internlm/hf_internlm2_chat_7b_sft.py
+++ b/configs/models/hf_internlm/hf_internlm2_chat_7b_sft.py
@@ -1,36 +1,13 @@
-from opencompass.models import HuggingFaceCausalLM
-
-
-_meta_template = dict(
-    round=[
-        dict(role='HUMAN', begin='<|im_start|>user\n', end='<|im_end|>\n'),
-        dict(role='BOT', begin='<|im_start|>assistant\n', end='<|im_end|>\n', generate=True),
-    ],
-)
+from opencompass.models import HuggingFacewithChatTemplate

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFacewithChatTemplate,
        abbr='internlm2-chat-7b-sft-hf',
-        path="internlm/internlm2-chat-7b-sft",
-        tokenizer_path='internlm/internlm2-chat-7b-sft',
-        model_kwargs=dict(
-            trust_remote_code=True,
-            device_map='auto',
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            use_fast=False,
-            trust_remote_code=True,
-        ),
-        max_out_len=100,
-        max_seq_len=2048,
+        path='internlm/internlm2-chat-7b-sft',
+        max_out_len=1024,
        batch_size=8,
-        meta_template=_meta_template,
-        run_cfg=dict(num_gpus=1, num_procs=1),
-        end_str='<|im_end|>',
-        generation_kwargs = {"eos_token_id": [2, 92542]},
-        batch_padding=True,
+        run_cfg=dict(num_gpus=1),
+        stop_words=['</s>', '<|im_end|>'],
    )
 ]
--- a/configs/models/hf_internlm/hf_internlm2_chat_math_20b.py
+++ b/configs/models/hf_internlm/hf_internlm2_chat_math_20b.py
@@ -1,35 +1,13 @@
-from opencompass.models import HuggingFaceCausalLM
-
-
-_meta_template = dict(
-    round=[
-        dict(role='HUMAN', begin='[UNUSED_TOKEN_146]user\n', end='[UNUSED_TOKEN_145]\n'),
-        dict(role='BOT', begin='[UNUSED_TOKEN_146]assistant\n', end='[UNUSED_TOKEN_145]\n', generate=True),
-    ],
-    eos_token_id=92542
-)
+from opencompass.models import HuggingFacewithChatTemplate

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFacewithChatTemplate,
        abbr='internlm2-chat-math-20b-hf',
-        path="internlm/internlm2-math-20b",
-        tokenizer_path='internlm/internlm2-math-20b',
-        model_kwargs=dict(
-            trust_remote_code=True,
-            device_map='auto',
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            use_fast=False,
-            trust_remote_code=True,
-        ),
-        max_out_len=100,
-        max_seq_len=2048,
+        path='internlm/internlm2-math-20b',
+        max_out_len=1024,
        batch_size=8,
-        meta_template=_meta_template,
-        run_cfg=dict(num_gpus=2, num_procs=1),
-        end_str='[UNUSED_TOKEN_145]',
+        run_cfg=dict(num_gpus=2),
+        stop_words=['</s>', '<|im_end|>'],
    )
 ]
--- a/configs/models/hf_internlm/hf_internlm2_chat_math_20b_with_system.py
+++ b/configs/models/hf_internlm/hf_internlm2_chat_math_20b_with_system.py
@@ -7,7 +7,6 @@ _meta_template = dict(
        dict(role='SYSTEM', begin='[UNUSED_TOKEN_146]system\n', end='[UNUSED_TOKEN_145]\n'),
        dict(role='BOT', begin='[UNUSED_TOKEN_146]assistant\n', end='[UNUSED_TOKEN_145]\n', generate=True),
    ],
-    eos_token_id=92542
 )

 models = [
--- a/configs/models/hf_internlm/hf_internlm2_chat_math_7b.py
+++ b/configs/models/hf_internlm/hf_internlm2_chat_math_7b.py
@@ -1,35 +1,13 @@
-from opencompass.models import HuggingFaceCausalLM
-
-
-_meta_template = dict(
-    round=[
-        dict(role='HUMAN', begin='[UNUSED_TOKEN_146]user\n', end='[UNUSED_TOKEN_145]\n'),
-        dict(role='BOT', begin='[UNUSED_TOKEN_146]assistant\n', end='[UNUSED_TOKEN_145]\n', generate=True),
-    ],
-    eos_token_id=92542
-)
+from opencompass.models import HuggingFacewithChatTemplate

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFacewithChatTemplate,
        abbr='internlm2-chat-math-7b-hf',
-        path="internlm/internlm2-math-7b",
-        tokenizer_path='internlm/internlm2-math-7b',
-        model_kwargs=dict(
-            trust_remote_code=True,
-            device_map='auto',
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            use_fast=False,
-            trust_remote_code=True,
-        ),
-        max_out_len=100,
-        max_seq_len=2048,
+        path='internlm/internlm2-math-7b',
+        max_out_len=1024,
        batch_size=8,
-        meta_template=_meta_template,
-        run_cfg=dict(num_gpus=1, num_procs=1),
-        end_str='[UNUSED_TOKEN_145]',
+        run_cfg=dict(num_gpus=1),
+        stop_words=['</s>', '<|im_end|>'],
    )
 ]
--- a/configs/models/hf_internlm/hf_internlm2_chat_math_7b_with_system.py
+++ b/configs/models/hf_internlm/hf_internlm2_chat_math_7b_with_system.py
@@ -7,7 +7,6 @@ _meta_template = dict(
        dict(role='SYSTEM', begin='[UNUSED_TOKEN_146]system\n', end='[UNUSED_TOKEN_145]\n'),
        dict(role='BOT', begin='[UNUSED_TOKEN_146]assistant\n', end='[UNUSED_TOKEN_145]\n', generate=True),
    ],
-    eos_token_id=92542
 )

 models = [
--- a/configs/models/hf_internlm/hf_internlm2_math_20b.py
+++ b/configs/models/hf_internlm/hf_internlm2_math_20b.py
@@ -0,0 +1,13 @@
+from opencompass.models import HuggingFaceBaseModel
+
+
+models = [
+    dict(
+        type=HuggingFaceBaseModel,
+        abbr='internlm2-math-20b-hf',
+        path="internlm/internlm2-math-20b",
+        max_out_len=1024,
+        batch_size=8,
+        run_cfg=dict(num_gpus=2),
+    )
+]
--- a/configs/models/hf_internlm/hf_internlm2_math_7b.py
+++ b/configs/models/hf_internlm/hf_internlm2_math_7b.py
@@ -0,0 +1,13 @@
+from opencompass.models import HuggingFaceBaseModel
+
+
+models = [
+    dict(
+        type=HuggingFaceBaseModel,
+        abbr='internlm2-math-7b-hf',
+        path="internlm/internlm2-math-7b",
+        max_out_len=1024,
+        batch_size=8,
+        run_cfg=dict(num_gpus=1),
+    )
+]
--- a/configs/models/hf_internlm/hf_internlm_20b.py
+++ b/configs/models/hf_internlm/hf_internlm_20b.py
@@ -1,22 +1,13 @@
-from opencompass.models import HuggingFaceCausalLM
+from opencompass.models import HuggingFaceBaseModel


 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFaceBaseModel,
        abbr='internlm-20b-hf',
        path="internlm/internlm-20b",
-        tokenizer_path='internlm/internlm-20b',
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            use_fast=False,
-            trust_remote_code=True,
-        ),
-        max_out_len=100,
-        max_seq_len=2048,
+        max_out_len=1024,
        batch_size=8,
-        model_kwargs=dict(trust_remote_code=True, device_map='auto'),
-        run_cfg=dict(num_gpus=2, num_procs=1),
+        run_cfg=dict(num_gpus=2),
    )
 ]
--- a/configs/models/hf_internlm/hf_internlm_7b.py
+++ b/configs/models/hf_internlm/hf_internlm_7b.py
@@ -1,25 +1,13 @@
-from opencompass.models import HuggingFaceCausalLM
+from opencompass.models import HuggingFaceBaseModel


 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFaceBaseModel,
        abbr='internlm-7b-hf',
        path="internlm/internlm-7b",
-        tokenizer_path='internlm/internlm-7b',
-        model_kwargs=dict(
-            trust_remote_code=True,
-            device_map='auto',
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            use_fast=False,
-            trust_remote_code=True,
-        ),
-        max_out_len=100,
-        max_seq_len=2048,
+        max_out_len=1024,
        batch_size=8,
-        run_cfg=dict(num_gpus=1, num_procs=1),
+        run_cfg=dict(num_gpus=1),
    )
 ]
--- a/configs/models/hf_internlm/hf_internlm_chat_7b_8k.py
+++ b/configs/models/hf_internlm/hf_internlm_chat_7b_8k.py
@@ -1,34 +0,0 @@
-from opencompass.models import HuggingFaceCausalLM
-
-
-_meta_template = dict(
-    round=[
-        dict(role='HUMAN', begin='<|User|>:', end='\n'),
-        dict(role='BOT', begin='<|Bot|>:', end='<eoa>\n', generate=True),
-    ],
-)
-
-models = [
-    dict(
-        type=HuggingFaceCausalLM,
-        abbr='internlm-chat-7b-8k-hf',
-        path="internlm/internlm-chat-7b-8k",
-        tokenizer_path='internlm/internlm-chat-7b-8k',
-        model_kwargs=dict(
-            trust_remote_code=True,
-            device_map='auto',
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            use_fast=False,
-            trust_remote_code=True,
-        ),
-        max_out_len=100,
-        max_seq_len=2048,
-        batch_size=8,
-        meta_template=_meta_template,
-        run_cfg=dict(num_gpus=1, num_procs=1),
-        end_str='<eoa>',
-    )
-]
--- a/configs/models/hf_internlm/hf_internlm_chat_7b_v1_1.py
+++ b/configs/models/hf_internlm/hf_internlm_chat_7b_v1_1.py
@@ -1,34 +0,0 @@
-from opencompass.models import HuggingFaceCausalLM
-
-
-_meta_template = dict(
-    round=[
-        dict(role='HUMAN', begin='<|User|>:', end='\n'),
-        dict(role='BOT', begin='<|Bot|>:', end='<eoa>\n', generate=True),
-    ],
-)
-
-models = [
-    dict(
-        type=HuggingFaceCausalLM,
-        abbr='internlm-chat-7b-v1.1-hf',
-        path="internlm/internlm-chat-7b-v1_1",
-        tokenizer_path='internlm/internlm-chat-7b-v1_1',
-        model_kwargs=dict(
-            trust_remote_code=True,
-            device_map='auto',
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            use_fast=False,
-            trust_remote_code=True,
-        ),
-        max_out_len=100,
-        max_seq_len=2048,
-        batch_size=8,
-        meta_template=_meta_template,
-        run_cfg=dict(num_gpus=1, num_procs=1),
-        end_str='<eoa>',
-    )
-]
--- a/configs/models/hf_internlm/lmdeploy_internlm2_20b.py
+++ b/configs/models/hf_internlm/lmdeploy_internlm2_20b.py
@@ -0,0 +1,27 @@
+from opencompass.models.turbomind import TurboMindModel
+
+
+models = [
+    dict(
+        type=TurboMindModel,
+        abbr="internlm2-20b-turbomind",
+        path="internlm/internlm2-20b",
+        engine_config=dict(
+            session_len=32768,
+            max_batch_size=32,
+            model_name="internlm2-20b",
+            tp=2,
+        ),
+        gen_config=dict(
+            top_k=1,
+            top_p=0.8,
+            temperature=1.0,
+            max_new_tokens=2000,
+        ),
+        max_out_len=2000,
+        max_seq_len=32768,
+        batch_size=32,
+        concurrency=8,
+        run_cfg=dict(num_gpus=2, num_procs=1),
+    )
+]
--- a/configs/models/hf_internlm/lmdeploy_internlm2_chat_20b.py
+++ b/configs/models/hf_internlm/lmdeploy_internlm2_chat_20b.py
@@ -15,9 +15,8 @@ models = [
        path="internlm/internlm2-chat-20b",
        meta_template=_meta_template,
        engine_config=dict(
-            session_len=210000,
-            max_batch_size=8,
-            rope_scaling_factor=3.0,
+            session_len=32768,
+            max_batch_size=32,
            model_name="internlm2-chat-20b",
            tp=2,
            stop_words=[2, 92542],
@@ -29,8 +28,8 @@ models = [
            max_new_tokens=2000,
        ),
        max_out_len=2000,
-        max_seq_len=210000,
-        batch_size=1,
+        max_seq_len=32768,
+        batch_size=32,
        concurrency=8,
        run_cfg=dict(num_gpus=2, num_procs=1),
    )
--- a/configs/models/hf_internlm/lmdeploy_internlm2_chat_7b.py
+++ b/configs/models/hf_internlm/lmdeploy_internlm2_chat_7b.py
@@ -15,9 +15,8 @@ models = [
        path="internlm/internlm2-chat-7b",
        meta_template=_meta_template,
        engine_config=dict(
-            session_len=210000,
-            max_batch_size=8,
-            rope_scaling_factor=2.0,
+            session_len=32768,
+            max_batch_size=32,
            model_name="internlm2-chat-7b",
            tp=1,
            stop_words=[2, 92542],
@@ -29,8 +28,8 @@ models = [
            max_new_tokens=2000,
        ),
        max_out_len=2000,
-        max_seq_len=210000,
-        batch_size=1,
+        max_seq_len=32768,
+        batch_size=32,
        concurrency=8,
        run_cfg=dict(num_gpus=1, num_procs=1),
    )
--- a/configs/models/hf_llama/hf_llama2_13b.py
+++ b/configs/models/hf_llama/hf_llama2_13b.py
@@ -1,21 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
-
+from opencompass.models import HuggingFaceBaseModel

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFaceBaseModel,
        abbr='llama-2-13b-hf',
-        path="meta-llama/Llama-2-13b-hf",
-        tokenizer_path='meta-llama/Llama-2-13b-hf',
-        tokenizer_kwargs=dict(padding_side='left',
-                              truncation_side='left',
-                              use_fast=False,
-                              ),
-        max_out_len=100,
-        max_seq_len=2048,
+        path='meta-llama/Llama-2-13b-hf',
+        max_out_len=1024,
        batch_size=8,
-        model_kwargs=dict(device_map='auto'),
-        batch_padding=False, # if false, inference with for-loop without batch padding
-        run_cfg=dict(num_gpus=2, num_procs=1),
+        run_cfg=dict(num_gpus=1),
    )
 ]
--- a/configs/models/hf_llama/hf_llama2_13b_chat.py
+++ b/configs/models/hf_llama/hf_llama2_13b_chat.py
@@ -1,32 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
-
-_meta_template = dict(
-    round=[
-        dict(role="HUMAN", begin='[INST] ', end=' [/INST]'),
-        dict(role="BOT", begin=' ', end=' ', generate=True),
-    ],
-)
+from opencompass.models import HuggingFacewithChatTemplate

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFacewithChatTemplate,
        abbr='llama-2-13b-chat-hf',
-        path="meta-llama/Llama-2-13b-chat-hf",
-        tokenizer_path='meta-llama/Llama-2-13b-chat-hf',
-        model_kwargs=dict(
-            device_map='auto'
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            use_fast=False,
-        ),
-        meta_template=_meta_template,
-        max_out_len=100,
-        max_seq_len=2048,
+        path='meta-llama/Llama-2-13b-chat-hf',
+        max_out_len=1024,
        batch_size=8,
-        run_cfg=dict(num_gpus=2, num_procs=1),
-        end_str='[INST]',
-        batch_padding=True,
+        run_cfg=dict(num_gpus=1),
    )
 ]
--- a/configs/models/hf_llama/hf_llama2_70b.py
+++ b/configs/models/hf_llama/hf_llama2_70b.py
@@ -1,21 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
-
+from opencompass.models import HuggingFaceBaseModel

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFaceBaseModel,
        abbr='llama-2-70b-hf',
-        path="meta-llama/Llama-2-70b-hf",
-        tokenizer_path='meta-llama/Llama-2-70b-hf',
-        tokenizer_kwargs=dict(padding_side='left',
-                              truncation_side='left',
-                              use_fast=False,
-                              ),
-        max_out_len=100,
-        max_seq_len=2048,
+        path='meta-llama/Llama-2-70b-hf',
+        max_out_len=1024,
        batch_size=8,
-        model_kwargs=dict(device_map='auto'),
-        batch_padding=False, # if false, inference with for-loop without batch padding
-        run_cfg=dict(num_gpus=4, num_procs=1),
+        run_cfg=dict(num_gpus=4),
    )
 ]
--- a/configs/models/hf_llama/hf_llama2_70b_chat.py
+++ b/configs/models/hf_llama/hf_llama2_70b_chat.py
@@ -1,32 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
-
-_meta_template = dict(
-    round=[
-        dict(role="HUMAN", begin='[INST] ', end=' [/INST]'),
-        dict(role="BOT", begin=' ', end=' ', generate=True),
-    ],
-)
+from opencompass.models import HuggingFacewithChatTemplate

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFacewithChatTemplate,
        abbr='llama-2-70b-chat-hf',
-        path="meta-llama/Llama-2-70b-chat-hf",
-        tokenizer_path='meta-llama/Llama-2-70b-chat-hf',
-        model_kwargs=dict(
-            device_map='auto'
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            use_fast=False,
-        ),
-        meta_template=_meta_template,
-        max_out_len=100,
-        max_seq_len=2048,
+        path='meta-llama/Llama-2-70b-chat-hf',
+        max_out_len=1024,
        batch_size=8,
-        run_cfg=dict(num_gpus=4, num_procs=1),
-        end_str='[INST]',
-        batch_padding=True,
+        run_cfg=dict(num_gpus=4),
    )
 ]
--- a/configs/models/hf_llama/hf_llama2_7b.py
+++ b/configs/models/hf_llama/hf_llama2_7b.py
@@ -1,21 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
-
+from opencompass.models import HuggingFaceBaseModel

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFaceBaseModel,
        abbr='llama-2-7b-hf',
-        path="meta-llama/Llama-2-7b-hf",
-        tokenizer_path='meta-llama/Llama-2-7b-hf',
-        tokenizer_kwargs=dict(padding_side='left',
-                              truncation_side='left',
-                              use_fast=False,
-                              ),
-        max_out_len=100,
-        max_seq_len=2048,
+        path='meta-llama/Llama-2-7b-hf',
+        max_out_len=1024,
        batch_size=8,
-        model_kwargs=dict(device_map='auto'),
-        batch_padding=False, # if false, inference with for-loop without batch padding
-        run_cfg=dict(num_gpus=1, num_procs=1),
+        run_cfg=dict(num_gpus=1),
    )
 ]
--- a/configs/models/hf_llama/hf_llama2_7b_chat.py
+++ b/configs/models/hf_llama/hf_llama2_7b_chat.py
@@ -1,32 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
-
-_meta_template = dict(
-    round=[
-        dict(role="HUMAN", begin='[INST] ', end=' [/INST]'),
-        dict(role="BOT", begin=' ', end=' ', generate=True),
-    ],
-)
+from opencompass.models import HuggingFacewithChatTemplate

 models = [
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFacewithChatTemplate,
        abbr='llama-2-7b-chat-hf',
-        path="meta-llama/Llama-2-7b-chat-hf",
-        tokenizer_path='meta-llama/Llama-2-7b-chat-hf',
-        model_kwargs=dict(
-            device_map='auto'
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            use_fast=False,
-        ),
-        meta_template=_meta_template,
-        max_out_len=100,
-        max_seq_len=2048,
+        path='meta-llama/Llama-2-7b-chat-hf',
+        max_out_len=1024,
        batch_size=8,
-        run_cfg=dict(num_gpus=1, num_procs=1),
-        end_str='[INST]',
-        batch_padding=True,
+        run_cfg=dict(num_gpus=1),
    )
 ]
--- a/configs/models/hf_llama/hf_llama3_70b.py
+++ b/configs/models/hf_llama/hf_llama3_70b.py
@@ -1,21 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
-
+from opencompass.models import HuggingFaceBaseModel

 models = [
    dict(
-        type=HuggingFaceCausalLM,
-        abbr="llama-3-70b-hf",
-        path="meta-llama/Meta-Llama-3-70B",
-        model_kwargs=dict(device_map="auto"),
-        tokenizer_kwargs=dict(
-            padding_side="left",
-            truncation_side="left",
-            use_fast=False,
-        ),
-        max_out_len=100,
-        max_seq_len=2048,
+        type=HuggingFaceBaseModel,
+        abbr='llama-3-70b-hf',
+        path='meta-llama/Meta-Llama-3-70B',
+        max_out_len=1024,
        batch_size=8,
-        batch_padding=True,
-        run_cfg=dict(num_gpus=4, num_procs=1),
+        run_cfg=dict(num_gpus=1),
    )
 ]
--- a/configs/models/hf_llama/hf_llama3_70b_instruct.py
+++ b/configs/models/hf_llama/hf_llama3_70b_instruct.py
@@ -1,29 +1,13 @@
-from opencompass.models import HuggingFaceCausalLM
-
-_meta_template = dict(
-    round=[
-        dict(role="HUMAN", begin="<|start_header_id|>user<|end_header_id|>\n\n", end="<|eot_id|>"),
-        dict(role="BOT", begin="<|start_header_id|>assistant<|end_header_id|>\n\n", end="<|eot_id|>", generate=True),
-    ],
-)
+from opencompass.models import HuggingFacewithChatTemplate

 models = [
    dict(
-        type=HuggingFaceCausalLM,
-        abbr="llama-3-70b-instruct-hf",
-        path="meta-llama/Meta-Llama-3-70B-Instruct",
-        model_kwargs=dict(device_map="auto"),
-        tokenizer_kwargs=dict(
-            padding_side="left",
-            truncation_side="left",
-            use_fast=False,
-        ),
-        meta_template=_meta_template,
-        max_out_len=100,
-        max_seq_len=2048,
+        type=HuggingFacewithChatTemplate,
+        abbr='llama-3-70b-instruct-hf',
+        path='meta-llama/Meta-Llama-3-70B-Instruct',
+        max_out_len=1024,
        batch_size=8,
-        run_cfg=dict(num_gpus=4, num_procs=1),
-        generation_kwargs={"eos_token_id": [128001, 128009]},
-        batch_padding=True,
+        run_cfg=dict(num_gpus=4),
+        stop_words=['<|end_of_text|>', '<|eot_id|>'],
    )
 ]
--- a/configs/models/hf_llama/hf_llama3_8b.py
+++ b/configs/models/hf_llama/hf_llama3_8b.py
@@ -1,21 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
-
+from opencompass.models import HuggingFaceBaseModel

 models = [
    dict(
-        type=HuggingFaceCausalLM,
-        abbr="llama-3-8b-hf",
-        path="meta-llama/Meta-Llama-3-8B",
-        model_kwargs=dict(device_map="auto"),
-        tokenizer_kwargs=dict(
-            padding_side="left",
-            truncation_side="left",
-            use_fast=False,
-        ),
-        max_out_len=100,
-        max_seq_len=2048,
+        type=HuggingFaceBaseModel,
+        abbr='llama-3-8b-hf',
+        path='meta-llama/Meta-Llama-3-8B',
+        max_out_len=1024,
        batch_size=8,
-        batch_padding=True,
-        run_cfg=dict(num_gpus=1, num_procs=1),
+        run_cfg=dict(num_gpus=1),
    )
 ]
--- a/configs/models/hf_llama/hf_llama3_8b_instruct.py
+++ b/configs/models/hf_llama/hf_llama3_8b_instruct.py
@@ -1,29 +1,13 @@
-from opencompass.models import HuggingFaceCausalLM
-
-_meta_template = dict(
-    round=[
-        dict(role="HUMAN", begin="<|start_header_id|>user<|end_header_id|>\n\n", end="<|eot_id|>"),
-        dict(role="BOT", begin="<|start_header_id|>assistant<|end_header_id|>\n\n", end="<|eot_id|>", generate=True),
-    ],
-)
+from opencompass.models import HuggingFacewithChatTemplate

 models = [
    dict(
-        type=HuggingFaceCausalLM,
-        abbr="llama-3-8b-instruct-hf",
-        path="meta-llama/Meta-Llama-3-8B-Instruct",
-        model_kwargs=dict(device_map="auto"),
-        tokenizer_kwargs=dict(
-            padding_side="left",
-            truncation_side="left",
-            use_fast=False,
-        ),
-        meta_template=_meta_template,
-        max_out_len=100,
-        max_seq_len=2048,
+        type=HuggingFacewithChatTemplate,
+        abbr='llama-3-8b-instruct-hf',
+        path='meta-llama/Meta-Llama-3-8B-Instruct',
+        max_out_len=1024,
        batch_size=8,
-        run_cfg=dict(num_gpus=1, num_procs=1),
-        generation_kwargs={"eos_token_id": [128001, 128009]},
-        batch_padding=True,
+        run_cfg=dict(num_gpus=1),
+        stop_words=['<|end_of_text|>', '<|eot_id|>'],
    )
 ]
--- a/configs/models/hf_llama/hf_llama_13b.py
+++ b/configs/models/hf_llama/hf_llama_13b.py
@@ -1,21 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
+from opencompass.models import HuggingFaceBaseModel

 models = [
-    # LLaMA 13B
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFaceBaseModel,
        abbr='llama-13b-hf',
-        path="huggyllama/llama-13b",
-        tokenizer_path='huggyllama/llama-13b',
-        tokenizer_kwargs=dict(padding_side='left',
-                              truncation_side='left',
-                              use_fast=False,
-                              ),
-        max_out_len=100,
-        max_seq_len=2048,
+        path='huggyllama/llama-13b',
+        max_out_len=1024,
        batch_size=8,
-        model_kwargs=dict(device_map='auto'),
-        batch_padding=False, # if false, inference with for-loop without batch padding
-        run_cfg=dict(num_gpus=2, num_procs=1),
+        run_cfg=dict(num_gpus=1),
    )
 ]
--- a/configs/models/hf_llama/hf_llama_30b.py
+++ b/configs/models/hf_llama/hf_llama_30b.py
@ -1,21 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
+from opencompass.models import HuggingFaceBaseModel

 models = [
-    # LLaMA 30B
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFaceBaseModel,
        abbr='llama-30b-hf',
-        path="huggyllama/llama-30b",
-        tokenizer_path='huggyllama/llama-30b',
-        tokenizer_kwargs=dict(padding_side='left',
-                              truncation_side='left',
-                              use_fast=False,
-                              ),
-        max_out_len=100,
-        max_seq_len=2048,
+        path='huggyllama/llama-30b',
+        max_out_len=1024,
        batch_size=8,
-        model_kwargs=dict(device_map='auto'),
-        batch_padding=False, # if false, inference with for-loop without batch padding
-        run_cfg=dict(num_gpus=4, num_procs=1),
+        run_cfg=dict(num_gpus=2),
    )
 ]
--- a/configs/models/hf_llama/hf_llama_65b.py
+++ b/configs/models/hf_llama/hf_llama_65b.py
@ -1,21 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
+from opencompass.models import HuggingFaceBaseModel

 models = [
-    # LLaMA 65B
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFaceBaseModel,
        abbr='llama-65b-hf',
-        path="huggyllama/llama-65b",
-        tokenizer_path='huggyllama/llama-65b',
-        tokenizer_kwargs=dict(padding_side='left',
-                              truncation_side='left',
-                              use_fast=False,
-                              ),
-        max_out_len=100,
-        max_seq_len=2048,
+        path='huggyllama/llama-65b',
+        max_out_len=1024,
        batch_size=8,
-        model_kwargs=dict(device_map='auto'),
-        batch_padding=False, # if false, inference with for-loop without batch padding
-        run_cfg=dict(num_gpus=4, num_procs=1),
+        run_cfg=dict(num_gpus=4),
    )
 ]
--- a/configs/models/hf_llama/hf_llama_7b.py
+++ b/configs/models/hf_llama/hf_llama_7b.py
@ -1,21 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
+from opencompass.models import HuggingFaceBaseModel

 models = [
-    # LLaMA 7B
    dict(
-        type=HuggingFaceCausalLM,
+        type=HuggingFaceBaseModel,
        abbr='llama-7b-hf',
-        path="huggyllama/llama-7b",
-        tokenizer_path='huggyllama/llama-7b',
-        tokenizer_kwargs=dict(padding_side='left',
-                              truncation_side='left',
-                              use_fast=False,
-                              ),
-        max_out_len=100,
-        max_seq_len=2048,
+        path='huggyllama/llama-7b',
+        max_out_len=1024,
        batch_size=8,
-        model_kwargs=dict(device_map='auto'),
-        batch_padding=False, # if false, inference with for-loop without batch padding
-        run_cfg=dict(num_gpus=1, num_procs=1),
+        run_cfg=dict(num_gpus=1),
    )
 ]
--- a/configs/models/hf_llama/lmdeploy_llama3_70b_instruct.py
+++ b/configs/models/hf_llama/lmdeploy_llama3_70b_instruct.py
@ -0,0 +1,24 @@
+from opencompass.models import TurboMindModel
+
+_meta_template = dict(
+    round=[
+        dict(role="HUMAN", begin='<|start_header_id|>user<|end_header_id|>\n\n', end='<|eot_id|>'),
+        dict(role="BOT", begin='<|start_header_id|>assistant<|end_header_id|>\n\n', end='<|eot_id|>', generate=True),
+    ],
+)
+
+models = [
+    dict(
+        type=TurboMindModel,
+        abbr='llama-3-70b-instruct-lmdeploy',
+        path='meta-llama/Meta-Llama-3-70B-Instruct',
+        engine_config=dict(session_len=4096, max_batch_size=16, tp=4),
+        gen_config=dict(top_k=1, temperature=1, top_p=0.9, max_new_tokens=1024, stop_words=[128001, 128009]),
+        max_out_len=1024,
+        max_seq_len=4096,
+        batch_size=16,
+        concurrency=16,
+        meta_template=_meta_template,
+        run_cfg=dict(num_gpus=4),
+    )
+]
--- a/configs/models/hf_llama/lmdeploy_llama3_8b_instruct.py
+++ b/configs/models/hf_llama/lmdeploy_llama3_8b_instruct.py
@ -0,0 +1,24 @@
+from opencompass.models import TurboMindModel
+
+_meta_template = dict(
+    round=[
+        dict(role="HUMAN", begin='<|start_header_id|>user<|end_header_id|>\n\n', end='<|eot_id|>'),
+        dict(role="BOT", begin='<|start_header_id|>assistant<|end_header_id|>\n\n', end='<|eot_id|>', generate=True),
+    ],
+)
+
+models = [
+    dict(
+        type=TurboMindModel,
+        abbr='llama-3-8b-instruct-lmdeploy',
+        path='meta-llama/Meta-Llama-3-8B-Instruct',
+        engine_config=dict(session_len=4096, max_batch_size=16, tp=1),
+        gen_config=dict(top_k=1, temperature=1, top_p=0.9, max_new_tokens=1024, stop_words=[128001, 128009]),
+        max_out_len=1024,
+        max_seq_len=4096,
+        batch_size=16,
+        concurrency=16,
+        meta_template=_meta_template,
+        run_cfg=dict(num_gpus=1),
+    )
+]
--- a/configs/models/mistral/hf_mistral_7b_instruct_v0_1.py
+++ b/configs/models/mistral/hf_mistral_7b_instruct_v0_1.py
@ -1,34 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
-
-
-_meta_template = dict(
-    begin="<s>",
-    round=[
-        dict(role="HUMAN", begin='[INST] ', end=' [/INST]'),
-        dict(role="BOT", begin="", end='</s> ', generate=True),
-    ],
-)
+from opencompass.models import HuggingFacewithChatTemplate

 models = [
    dict(
+        type=HuggingFacewithChatTemplate,
        abbr='mistral-7b-instruct-v0.1-hf',
-        type=HuggingFaceCausalLM,
        path='mistralai/Mistral-7B-Instruct-v0.1',
-        tokenizer_path='mistralai/Mistral-7B-Instruct-v0.1',
-        model_kwargs=dict(
-            device_map='auto',
-            trust_remote_code=True,
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-        ),
-        meta_template=_meta_template,
-        max_out_len=100,
-        max_seq_len=2048,
+        max_out_len=1024,
        batch_size=8,
-        run_cfg=dict(num_gpus=1, num_procs=1),
-        batch_padding=True,
+        run_cfg=dict(num_gpus=1),
    )
 ]
--- a/configs/models/mistral/hf_mistral_7b_instruct_v0_2.py
+++ b/configs/models/mistral/hf_mistral_7b_instruct_v0_2.py
@ -1,34 +1,12 @@
-from opencompass.models import HuggingFaceCausalLM
-
-
-_meta_template = dict(
-    begin="<s>",
-    round=[
-        dict(role="HUMAN", begin='[INST] ', end=' [/INST]'),
-        dict(role="BOT", begin="", end='</s> ', generate=True),
-    ],
-)
+from opencompass.models import HuggingFacewithChatTemplate

 models = [
    dict(
+        type=HuggingFacewithChatTemplate,
        abbr='mistral-7b-instruct-v0.2-hf',
-        type=HuggingFaceCausalLM,
        path='mistralai/Mistral-7B-Instruct-v0.2',
-        tokenizer_path='mistralai/Mistral-7B-Instruct-v0.2',
-        model_kwargs=dict(
-            device_map='auto',
-            trust_remote_code=True,
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-        ),
-        meta_template=_meta_template,
-        max_out_len=100,
-        max_seq_len=2048,
+        max_out_len=1024,
        batch_size=8,
-        run_cfg=dict(num_gpus=1, num_procs=1),
-        batch_padding=True,
+        run_cfg=dict(num_gpus=1),
    )
 ]
--- a/configs/models/mistral/hf_mistral_7b_v0_1.py
+++ b/configs/models/mistral/hf_mistral_7b_v0_1.py
@ -1,24 +1,13 @@
-from opencompass.models import HuggingFaceCausalLM
+from opencompass.models import HuggingFaceBaseModel


 models = [
    dict(
+        type=HuggingFaceBaseModel,
        abbr='mistral-7b-v0.1-hf',
-        type=HuggingFaceCausalLM,
        path='mistralai/Mistral-7B-v0.1',
-        tokenizer_path='mistralai/Mistral-7B-v0.1',
-        model_kwargs=dict(
-            device_map='auto',
-            trust_remote_code=True,
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-        ),
-        max_out_len=100,
-        max_seq_len=2048,
+        max_out_len=1024,
        batch_size=8,
-        run_cfg=dict(num_gpus=1, num_procs=1),
+        run_cfg=dict(num_gpus=1),
    )
 ]
--- a/configs/models/mistral/hf_mistral_7b_v0_2.py
+++ b/configs/models/mistral/hf_mistral_7b_v0_2.py
@ -1,23 +1,13 @@
-from opencompass.models import HuggingFaceCausalLM
+from opencompass.models import HuggingFaceBaseModel


 models = [
    dict(
+        type=HuggingFaceBaseModel,
        abbr='mistral-7b-v0.2-hf',
-        type=HuggingFaceCausalLM,
-        path='alpindale/Mistral-7B-v0.2-hf',
-        model_kwargs=dict(
-            device_map='auto',
-            trust_remote_code=True,
-        ),
-        tokenizer_kwargs=dict(
-            padding_side='left',
-            truncation_side='left',
-            trust_remote_code=True,
-        ),
-        max_out_len=100,
-        max_seq_len=2048,
+        path='mistral-community/Mistral-7B-v0.2',
+        max_out_len=1024,
        batch_size=8,
-        run_cfg=dict(num_gpus=1, num_procs=1),
+        run_cfg=dict(num_gpus=1),
    )
 ]
--- a/configs/models/mistral/hf_mixtral_8x22b_instruct_v0_1.py
+++ b/configs/models/mistral/hf_mixtral_8x22b_instruct_v0_1.py
@ -0,0 +1,12 @@
+from opencompass.models import HuggingFacewithChatTemplate
+
+models = [
+    dict(
+        type=HuggingFacewithChatTemplate,
+        abbr='mixtral-8x22b-instruct-v0.1-hf',
+        path='mistralai/Mixtral-8x22B-Instruct-v0.1',
+        max_out_len=1024,
+        batch_size=4,
+        run_cfg=dict(num_gpus=8),
+    )
+]
--- a/configs/models/mistral/hf_mixtral_8x22b_v0_1.py
+++ b/configs/models/mistral/hf_mixtral_8x22b_v0_1.py
@ -0,0 +1,12 @@
+from opencompass.models import HuggingFaceBaseModel
+
+models = [
+    dict(
+        type=HuggingFaceBaseModel,
+        abbr='mixtral-8x22b-v0.1-hf',
+        path='mistralai/Mixtral-8x22B-v0.1',
+        max_out_len=1024,
+        batch_size=4,
+        run_cfg=dict(num_gpus=8),
+    )
+]
--- a/configs/models/mistral/hf_mixtral_8x7b_instruct_v0_1.py
+++ b/configs/models/mistral/hf_mixtral_8x7b_instruct_v0_1.py
@ -0,0 +1,12 @@
+from opencompass.models import HuggingFacewithChatTemplate
+
+models = [
+    dict(
+        type=HuggingFacewithChatTemplate,
+        abbr='mixtral-8x7b-instruct-v0.1-hf',
+        path='mistralai/Mixtral-8x7B-Instruct-v0.1',
+        max_out_len=1024,
+        batch_size=8,
+        run_cfg=dict(num_gpus=4),
+    )
+]