OpenCompass/configs
Chang Lan a927bba1cf
[Fix] Fix RULER datasets (#1628)
Configs must not import any name that ends with "_datasets":
such names are picked up by the runner, so duplicate or unwanted datasets
end up being evaluated.
2024-10-22 11:59:02 +08:00
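
The suffix rule described in the commit message above can be illustrated with a minimal sketch. It assumes, without this listing confirming it, that datasets are gathered by scanning a config's namespace for names ending in `_datasets`. The helper `collect_datasets` and the sample entries are hypothetical and are not part of the OpenCompass API.

```python
def collect_datasets(namespace):
    """Gather every list bound to a name that ends with '_datasets'.

    Assumption: this mimics a suffix-based collection step; the real
    runner's logic is not shown in this listing.
    """
    collected = []
    for name, value in namespace.items():
        if name.endswith('_datasets') and isinstance(value, list):
            collected.extend(value)
    return collected


# Hypothetical config namespace where a helper import kept the suffix,
# so the same dataset is collected twice.
leaky_config = {
    'ruler_4k_datasets': [{'abbr': 'ruler_niah_4k'}],
    'niah_datasets': [{'abbr': 'ruler_niah_4k'}],
}

# Same content, but the helper was imported under a name without the suffix.
clean_config = {
    'ruler_4k_datasets': [{'abbr': 'ruler_niah_4k'}],
    'niah': [{'abbr': 'ruler_niah_4k'}],
}

print(len(collect_datasets(leaky_config)))
print(len(collect_datasets(clean_config)))
```

Under this assumption, renaming or aliasing an intermediate import so that it no longer ends with `_datasets` is enough to keep it out of the evaluated set.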
api_examples [Feature] Update CoreBench 2.0 (#1566) 2024-09-26 18:44:00 +08:00
dataset_collections [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
datasets [Fix] Fix RULER datasets (#1628) 2024-10-22 11:59:02 +08:00
models [Feature] Support LiveCodeBench (#1617) 2024-10-21 20:50:39 +08:00
summarizers [BUG] Update CIbench config (#1544) 2024-09-23 18:32:27 +08:00
eval_academic_leaderboard_202407.py [Feature] Update Lint and Leaderboard (#1458) 2024-08-28 22:36:42 +08:00
eval_alaya.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_api_demo.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_attack.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_base_demo.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_bluelm_32k_lveval.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_charm_mem.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_charm_rea.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_chat_agent_baseline.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_chat_agent.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_chat_demo.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_chat_last.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_chembench.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_cibench_api.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_cibench.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_circular.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_claude.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_code_passk_repeat_dataset.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_code_passk.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_codeagent.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_codegeex2.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_contamination.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_corebench_2409_base_objective.py [Feature] Update BailingLM/OpenAI verbose (#1568) 2024-09-27 11:15:25 +08:00
eval_corebench_2409_chat_objective.py [Feature] Update CoreBench 2.0 (#1566) 2024-09-26 18:44:00 +08:00
eval_corebench_2409_longcontext.py [Feature] Add Config for CoreBench (#1547) 2024-09-25 11:36:43 +08:00
eval_corebench_2409_subjective.py [Feature] Add Config for CoreBench (#1547) 2024-09-25 11:36:43 +08:00
eval_dingo.py [Feature] Add dingo test (#1529) 2024-09-29 19:24:58 +08:00
eval_ds1000_interpreter.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_edgellm_demo.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_gpt3.5.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_gpt4.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_hellobench.py Upload HelloBench (#1607) 2024-10-15 17:11:37 +08:00
eval_hf_llama2.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_hf_llama_7b.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_inference_ppl.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_internlm2_chat_keyset.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_internlm2_keyset.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_internlm_7b.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_internlm_chat_lmdeploy_apiserver.py [Feature] Add an attribute api_key into TurboMindAPIModel default None (#1475) 2024-09-05 17:51:16 +08:00
eval_internlm_chat_turbomind.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_internlm_flames_chat.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_internlm_lmdeploy_apiserver.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_internlm_math_chat.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_internlm_turbomind.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_internLM.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_judgerbench.py [Update] eval_judgerbench.py (#1625) 2024-10-21 15:30:29 +08:00
eval_lightllm.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_llama2_7b_lveval.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_llama2_7b.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_llama3_instruct.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_llm_compression.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_lmdeploy_demo.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_math_llm_judge.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_mathbench.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_mmlu_pro.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_mmlu_with_zero_retriever_overwritten.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_modelscope_datasets.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_multi_prompt_demo.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_needlebench.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_qwen_7b_chat_lawbench.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_qwen_7b_chat.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_qwen_7b.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_ruler_fix_tokenizer.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_ruler.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_rwkv5_3b.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_subjective_alpacaeval_official.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_subjective.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_teval.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_TheoremQA.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00
eval_with_model_dataset_combinations.py [Doc] Update Readme (#1439) 2024-08-22 14:48:45 +08:00