OpenCompass/configs/datasets/MMLUArabic

MMLUArabic

Dataset Description

MMLUArabic is a benchmark for assessing knowledge in Arabic. It consists of multiple-choice questions covering a wide range of subjects.

How to Use

Download the dataset file from the link, then load the validation and test splits:

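from datasets import load_dataset  # Hugging Face `datasets` package

# header=None: the data files are read without a header row
val_ds = load_dataset("MMLUArabic", header=None)['validation']
test_ds = load_dataset("MMLUArabic", header=None)['test']
# Column order: input, option_a, option_b, option_c, option_d, target
print(next(iter(val_ds)))  # show the first validation example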
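
The six-field row layout noted above (input, four options, target) can be turned into a multiple-choice prompt along the following lines. This is only a sketch: the helper name `format_question`, the A-D labels, and the `Answer:` suffix are illustrative assumptions, not the template used by the OpenCompass configs, and the sample row is a made-up placeholder rather than real dataset content.

```python
from typing import Sequence, Tuple

# Hypothetical option labels; the real target encoding in the dataset may differ.
OPTION_LABELS = ("A", "B", "C", "D")


def format_question(fields: Sequence[str]) -> Tuple[str, str]:
    """Turn one row (input, option_a..option_d, target) into (prompt, target)."""
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    # First field is the question, last is the target, the middle four are options.
    question, *options, target = fields
    lines = [question]
    lines += [f"{label}. {text}" for label, text in zip(OPTION_LABELS, options)]
    lines.append("Answer:")
    return "\n".join(lines), target


# Placeholder row for illustration only (not taken from MMLUArabic).
example = ("2 + 2 = ?", "3", "4", "5", "6", "B")
prompt, target = format_question(example)
print(prompt)
```

If the loaded rows are dictionaries keyed by column position, passing `list(row.values())` to the helper should preserve the column order, though that depends on how the files were loaded.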

Citation

@misc{huang2023acegpt,
      title={AceGPT, Localizing Large Language Models in Arabic},
      author={Huang Huang and Fei Yu and Jianqing Zhu and Xuening Sun and Hao Cheng and Dingjie Song and Zhihong Chen and Abdulmohsen Alharthi and Bang An and Ziche Liu and Zhiyi Zhang and Junying Chen and Jianquan Li and Benyou Wang and Lian Zhang and Ruoyu Sun and Xiang Wan and Haizhou Li and Jinchao Xu},
      year={2023},
      eprint={2309.12053},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}