mirror of
https://github.com/open-compass/opencompass.git
synced 2025-05-30 16:03:24 +08:00

* update requirement * update requirement * update with minimax * update api model * Update readme * fix error --------- Co-authored-by: zhangsongyang <zhangsongyang@pjlab.org.cn>
2.4 KiB
2.4 KiB
News
- [2023.08.25] TigerBot team adpots OpenCompass to evaluate their models systematically. We deeply appreciate the community's dedication to transparency and reproducibility in LLM evaluation.
- [2023.08.21] Lagent has been released, which is a lightweight framework for building LLM-based agents. We are working with Lagent team to support the evaluation of general tool-use capability, stay tuned!
- [2023.08.18] We have supported evaluation for multi-modality learning, include MMBench, SEED-Bench, COCO-Caption, Flickr-30K, OCR-VQA, ScienceQA and so on. Leaderboard is on the road. Feel free to try multi-modality evaluation with OpenCompass !
- [2023.08.18] Dataset card is now online. Welcome new evaluation benchmark OpenCompass !
- [2023.08.11] Model comparison is now online. We hope this feature offers deeper insights!
- [2023.08.11] We have supported LEval.
- [2023.08.10] OpenCompass is compatible with LMDeploy. Now you can follow this instruction to evaluate the accelerated models provide by the Turbomind.
- [2023.08.10] We have supported Qwen-7B and XVERSE-13B ! Go to our leaderboard for more results! More models are welcome to join OpenCompass.
- [2023.08.09] Several new datasets(CMMLU, TydiQA, SQuAD2.0, DROP) are updated on our leaderboard! More datasets are welcomed to join OpenCompass.
- [2023.08.07] We have added a script for users to evaluate the inference results of MMBench-dev.
- [2023.08.05] We have supported GPT-4! Go to our leaderboard for more results! More models are welcome to join OpenCompass.
- [2023.07.27] We have supported CMMLU! More datasets are welcome to join OpenCompass.