Commit Graph

  • 8e6d2ab7e6 feat(datasets): add MaritimeBench dataset and related configuration zhanghaoyu 2025-04-14 09:51:43 +0800
  • 8c5b04947f Merge branch 'main' of https://github.com/domonic18/opencompass Deadwalk 2025-04-14 10:09:39 +0800
  • 3a80592817 [Dataset]update GAIA prompt Deadwalk 2025-04-14 10:09:27 +0800
  • 6ce8643cfc Merge branch 'open-compass:main' into main Deadwalk 2025-04-14 08:13:48 +0800
  • a3fa2fb105 Merge b2c84058f2 into 6a6a1a5c0b Dongsheng Zhu 2025-04-12 14:15:23 +0800
  • 6d809bf36d [Dataset]Added local configuration to GAIA dataset Deadwalk 2025-04-12 13:45:54 +0800
  • 6a6a1a5c0b [Feature] LLM Judge sanity check (#2012) Linchen Xiao 2025-04-11 19:01:39 +0800
  • 4cac97ca94 update MaiziXiao 2025-04-11 09:56:51 +0000
  • 3f50b1dc49 [Fix] fix order bug Update arena_hard.py (#2015) bittersweet1999 2025-04-11 16:59:40 +0800
  • 032669c97a feat PHYSICS Myhs-phz 2025-04-11 03:57:30 +0000
  • 20660ab507 [Fix] Fix compare error when k is list in base_evaluator (#2010) Junnan Liu 2025-04-10 19:47:21 +0800
  • 390e33e51b feat ClimateQA Myhs-phz 2025-04-10 09:18:11 +0000
  • f764ee33a2 fix(datasets_info): update hf_id field for maritimebench zhanghaoyu 2025-04-10 16:10:04 +0800
  • a0bac8e604 feat(text_postprocessors): add parse_bracketed_answer function to extract answers enclosed in brackets zhanghaoyu 2025-04-10 16:04:42 +0800
  • 28b514cd3a fix compare error in 177 jnanliu 2025-04-10 07:03:53 +0000
  • 0394147f7d [fix] fix order bug Update arena_hard.py bittersweet1999 2025-04-10 14:41:28 +0800
  • b0bbd4a96a chore: fix line endings and formatting; add maritime_bench dataset zhanghaoyu 2025-04-10 13:40:28 +0800
  • 72b7caa575 Merge branch 'main' of https://github.com/domonic18/opencompass Deadwalk 2025-04-10 11:11:35 +0800
  • 7b626eff9e [Dataset] Add GAIA Deadwalk 2025-04-10 11:04:38 +0800
  • 975e4bcadf Merge branch 'open-compass:main' into main bittersweet1999 2025-04-09 16:26:07 +0800
  • 12213207b6 [Refactor] Refactorize openicl eval task (#1990) Linchen Xiao 2025-04-09 15:52:23 +0800
  • db716d7dfc update MaiziXiao 2025-04-09 07:13:37 +0000
  • 6ac9b06bc2 [ci] update baseline for kernal change of vllm and lmdeploy (#2011) zhulinJulia24 2025-04-09 14:09:35 +0800
  • ca43b9d4e9 update zhulinJulia24 2025-04-09 11:02:18 +0800
  • 403cbd9c6e update zhulinJulia24 2025-04-09 10:58:34 +0800
  • 7415a57ace update zhulinJulia24 2025-04-09 10:56:12 +0800
  • 2d21617fe8 update zhulinJulia24 2025-04-09 10:49:37 +0800
  • 019c61e9f5 update zhulinJulia24 2025-04-09 10:42:47 +0800
  • b3efc3f2df update zhulinJulia24 2025-04-08 20:53:19 +0800
  • ae8354cb8a update zhulinJulia24 2025-04-08 20:22:11 +0800
  • b63f998459 fix gpass compare error of list k jnanliu 2025-04-08 10:16:24 +0000
  • a05f9da134 [Feature] Make dump-eval-details default behavior (#1999) Linchen Xiao 2025-04-08 14:42:26 +0800
  • fd82bea747 [Fix] OpenICL Math Evaluator Config (#2007) Myhs_phz 2025-04-08 14:38:35 +0800
  • bb58cfc85d [Feature] Add CascadeEvaluator (#1992) Linchen Xiao 2025-04-08 11:58:14 +0800
  • a4fa0f7435 updat MaiziXiao 2025-04-08 03:34:45 +0000
  • 25d8c6c6d7 update MaiziXiao 2025-04-08 03:27:22 +0000
  • b564e608b1 [Dataset] Add MedXpertQA (#2002) Jin Ye 2025-04-08 12:44:48 +1000
  • 1c72b95b9b update MaiziXiao 2025-04-08 02:44:05 +0000
  • 4b187eb066 Fix lint MaiziXiao 2025-04-08 02:26:23 +0000
  • 34e11d136f Add MedXpertQA Yejin0111 2025-04-07 10:36:32 +0000
  • 828fb745c9 [Dataset] Update dingo 1.5.0 (#2008) shijinpjlab 2025-04-07 17:21:15 +0800
  • 3bc6b5d3c8 fix Myhs-phz 2025-04-07 09:00:21 +0000
  • 8c8429b069 fix Myhs-phz 2025-04-07 08:59:31 +0000
  • a23f95e47b fix Myhs-phz 2025-04-07 08:56:57 +0000
  • 802d365a7b fix Myhs-phz 2025-04-07 08:53:23 +0000
  • 52df98014e fix recommended Myhs-phz 2025-04-07 08:52:07 +0000
  • 29b18d527b update dingo 1.5.0 shiin 2025-04-07 15:28:02 +0800
  • a84d8d693e Add MedXpertQA Yejin0111 2025-04-07 04:06:02 +0000
  • 01b0ee6ee6 fix Myhs-phz 2025-04-07 02:36:51 +0000
  • b2c84058f2 examples update Dongsheng Zhu 2025-04-06 04:25:07 +0000
  • f568550633 code update Dongsheng Zhu 2025-04-05 04:44:42 +0000
  • d251524efc Merge remote-tracking branch 'origin' into code_update Dongsheng Zhu 2025-04-05 02:29:09 +0000
  • 659b00acd0 Add MedXpertQA Yejin0111 2025-04-03 13:44:46 +0000
  • a997e6532f Merge remote-tracking branch 'upstream/main' into openicl_eval_refactorize MaiziXiao 2025-04-03 11:54:34 +0000
  • f982d6278e [CI] fix baseline score (#2000) zhulinJulia24 2025-04-03 19:32:36 +0800
  • 3a9a384173 [Doc] Fix links between zh & en (#2001) Myhs_phz 2025-04-03 17:37:53 +0800
  • 8fa31f184c update zhulinJulia24 2025-04-03 17:35:06 +0800
  • 498d9b08d1 test Myhs-phz 2025-04-03 09:34:48 +0000
  • 5df0e41823 test Myhs-phz 2025-04-03 09:31:54 +0000
  • 63b0b26805 test Myhs-phz 2025-04-03 09:20:59 +0000
  • b9461b1d37 update zhulinJulia24 2025-04-03 16:55:01 +0800
  • 035a7cee0e update zhulinJulia24 2025-04-03 16:32:14 +0800
  • 9b489e9ea0 [Update] Revert math500 dataset configs (#1998) Myhs_phz 2025-04-03 15:11:02 +0800
  • 141b0d08c1 Update MaiziXiao 2025-04-03 07:10:17 +0000
  • 9d63fdd616 update zhulinJulia24 2025-04-03 15:06:01 +0800
  • d37527f54b fix Myhs-phz 2025-04-03 06:55:58 +0000
  • 7157b8911d update zhulinJulia24 2025-04-03 14:38:59 +0800
  • ba99868c77 update zhulinJulia24 2025-04-03 14:32:49 +0800
  • e3c2521df5 update zhulinJulia24 2025-04-03 13:47:45 +0800
  • a668f19555 update MaiziXiao 2025-04-03 04:08:27 +0000
  • 7b2bee4292 update zhulinJulia24 2025-04-03 10:17:56 +0800
  • 0585f9dad2 updaste zhulinJulia24 2025-04-03 10:14:29 +0800
  • b87052718e updaste zhulinJulia24 2025-04-03 10:06:04 +0800
  • c2cc5f7054 update zhulinJulia24 2025-04-03 09:57:01 +0800
  • 69082bafb8 update zhulinJulia24 2025-04-03 09:35:07 +0800
  • e8394a02f6 Merge branch 'open-compass:main' into fix_baseline_score_new zhulinJulia24 2025-04-03 09:32:29 +0800
  • 7355696492 updaste zhulinJulia24 2025-04-03 09:27:26 +0800
  • 7067211d62 update zhulinJulia24 2025-04-03 09:08:25 +0800
  • 60f8b4cc97 update zhulinJulia24 2025-04-02 20:51:53 +0800
  • b40adfbffe update zhulinJulia24 2025-04-02 20:48:11 +0800
  • 14a74d71bd update zhulinJulia24 2025-04-02 20:26:24 +0800
  • e263f3df8d update zhulinJulia24 2025-04-02 20:10:33 +0800
  • 780bc1dd1e update zhulinJulia24 2025-04-02 17:50:19 +0800
  • dc8deb6af0 [BUMP] Bump version to 0.4.2 (#1997) 0.4.2 Linchen Xiao 2025-04-02 17:47:15 +0800
  • 32d6859679 [Feature] Add olymmath dataset (#1982) liushz 2025-04-02 17:34:07 +0800
  • cc87d05f83 Update olymmath dataset liushz 2025-04-02 09:23:46 +0000
  • 4601839ba8 Merge branch 'main' of github.com:open-compass/opencompass into olymmath liushz 2025-04-02 09:17:52 +0000
  • 0720e1c440 [BUMP] Bump version to 0.4.2 MaiziXiao 2025-04-02 08:54:37 +0000
  • f8a60d36f4 update zhulinJulia24 2025-04-02 16:11:39 +0800
  • 97236c8e97 [CI] Fix baseline score (#1996) zhulinJulia24 2025-04-02 14:25:16 +0800
  • 2f4dbfd1fe update zhulinJulia24 2025-04-02 13:48:46 +0800
  • f66b0b347a [Update] Requirements update (#1993) Linchen Xiao 2025-04-02 12:03:45 +0800
  • e64ca780fa update MaiziXiao 2025-04-02 03:35:25 +0000
  • cf6084bb77 [Update] Minor updates MaiziXiao 2025-04-01 11:24:02 +0000
  • 6f36b1a76c [Feature] Add CascadeEvaluator MaiziXiao 2025-04-01 10:41:03 +0000
  • ca8c5658e6 intervl adjustment Dongsheng Zhu 2025-04-01 06:32:23 +0000
  • a603625f19 update zhulinJulia24 2025-04-01 13:30:45 +0800
  • ac70025cde update zhulinJulia24 2025-04-01 13:23:40 +0800
  • b42b83ac60 update zhulinJulia24 2025-04-01 13:05:24 +0800
  • 330a6e5ca7 [Update] Add Intervl-8b&38b model configs (#1978) Dongsheng Zhu 2025-04-01 11:51:37 +0800