Commit Graph

  • 55aa1717ce [Refactor] Refactorize openicl eval task MaiziXiao 2025-04-01 03:49:04 +0000
  • f71eb78c72 [Doc] Add TBD Token in Datasets Statistics (#1986) Myhs_phz 2025-03-31 19:08:55 +0800
  • 0d69047a8b doc Myhs-phz 2025-03-31 07:33:37 +0000
  • 8516612701 doc Myhs-phz 2025-03-31 07:33:01 +0000
  • 2b8eeb261d doc Myhs-phz 2025-03-31 07:30:39 +0000
  • a194c01b23 doc Myhs-phz 2025-03-31 02:43:21 +0000
  • 030c6794ed updates the dependency PyExt ShaohonChen 2025-03-29 19:58:23 +0800
  • 55d6941ed9 Add olymmath dataset liushz 2025-03-28 11:46:29 +0000
  • 31f259e909 feat Myhs-phz 2025-03-28 11:46:00 +0000
  • 006a70748f Add olymmath dataset liushz 2025-03-28 11:25:52 +0000
  • a75b8fa98e Add olymmath dataset liushz 2025-03-28 11:21:35 +0000
  • 55cc6cfb54 intervl-8b&38b Dongsheng Zhu 2025-03-27 09:39:31 +0000
  • 0f46c35211 [Bug] Aime2024 config fix (#1974) Linchen Xiao 2025-03-25 17:57:11 +0800
  • ad22275fb1 fix MaiziXiao 2025-03-25 09:49:18 +0000
  • b72544bc55 [Bug] Aime2024 config fix MaiziXiao 2025-03-25 09:22:46 +0000
  • 6118596362 [Feature] Add recommendation configs for datasets (#1937) Myhs_phz 2025-03-25 14:54:13 +0800
  • 6d443cc7fe Update dataset-index.yml Myhs_phz 2025-03-25 14:31:08 +0800
  • 566d913a01 Merge branch 'main' into datasetrefine_week1 Myhs_phz 2025-03-25 14:22:54 +0800
  • 3bcbdc4077 fix Myhs-phz 2025-03-25 03:43:59 +0000
  • 5cbd8b7324 Fix torch dtype error liushz 2025-03-25 02:54:26 +0000
  • 871ebaf969 fix Myhs-phz 2025-03-25 02:48:46 +0000
  • 21a92d4c14 Merge branch 'main' of github.com:open-compass/opencompass into tmp_olmpbench liushz 2025-03-25 02:48:09 +0000
  • bf436cbae6 doc Myhs-phz 2025-03-24 14:00:24 +0000
  • c5abd9201a fix Myhs-phz 2025-03-24 11:24:28 +0000
  • 8e1a41c532 fix Myhs-phz 2025-03-24 11:20:46 +0000
  • 21cbb2691d fix Myhs-phz 2025-03-24 11:17:05 +0000
  • 7cfe1cfe04 fix Myhs-phz 2025-03-24 11:06:03 +0000
  • 4261b316a6 fix Myhs-phz 2025-03-24 10:40:58 +0000
  • 07930b854a [Update] Add Korbench config with no max_out_len (#1968) Linchen Xiao 2025-03-24 18:38:06 +0800
  • 6a51f41c31 fix Myhs-phz 2025-03-24 10:34:41 +0000
  • b992e70fb4 Add Korbench no max_out_len MaiziXiao 2025-03-24 10:33:33 +0000
  • bd23094df4 Add Korbench no max_out_len MaiziXiao 2025-03-24 10:30:06 +0000
  • 796e2d0177 fix Myhs-phz 2025-03-24 10:29:32 +0000
  • 199052e92e fix Myhs-phz 2025-03-24 10:15:34 +0000
  • 4f07a1c457 fix Myhs-phz 2025-03-24 08:56:04 +0000
  • 2639d113d8 fix Myhs-phz 2025-03-24 08:27:45 +0000
  • 37307fa996 [Update] Add QWQ32b model config (#1959) Myhs_phz 2025-03-24 14:51:39 +0800
  • db96161a4e [Update] Add SuperGPQA subset metrics (#1966) Linchen Xiao 2025-03-24 14:25:12 +0800
  • aa05993922 [Update] Add dataset configurations of no max_out_len (#1967) Linchen Xiao 2025-03-24 14:24:12 +0800
  • 5134f14e72 update test torch version MaiziXiao 2025-03-24 06:05:40 +0000
  • 8f6ce4db52 update test torch version MaiziXiao 2025-03-24 05:55:02 +0000
  • 3e50ee9533 update test torch version MaiziXiao 2025-03-24 05:49:28 +0000
  • 0dcb1bd0fe update test torch version MaiziXiao 2025-03-24 04:20:34 +0000
  • d8b056cd77 Merge branch 'main' into qwq32b Linchen Xiao 2025-03-24 11:30:28 +0800
  • 3451c8190c [Update] Add dataset configurations of no max_out_len MaiziXiao 2025-03-24 03:29:41 +0000
  • 64128916d0 [Update] Increase memory size for CPU job of VOLC Runner (#1962) Linchen Xiao 2025-03-24 11:21:14 +0800
  • 8a5029b121 [Feature] Add MultiPL-E & Code Evaluator (#1963) Dongsheng Zhu 2025-03-21 20:09:25 +0800
  • 03531e7a2f SuperGPQA subset metrics MaiziXiao 2025-03-21 12:06:18 +0000
  • 3f6435e0e3 index upadate Dongsheng Zhu 2025-03-21 05:13:20 +0000
  • 492bf320af comments upadate Dongsheng Zhu 2025-03-21 03:49:54 +0000
  • f92947c900 multiple_code update Dongsheng Zhu 2025-03-20 06:02:13 +0000
  • 373a0cba9b multiple_code develop Dongsheng Zhu 2025-03-20 05:56:27 +0000
  • 8a425ecd9a Merge branch 'main' of github.com:open-compass/opencompass into tmp_olmpbench liushz 2025-03-20 03:40:32 +0000
  • b1c2f60d32 fix hook Myhs-phz 2025-03-19 09:14:52 +0000
  • 6b4772df3a [Update] Increase memory size for CPU job of VOLC Runner MaiziXiao 2025-03-19 08:42:23 +0000
  • 26c5f64946 [Update] Increase memory size for CPU job of VOLC Runner MaiziXiao 2025-03-19 08:42:12 +0000
  • ffe00a830d feat Myhs-phz 2025-03-19 03:37:23 +0000
  • 716c02785c fix and doc Myhs-phz 2025-03-19 02:03:45 +0000
  • cc9761e882 fix Myhs-phz 2025-03-19 01:16:59 +0000
  • b9de8b0e2b [Update] Unset disallowed_special token for Openai model (#1960) Linchen Xiao 2025-03-18 20:24:07 +0800
  • c98599271b [Update] Update OlympiadBench and Update LLM Judge (#1954) Songyang Zhang 2025-03-18 20:15:20 +0800
  • 5d2d253d83 [BUG] Fix model_kwargs pass logic for vllm (#1958) Jason Cheung 2025-03-18 20:08:15 +0800
  • bfcdde3df4 [Update] Unset disallowed_special token for Openai model MaiziXiao 2025-03-18 11:40:58 +0000
  • f7b28e87f3 fix the bug that model_kwargs passed in is invalid when the accelerator is vllm ZJJ 2025-03-18 17:41:30 +0800
  • b9b69febc3 back Myhs-phz 2025-03-18 09:33:06 +0000
  • 1b8e520a12 Update OlympiadBench and Update LLM Judge zhangsongyang 2025-03-17 12:48:34 +0000
  • 0b7f76e193 [Bug] Fix Summarizer logic (#1953) Linchen Xiao 2025-03-17 18:25:08 +0800
  • 43cf21581a [Bug] Fix Summarizer logic MaiziXiao 2025-03-17 09:57:56 +0000
  • 53c6725d19 fix Myhs-phz 2025-03-17 09:40:57 +0000
  • 15c825a51a [Update] Bbeh harmony summarizer added (#1951) Yufeng Zhao 2025-03-17 17:19:56 +0800
  • cd2678dddf cleaned_rebased yufeng zhao 2025-03-17 09:13:00 +0000
  • a04ef82977 clean yufeng zhao 2025-03-17 08:59:00 +0000
  • 4bdaab0c5f clean yufeng zhao 2025-03-17 08:57:35 +0000
  • 674d67cfe7 harmonic-tested yufeng zhao 2025-03-17 08:47:51 +0000
  • 8f904eb24a harmonic-tested yufeng zhao 2025-03-17 08:46:44 +0000
  • b6fe20b20e update_summerizer yufeng zhao 2025-03-17 05:22:44 +0000
  • 4941df996f harmonic yufeng zhao 2025-03-16 03:25:32 +0000
  • aa788a60eb removeprint yufeng zhao 2025-03-10 06:47:11 +0000
  • 5fc26c92b2 fix_smallbugs_bbeh yufeng zhao 2025-03-10 04:54:28 +0000
  • 28ad5a98d4 bbeh yufeng zhao 2025-03-10 04:25:52 +0000
  • 2e7ecad7d1 bbeh yufeng zhao 2025-03-10 04:24:52 +0000
  • d7f2f3ec6b clean yufeng zhao 2025-03-17 08:59:00 +0000
  • e22a2b3342 clean yufeng zhao 2025-03-17 08:57:35 +0000
  • 04f9fb614d harmonic-tested yufeng zhao 2025-03-17 08:47:51 +0000
  • 171b28b38b harmonic-tested yufeng zhao 2025-03-17 08:46:44 +0000
  • f9599c1f32 update_summerizer yufeng zhao 2025-03-17 05:22:44 +0000
  • 78d94e7bbd harmonic yufeng zhao 2025-03-16 03:25:32 +0000
  • 854c6bf025 [Update] Update requirement and base evaluator Linchen Xiao 2025-03-13 20:52:50 +0800
  • 8d910aad84 update MaiziXiao 2025-03-13 11:50:28 +0000
  • 7c04f1a016 update MaiziXiao 2025-03-13 11:20:59 +0000
  • e6b6d16823 update MaiziXiao 2025-03-13 10:43:28 +0000
  • 1c60e3a0f6 [Update] Add configurations for llmjudge dataset (#1940) Linchen Xiao 2025-03-13 17:30:04 +0800
  • df4ef43534 update MaiziXiao 2025-03-13 09:20:43 +0000
  • a3bac3611a [Fix] Fix bugs when adding QwQ models wangjingchao 2025-03-13 17:16:16 +0800
  • 4ef3c5083b [Feature] Support QwQ-32B and QwQ-Plus Hoter Young 2025-03-13 17:07:15 +0800
  • 51f5792f7c fix Myhs-phz 2025-03-13 08:54:30 +0000
  • 03edec84db Add configurations for llmjudge dataset MaiziXiao 2025-03-13 08:37:32 +0000
  • a80dcd4df7 add_qwen_api_qwq_32b wangjingchao 2025-03-13 14:02:36 +0800
  • 60b230e285 fix datasets in fullbench_int3 Myhs-phz 2025-03-12 13:58:58 +0000
  • 709bc4af0e [Update] Add AIME2025 oss info (#1936) liushz 2025-03-12 18:41:16 +0800