Songyang Zhang
|
aa2b89b6f8
|
[Update] Add CascadeEvaluator with Data Replica (#2022)
* Update CascadeEvaluator
* Update Config
* Update
|
2025-05-20 16:46:55 +08:00 |
|
Linchen Xiao
|
a05f9da134
|
[Feature] Make dump-eval-details default behavior (#1999)
* Update
|
2025-04-08 14:42:26 +08:00 |
|
Myhs_phz
|
570c30cf1b
|
[Fix] Fix CLI option for results persistence (#1920)
* fix
|
2025-03-07 18:24:30 +08:00 |
|
Myhs_phz
|
1585c0adbe
|
[Feature] Evaluation Results Persistence (#1894)
* feat results_station.py
* lint
* feat save_to_station
* feat result_station.py and lint
* feat
* fix
* fix and lint
* fix
* fix subjective processing
* fix
* fix
* style function name
* lint
|
2025-03-05 18:33:34 +08:00 |
|
Songyang Zhang
|
fd6fbf01a2
|
[Update] Support AIME-24 Evaluation for DeepSeek-R1 series (#1888)
* Update
|
2025-02-25 20:34:41 +08:00 |
|
Songyang Zhang
|
46cc7894e1
|
[Feature] Support import configs/models/summarizers from whl (#1376)
* [Feature] Support import configs/models/summarizers from whl
* Update LCBench configs
* Update
|
2024-08-01 00:42:48 +08:00 |
|
bittersweet1999
|
8e7ad2e981
|
[Fix] add bc for alignbench summarizer (#1306)
* fix pip version
* fix alignbench
* fix import error
|
2024-07-12 11:06:20 +08:00 |
|
Fengzhe Zhou
|
a62c613d3e
|
[Sync] bump version 0.2.6+local (#1294)
|
2024-07-06 00:44:06 +08:00 |
|
bittersweet1999
|
68ca48496b
|
[Refactor] Reorganize subjective eval (#1284)
* fix pip version
* reorganize subjective eval
* reorg sub
* reorg subeval
* update subjective doc
* reorg subeval
* reorg subeval
|
2024-07-05 22:11:37 +08:00 |
|
Fengzhe Zhou
|
a32f21a356
|
[Sync] Sync with internal codes 2024.06.28 (#1279)
|
2024-06-28 14:16:34 +08:00 |
|
Fengzhe Zhou
|
d656e818f8
|
[Docs] Remove --no-batch-padding and Use --hf-num-gpus (#1205)
* [Docs] Remove --no-batch-padding and Use --hf-num-gpus
* update
|
2024-05-29 16:30:10 +08:00 |
|
Fengzhe Zhou
|
8ea2c404d7
|
[Feat] enable HuggingFacewithChatTemplate with --accelerator via cli (#1163)
* enable HuggingFacewithChatTemplate with --accelerator via cli
* rm vllm_internlm2_chat_7b
|
2024-05-15 21:51:07 +08:00 |
|
Fengzhe Zhou
|
62dbf04708
|
[Sync] update github workflow (#1156)
|
2024-05-14 22:42:23 +08:00 |
|
Fengzhe Zhou
|
7505b3cadf
|
[Feature] Add huggingface apply_chat_template (#1098)
* add TheoremQA with 5-shot
* add huggingface_above_v4_33 classes
* use num_worker partitioner in cli
* update theoremqa
* update TheoremQA
* add TheoremQA
* rename theoremqa -> TheoremQA
* update TheoremQA output path
* rewrite many model configs
* update huggingface
* further update
* refine configs
* update configs
* add configs/eval_llama3_instruct.py
* add summarizer multi faceted
* update bbh datasets
* update configs/models/hf_llama/lmdeploy_llama3_8b_instruct.py
* rename class
* update readme
* update hf above v4.33
|
2024-05-14 14:50:16 +08:00 |
|
Fengzhe Zhou
|
19d7e630d6
|
[Sync] Update accelerator (#1122)
(cherry picked from commit 4beb6d9ab655d8a626971841b7acfd9fae9d438f)
Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>
|
2024-05-09 14:32:31 +08:00 |
|
Haodong Duan
|
3a232db471
|
[Deprecate] Remove multi-modal related stuff (#1072)
* Remove MultiModal
* update index.rst
* update README
* remove mmbench codes
* update news
---------
Co-authored-by: Leymore <zfz-960727@163.com>
|
2024-04-26 21:20:14 +08:00 |
|
Fengzhe Zhou
|
8c85edd1cd
|
[Sync] deprecate old mbpps (#1064)
|
2024-04-19 20:49:46 +08:00 |
|
Fengzhe Zhou
|
b39f501563
|
[Sync] update taco (#1030)
|
2024-04-09 17:50:23 +08:00 |
|