jnanliu
|
32a8d81b1d
|
del extract_model param in livemathbench config
|
2025-02-26 06:39:12 +00:00 |
|
Junnan Liu
|
bb4d53e0cb
|
Merge branch 'main' into general-gpass
|
2025-02-26 11:56:45 +08:00 |
|
Junnan Liu
|
22a33d8759
|
[Update] Update LiveMathBench Hard Configs (#1826)
* support G-Pass@k and livemathbench
* fix bugs
* fix comments of GPassKEvaluator
* update saved details of GPassKEvaluator
* fix eval api configs & update openai_api for ease of debugging
* update huggingface path
* fix method name of G-Pass@k
* fix default value of eval_model_name
* refactor G-Pass@k evaluator
* log generation params for each backend
* fix evaluation resume
* add NotImplementedError
* update livemathbench-hard configs
* remove max_out_len from livemathbench_hard_greedy_gen_9befbf.py
* remove max_out_len from livemathbench_hard_gen_9befbf.py
* rename livemathbench_hard_gen_9befbf.py to livemathbench_hard_gen_353ae7.py
* rename livemathbench_hard_greedy_gen_9befbf.py to livemathbench_hard_greedy_gen_353ae7.py
* update livemathbench_gen_9befbf.py
* remove whitespace
* upload livemathbench hard configs
|
2025-02-25 17:24:36 +08:00 |
|
jnanliu
|
b0330ef1c6
|
change repeat to n
|
2025-02-24 08:11:27 +00:00 |
|
jnanliu
|
2349fcff2c
|
delete gpassk_evaluator and fix potential errors
|
2025-02-24 06:25:17 +00:00 |
|
jnanliu
|
8def69369a
|
support dataset repeat and g-pass compute for each evaluator
|
2025-02-23 03:05:42 +00:00 |
|
Junnan Liu
|
8e8d4f1c64
|
[Feature] Support G-Pass@k and LiveMathBench (#1772)
* support G-Pass@k and livemathbench
* fix bugs
* fix comments of GPassKEvaluator
* update saved details of GPassKEvaluator
* fix eval api configs & update openai_api for ease of debugging
* update huggingface path
* fix method name of G-Pass@k
* fix default value of eval_model_name
* refactor G-Pass@k evaluator
* log generation params for each backend
* fix evaluation resume
* add NotImplementedError
|
2024-12-30 16:59:39 +08:00 |
|