Fengzhe Zhou
aa2dd2b58c
[Format] Add config lints ( #892 )
2024-05-14 15:35:58 +08:00
Xu Song
3dbba11945
[Feat] Support dataset_suffix check for mixed configs ( #973 )
...
* [Feat] Support dataset_suffix check for mixed configs
* update mixed suffix
* update suffix
---------
Co-authored-by: Leymore <zfz-960727@163.com>
2024-05-14 15:03:28 +08:00
Fengzhe Zhou
7505b3cadf
[Feature] Add huggingface apply_chat_template ( #1098 )
...
* add TheoremQA with 5-shot
* add huggingface_above_v4_33 classes
* use num_worker partitioner in cli
* update theoremqa
* update TheoremQA
* add TheoremQA
* rename theoremqa -> TheoremQA
* update TheoremQA output path
* rewrite many model configs
* update huggingface
* further update
* refine configs
* update configs
* update configs
* add configs/eval_llama3_instruct.py
* add summarizer multi faceted
* update bbh datasets
* update configs/models/hf_llama/lmdeploy_llama3_8b_instruct.py
* rename class
* update readme
* update hf above v4.33
2024-05-14 14:50:16 +08:00
Mo Li
6c711cb262
[Fix] Fix Needlebench Summarizer ( #1143 )
...
* update few-shot example
* add 128k
2024-05-13 15:59:34 +08:00
bittersweet1999
5432dfc1ff
fix multiround ( #1146 )
2024-05-13 15:58:39 +08:00
bittersweet1999
833a35140b
[Fix] fix alpacaeval while add caching path ( #1139 )
...
* fix alpacaeval
* fix alpacaeval
2024-05-11 14:02:26 +08:00
Fengzhe Zhou
19d7e630d6
[Sync] Update accelerator ( #1122 )
...
(cherry picked from commit 4beb6d9ab655d8a626971841b7acfd9fae9d438f)
Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>
2024-05-09 14:32:31 +08:00
Alexander Lam
a71122ee18
[Feature] Add Qwen1.5 MoE 7b and Mixtral 8x22b model configs ( #1123 )
...
* added qwen moe and mixtral 8x22 model configs
* updated README files news section
2024-05-09 11:04:26 +08:00
Mo Li
cb080fa7de
[Fix] Fix NeedleBench Summarizer Typo ( #1125 )
...
* update needleinahaystack eval docs
* update needlebench summarizer
* fix english docs typo
2024-05-08 20:00:15 +08:00
bittersweet1999
826d8307ac
fix links ( #1120 )
2024-05-08 15:13:18 +08:00
JuhaoLiang
d2c40e5648
[Feature] Add AceGPT-MMLUArabic benchmark ( #1099 )
...
* add AceGPT-MMLUArabic benchmark
* update readme and fix lint issue
* remove unused package
* add MMLUArabic zero-shot settings
* rename filename and update readme
2024-05-08 15:00:26 +08:00
Fangyu Lei
862044fb7d
[Feature] Add S3Eval Dataset ( #916 )
...
* s3eval_branch
* update s3eval
2024-05-06 19:41:52 +08:00
Xu Song
d501710155
[Fix] Fix AGIEval chinese sets ( #972 )
...
* [Fix] Fix AGIEval chinese sets
* Create agieval_gen_617738.py
* [Fix] Fix AGIEval chinese sets
* Restore agieval_gen_64afd3.py
* Update agieval_gen.py
* Create agieval_mixed_0fa998.py
* Update agieval_mixed.py
2024-05-06 15:31:42 +08:00
Yggdrasill7D6
af10ecc272
add mgsm datasets ( #1081 )
...
* add mgsm datasets
* fix lint
* fix lint
* update mgsm
* update mgsm
* ease code spell
* update
* update
* update
---------
Co-authored-by: Leymore <zfz-960727@163.com>
2024-05-06 15:29:34 +08:00
klein
153c4fc988
[Feature] update drop dataset from openai simple eval ( #1092 )
...
* [Feature] update drop dataset from openai simple eval
* update drop template presentation
* update
---------
Co-authored-by: Leymore <zfz-960727@163.com>
2024-05-06 13:37:08 +08:00
Fengzhe Zhou
d43392a3bb
[Feature] Add mmlu prompt from simple_evals, openai ( #1074 )
...
* add mmlu prompt from simple_evals, openai
* return empty str on failure
2024-05-06 13:26:26 +08:00
Yang Yong
53fe390454
fix LightllmApi workers bug ( #1113 )
2024-04-30 22:09:22 +08:00
Fengzhe Zhou
baed2ed9b8
update pre-commit ( #891 )
2024-04-30 10:59:41 +08:00
Alexander Lam
35c94d0cde
[Feature] Adding support for LLM Compression Evaluation ( #1108 )
...
* fixed formatting based on pre-commit tests
* fixed typo in comments; reduced the number of models in the eval config
* fixed a bug in LLMCompressionDataset, where setting samples=None would result in passing test[:None] to load_dataset
* removed unnecessary variable in _format_table_pivot; changed lark_reporter message to English
2024-04-30 10:51:01 +08:00
Ikko Eltociear Ashimine
9c79224b39
[Docs] Update README.md ( #1110 )
...
requiresments -> requirements
2024-04-30 00:45:33 +08:00
bittersweet1999
3de48e9b35
[Bug] Fix CMB dataset ( #1106 )
2024-04-30 00:33:43 +08:00
Songyang Zhang
063f5f5f49
[Update] Update performance of common benchmarks ( #1109 )
...
* [Update] Update performance of common benchmarks
* [Update] Update performance of common benchmarks
* [Update] Update performance of common benchmarks
2024-04-30 00:09:08 +08:00
liushz
a6f67e1a65
[Fix] Fix Math Evaluation with Judge Model Evaluator & Add README ( #1103 )
...
* Add Math Evaluation with Judge Model Evaluator
* Add Math Evaluation with Judge Model Evaluator
* Add Math Evaluation with Judge Model Evaluator
* Add Math Evaluation with Judge Model Evaluator
* Fix Llama-3 meta template
* Fix MATH with JudgeLM Evaluation
* Fix MATH with JudgeLM Evaluation
* Fix MATH with JudgeLM Evaluation
* Fix MATH with JudgeLM Evaluation
---------
Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>
2024-04-28 21:58:58 +08:00
bittersweet1999
0b7de67c4a
fix prompt template ( #1104 )
2024-04-28 21:54:30 +08:00
Lyu Han
1013dce60c
adapt to lmdeploy v0.4.0 ( #1073 )
...
* adapt to lmdeploy v0.4.0
* compatible
2024-04-28 19:57:40 +08:00
Yggdrasill7D6
58a57a4c45
[Feature] add support for Flames datasets ( #1093 )
...
* add flames datasets
* fix lint
* rm quota
* add judgemodel info and fix os path
* support flames dataset
* support flames dataset
---------
Co-authored-by: bittersweet1999 <1487910649@qq.com>
2024-04-28 18:56:24 +08:00
Mo Li
76dd814c4d
[Doc] Update NeedleInAHaystack Docs ( #1102 )
...
* update NeedleInAHaystack Test Docs
* update docs
2024-04-28 18:51:47 +08:00
dmitrysarov
cce5b6fbb6
fix output typing, change mutable list to immutable tuple ( #989 )
...
* fix output typing, change mutable list to immutable tuple
* import missing type
* format
---------
Co-authored-by: Leymore <zfz-960727@163.com>
2024-04-26 23:07:34 +08:00
binary-husky
701ecbb292
[Fix] python path bug ( #1063 )
...
* fix relative path bug
* format
---------
Co-authored-by: hmp <505030475@qq.com>
Co-authored-by: Leymore <zfz-960727@163.com>
2024-04-26 21:58:45 +08:00
Wang Xingjin
048d41a1c4
add vllm get_ppl ( #1003 )
...
* add vllm get_ppl
* add vllm get_ppl
* format
---------
Co-authored-by: xingjin.wang <xingjin.wang@mihoyo.com>
Co-authored-by: Leymore <zfz-960727@163.com>
2024-04-26 21:31:56 +08:00
Haodong Duan
3a232db471
[Deprecate] Remove multi-modal related stuff ( #1072 )
...
* Remove MultiModal
* update index.rst
* update README
* remove mmbench codes
* update news
---------
Co-authored-by: Leymore <zfz-960727@163.com>
2024-04-26 21:20:14 +08:00
Francis-llgg
f1ee11de14
[Feature] Add gpqa prompt from simple_evals, openai ( #1080 )
...
* add gpqa_openai_simple_eval
* trigger CI build
* reorg
---------
Co-authored-by: Leymore <zfz-960727@163.com>
2024-04-26 20:13:00 +08:00
klein
e4830a6926
Update CIBench ( #1089 )
...
* modify the requirements/runtime.txt: numpy==1.23.4 --> numpy>=1.23.4
* update cibench: dataset and evaluation
* cibench summarizer bug
* update cibench
* move extract_code import
---------
Co-authored-by: zhangchuyu@pjlab.org.cn <zhangchuyu@pjlab.org.cn>
Co-authored-by: Leymore <zfz-960727@163.com>
2024-04-26 18:46:02 +08:00
bittersweet1999
e404b72c52
[Feature] support arenahard evaluation ( #1096 )
...
* support arenahard
* support arenahard
* support arenahard
2024-04-26 15:42:00 +08:00
bittersweet1999
6ba1c4937d
[Feature] Support Math evaluation via judgemodel ( #1094 )
...
* support openai math evaluation
* support openai math evaluation
* support openai math evaluation
* support math llm judge
* support math llm judge
2024-04-26 14:56:23 +08:00
Jingming Zhuo
41196c48ae
Add humaneval prompt from simple_evals, openai ( #1076 )
...
* [Feature] Add IFEval
* add humaneval prompt from simple_evals, openai
2024-04-24 17:40:50 +08:00
liushz
17735f0c13
Fix Llama-3 meta template ( #1079 )
...
Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>
2024-04-24 16:46:25 +08:00
Ke Bao
81d0e4d793
[Feature] Add lmdeploy tis python backend model ( #1014 )
...
* add lmdeploy tis python backend model
* fix pr check
* update
2024-04-23 14:27:11 +08:00
Fengzhe Zhou
8fe7b271cc
[Fix] Fix sequential runner ( #1070 )
2024-04-23 11:31:10 +08:00
Fengzhe Zhou
004ed79593
[Feature] Add TheoremQA with 5-shot ( #1048 )
...
* add TheoremQA with 5-shot
* cherry pick from add-huggingface-above-v4.33, good TheoremQA results
2024-04-22 15:22:04 +08:00
Fengzhe Zhou
a256753221
[Feature] Add LLaMA-3 Series Configs ( #1065 )
...
* add LLaMA-3 Series configs
* update readme
2024-04-22 14:39:31 +08:00
bittersweet1999
6f98c8d9ab
[Fix] Fix MultiRound Subjective Evaluation ( #1043 )
...
* fix multiround
* fix
2024-04-22 12:06:03 +08:00
Fengzhe Zhou
8c85edd1cd
[Sync] deprecate old mbpps ( #1064 )
2024-04-19 20:49:46 +08:00
Robin Chen
c172401323
[Fix] Fixed repeated loading of VLLM ( #1051 )
...
* [fix] Fixed the issue caused by repeated loading of the VLLM model during task segmentation.
* [fix] avoid TypeError: VLLM.__init__() got an unexpected keyword argument 'tokenizer_only'
* restore .pre-commit-config.yaml
* restore opencompass/tasks/openicl_infer.py
---------
Co-authored-by: IcyFeather <mengzhuo.happy@gmail.com>
Co-authored-by: Leymore <zfz-960727@163.com>
2024-04-17 20:36:08 +08:00
Songyang Zhang
629836146a
[Doc] Update README ( #1053 )
...
* [Update] Update readme
* [Update] Update readme
* [Update] Update readme
2024-04-16 19:54:12 +08:00
Fengzhe Zhou
881bdbf6bd
[Sync] Bump version to 0.2.4 ( #1052 )
...
(cherry picked from commit 16ac6306c72fa202173289b55eaefe85e0fcb73c)
Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>
2024-04-16 18:09:46 +08:00
Fengzhe Zhou
7a41951dda
[Fix] logger.error -> logger.debug in OpenAI wrapper ( #1050 )
...
* logger.error -> logger.info in OpenAI
* logger.info -> logger.debug in OpenAI
2024-04-15 21:08:13 +08:00
liuwei130
a00e57296f
[Feature] Add ChemBench ( #1032 )
...
* add ChemBench
* update results
* molbench -> ChemBench
---------
Co-authored-by: Leymore <zfz-960727@163.com>
2024-04-12 08:46:26 +08:00
Fengzhe Zhou
bd7c11bb89
[Fix] Update setup.py install_requires ( #1036 )
2024-04-11 11:11:34 +08:00
Fengzhe Zhou
b39f501563
[Sync] update taco ( #1030 )
2024-04-09 17:50:23 +08:00