93 Commits

Author SHA1 Message Date
Fengzhe Zhou
f78fcf6eeb
[Docs] Update contamination docs (#775) 2024-01-08 16:37:28 +08:00
tpoisonooo
ba1b684fec
typo(installation.md): fix unzip commands (#774)
* Update installation.md

* Update installation.md
2024-01-08 14:23:35 +08:00
Fengzhe Zhou
3a68083ecc
[Sync] update configs (#734) 2023-12-25 21:59:16 +08:00
Mo Li
0e24f4213e
[Feature] Add NeedleInAHaystack Test Support (#714)
* Add NeedleInAHaystack Test

* Apply pre-commit formatting

* Update configs/eval_hf_internlm_chat_20b_cdme.py

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>

* add needle in haystack test

* update needle in haystack test

---------

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
2023-12-23 12:00:51 +08:00
RunningLeon
e34c552282
[Feature] Update configs for evaluating chat models like qwen, baichuan, llama2 using turbomind backend (#721)
* add llama2 test

* fix

* test qwen chat-7b

* test w4

* add baichuan2

* update

* update

* update configs and docs

* update
2023-12-21 18:22:17 +08:00
Hubert
fdf18a3238
[Docs] Update Docker docs (#718)
* [Docs] update docker docs

* [Docs] update docker docs
2023-12-19 23:29:43 +08:00
bittersweet1999
97c2068bd9
[Feature] Add JudgeLLMs (#710)
* add judgellms

* add judgellms

* add sub_size_partition

* add docs

* add ref
2023-12-19 18:40:25 +08:00
Songyang Zhang
637628a70f
[Doc] Update Doc for Alignbench (#707)
* update alignmentbench

* update alignmentbench

* update doc

* update

* update
2023-12-15 15:07:25 +08:00
Fengzhe Zhou
cadab9474f
[Doc] Update contamination docs (#698)
* update contamination docs

* add citation

* Update contamination_eval.md

* Update contamination_eval.md

---------

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
2023-12-13 18:03:39 +08:00
bittersweet1999
6130394165
[Feature] Add double order of subjective evaluation and removing duplicated response among two models (#692)
* add features

* add doc string

* add doc string
2023-12-12 20:58:17 +08:00
bittersweet1999
465308e430
[Feature] Add Subjective Evaluation (#680)
* new version of subject

* fixed draw

* fixed draw

* fixed draw

* done

* done

* done

* done

* fixed lint
2023-12-11 22:22:11 +08:00
liyucheng09
05bbce8b08
[Feature] Add Data Contamination Analysis (#639)
* add contamination analysis to ceval

* fix bugs

* add contamination docs

* to pass CI check

* update

---------

Co-authored-by: zhangyifan1 <zhangyifan1@pjlab.org.cn>
Co-authored-by: Leymore <zfz-960727@163.com>
2023-12-08 10:00:11 +08:00
Fengzhe Zhou
79f6449d85
[Doc] Update FAQ (#628)
* update faq

* Update docs/zh_cn/get_started/faq.md

* Update docs/en/get_started/faq.md

* Update docs/zh_cn/get_started/faq.md

---------

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
2023-11-23 18:19:17 +08:00
Fengzhe Zhou
d949e3c003
[Feature] Add circular eval (#610)
* refactor default, add circular summarizer

* add circular

* update impl

* update doc

* minor update

* no more to be added
2023-11-23 16:45:47 +08:00
Songyang Zhang
5329724b65
[Doc] Update README and requirements. (#622)
* update readme

* update doc
2023-11-22 19:16:54 +08:00
Hubert
8c1483e3ce
[Docs] update ds1000 code eval docs (#618) 2023-11-22 13:37:53 +08:00
Lyu Han
eb56fd6d16
Integrate turbomind python api (#484)
* integrate turbomind python api

* update

* update user guide

* update

* fix according to reviewer's comments

* fix error

* fix linting

* update user guide

* remove debug log

---------

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
2023-11-21 22:34:46 +08:00
Yang Yong
d3b0d5c4ce
[Feature] Support Lightllm API (#613)
* [Feature] Support Lightllm api

* formatting & renaming

---------

Co-authored-by: Leymore <zfz-960727@163.com>
2023-11-21 19:18:40 +08:00
Hubert
91fba2c2e9
[Feat] support humaneval and mbpp pass@k (#598)
* [Feat] support pass@ k

* [Feat] support pass@k

* [Feat] support pass@k

* [Feat] support pass@k

* [Feat] support pass@k

* [Feat] support pass@k docs

* update naming

---------

Co-authored-by: Leymore <zfz-960727@163.com>
2023-11-16 21:22:06 +08:00
Songyang Zhang
01a0f2f3c7
[Doc] Update README (#582) 2023-11-13 20:39:43 +08:00
Fengzhe Zhou
689ffe5b63
[Feature] Use dataset in local path (#570)
* update commonsenseqa

* update drop

* update flores_first100

* update gsm8k

* update humaneval

* update lambda

* update obqa

* update piqa

* update race

* update siqa

* update story_cloze

* update strategyqa

* update tydiqa

* update winogrande

* update doc

* update hellaswag

* fix obqa

* update collections

* update .zip name
2023-11-13 13:00:37 +08:00
Hubert
95e0da0173
[Docs] add humanevalx dataset link in config (#559)
* [Docs] add humanevalx dataset link in config

* [Docs] add humanevalx dataset link in config

* minor fix
2023-11-10 18:18:58 +08:00
Hubert
36360bdfc3
[Fix] fix filename typo (#549) 2023-11-07 14:00:26 +08:00
Songyang Zhang
239c2a346e
[Feature] Add support for MiniMax API (#548)
* update requirement

* update requirement

* update with minimax

* update api model

* Update readme

* fix error

---------

Co-authored-by: zhangsongyang <zhangsongyang@pjlab.org.cn>
2023-11-06 21:57:32 +08:00
Songyang Zhang
987a711232
[Doc] Update README and FAQ (#535)
* update readme

* update readme and faq
2023-11-02 15:16:37 +08:00
Fengzhe Zhou
dbb20b8270
[Sync] update (#517) 2023-10-27 20:31:22 +08:00
Wei Jueqi
b62842335d
[Doc] Update Subjective docs (#510)
* rename

* add en subdoc

* fix name

* fix writing

* update

---------

Co-authored-by: Leymore <zfz-960727@163.com>
2023-10-27 16:27:24 +08:00
Hubert
44c8d6cc60
[Docs] update invalid link in docs (#499) 2023-10-25 13:15:42 +08:00
Leymore
d7ff933a73
[Fix] Use jieba rouge in lcsts (#459)
* use jieba rouge in lcsts

* use rouge_chinese
2023-10-09 10:10:33 +08:00
Tong Gao
119bfd1569
[Refactor] Move fix_id_list to Retriever (#442)
* [Refactor] Move fix_id_list to Retriever

* update

* move to base

* fix
2023-10-07 12:53:41 +08:00
Tong Gao
767c12a660
[Docs] update get_started (#435)
* [Docs] update get_started

* [Docs] Refactor get_started

* update

* add zh FAQ

* add cn doc

* update

* fix dead links

---------

Co-authored-by: Leymore <zfz-960727@163.com>
2023-10-07 11:49:40 +08:00
Lyu Han
6738247142
Integrate turbomind inference via its RPC API instead of its python API (#414)
* support tis

* integrate turbomind inference via its RPC API instead of its python API

* update guide

* update ip address spec

* update according to reviewer's comments
2023-10-07 10:27:48 +08:00
Yixiao Fang
524579b5af
[Docs] Add multimodal docs (#334)
* add multimodal docs

* fix lint

---------

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
2023-09-22 18:58:29 +08:00
Tong Gao
3e980a9737
[Docs] Add intro figure to README (#413)
* [Docs] Add intro figure to README

* update
2023-09-19 20:19:35 +08:00
philipwangOvO
f57c0702f7
[Docs] Readme in longeval (#389)
* [Docs] Readme in longeval

* [Docs] Readme in longeval

* [Docs] Readme in longeval

* [Docs] Readme in longeval

* [Docs] Readme in longeval

* [Docs] Readme in longeval

* [Docs] Readme in longeval
2023-09-18 17:06:00 +08:00
Hubert
a11cb45c83
[Feat] implementation for support promptbench (#239)
* [Feat] support adv_glue dataset for adversarial robustness

* reorg files

* minor fix

* minor fix

* support prompt bench demo

* minor fix

* minor fix

* minor fix

* minor fix

* minor fix

* minor fix

* minor fix

* minor fix
2023-09-15 15:06:53 +08:00
Tong Gao
4d89533fbc
[Docs] Add FAQ (#384)
* [Docs] Add FAQ

* [Docs] Add FAQ
2023-09-12 12:11:38 +08:00
Tong Gao
b9b145c335
[Docs] Fix incorrect name in get_started (#380) 2023-09-11 16:10:09 +08:00
liushz
63ced828d8
Update get_started.md (#377) 2023-09-11 10:58:17 +08:00
Leymore
2c915218e8
[Feaure] Add new models: baichuan2, tigerbot, vicuna v1.5 (#373)
* add bag of new models: baichuan2, tigerbot, vicuna v1.5

* update

* re-organize models

* update readme

* update
2023-09-08 15:41:20 +08:00
Songyang Zhang
3871188c89
[Feat] Update URL (#368) 2023-09-07 17:29:50 +08:00
Songyang Zhang
a05daab911
[Doc] Update Overview (#242)
* Update news

* update overview

* add framework

* update index

* update

---------

Co-authored-by: Leymore <zfz-960727@163.com>
2023-09-07 14:21:39 +08:00
Hubert
2c71b0f6f3
[Docs] update code evaluator docs (#354)
* [Docs] update code evaluator docs

* minor fix

* minor fix
2023-09-06 17:52:22 +08:00
Haodong Duan
b95aea75ce
[Doc] Update MMBench.md (#336)
Fix minor problem in the doc
2023-09-01 13:44:45 +08:00
Tong Gao
166022f568
[Docs] Update docs for new entry script (#246)
* update docs

* update docs

* update

* update en docs

* update

* update

---------

Co-authored-by: Leymore <zfz-960727@163.com>
2023-08-31 16:43:55 +08:00
Leymore
c26ecdb1b0
[Feature] Add and apply update suffix tool (#280)
* add and apply update suffix tool

* add dataset suffix updater as precommit hook

* update workflow

* update scripts

* update ci

* update

* ci with py3.8

* run in serial

* update bbh

* use py 3.10

* update pre commit zh cn
2023-08-28 17:35:04 +08:00
Songyang Zhang
b2d602f42b
Update README.md (#262)
* Update README.md

* update news and readme

* update

* update

---------

Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2023-08-25 18:53:35 +08:00
liushz
02ce139bc6
[Feature] Add Tree-of-Thought method (#173)
* Add ToT method

* Update ToT

* Update ToT

* Update ToT

* Update ToT

* Update ToT

* Update ToT

* Update ToT

* Update chain_of_thought.md

* Update icl_tot_inferencer.py

---------

Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>
2023-08-23 12:23:05 +08:00
Leymore
c0e58632ca
[Doc] Add summarizer doc (#231)
* add summarizer doc

* update

* update doc

* Apply suggestions from code review

---------

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
2023-08-23 11:18:01 +08:00
Ezra-Yu
17ccaa5980
[Feat] Add codegeex2 and Humanevalx (#210)
* add codegeex2

* add humanevalx dataset

* add evaluator

* update evaluator

* update configs

* update clean code

* update configs

* fix lint

* remove sleep

* fix lint

* update docs

* fix lint
2023-08-17 11:03:16 +08:00