129 Commits

Author SHA1 Message Date
Tong Gao
c6a3494993
[Fix] requirements (#229) 2023-08-18 14:34:20 +08:00
Yuan Liu
90c07a3dfd
[Fix]: Fix name (#223) 2023-08-17 18:30:48 +08:00
Yuan Liu
3d49a20b95
[Feature]: Add launch script (#222) 2023-08-17 18:26:01 +08:00
Yixiao Fang
0fa2482661
[Feature] Support SEED-Bench (#203)
* support seedbench

* update docstrings

* update

* update

* update

* update according to review

* rebase

* fix lint

* update
2023-08-17 17:24:02 +08:00
Yuan Liu
ae3c1869da
[Feature]: Add other public datasets config (#214)
* [Feature]: Add flickr30k

* [Feature]: Add GQA

* [Feature]: Add OCR VQA

* [Feature]: Add OK VQA

* [Feature]: Add text vqa

* [Feature]: Add other vqa
2023-08-17 11:11:26 +08:00
Ezra-Yu
17ccaa5980
[Feat] Add codegeex2 and Humanevalx (#210)
* add codegeex2

* add humanevalx dataset

* add evaluator

* update evaluator

* update configs

* update clean code

* update configs

* fix lint

* remove sleep

* fix lint

* update docs

* fix lint
2023-08-17 11:03:16 +08:00
Hubert
0fe2366a72
[Feat] support adv_glue dataset for adversarial robustness (#205)
* [Feat] support adv_glue dataset for adversarial robustness

* reorg files

* minor fix

* minor fix
2023-08-16 18:42:06 +08:00
Ezra-Yu
d7cb39581a
update conf (#212) 2023-08-16 15:22:14 +08:00
Yuan Liu
78df9bd0cb
[Feature]: Add other public datasets (#206)
* [Feature]: Refactor class name

* [Feature]: Add minigpt-4 coco caption

* [Feature]: Update minigpt-4 coco caption

* [Feature]: Add MiniGPT-4 ScienceQA

* [Feature]: Add minigpt-4 vqav2

* [Feature]: Add VSR

* [Feature]: Revert task to previous version
2023-08-16 11:37:26 +08:00
Yike Yuan
3a46b6c64f
[Fix] Fix bugs of multiple rounds of inference when using mm_eval (#201) 2023-08-16 11:15:11 +08:00
Leymore
4fc1701209
[Doc] update readme (#196)
* update readme

* Apply suggestions from code review

---------

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
2023-08-11 18:43:41 +08:00
Hubert
7c393192af
[Fix] fix bug for postprocessor (#195)
* [Fix] fix bug for postprocessor

* minor fix
2023-08-11 18:41:12 +08:00
Tong Gao
10cbc2b175
Bump version to 0.1.2 (#190) 2023-08-11 17:43:14 +08:00
Tong Gao
bf79ff1c6d
[Feature] Add LEval datasets
Co-authored-by: kennymckormick <dhd@pku.edu.cn>
2023-08-11 17:38:31 +08:00
Hubert
8d9cee060f
[Feat] update postprocessor to get first option more accurately (#193)
* [Feat] update postprocessor to get first option

* minor fix

* minor fix
2023-08-11 17:33:00 +08:00
Leymore
14332e08fd
[Feature] add llama-oriented dataset configs (#82)
* add llama-oriented dataset configs

* update

* revert cvalues & update llama_example
2023-08-11 12:48:05 +08:00
Tong Gao
e464265cf8
[Docs] Update contribution guide & toc, improve user experience (#188)
* [Docs] Update contribution guide & toc

* update

* Update docs/en/notes/contribution_guide.md

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>

* update

* update

---------

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
2023-08-11 11:36:09 +08:00
Hubert
5a9539f375
[Feat] add safety to collections (#185)
* [Feat] add safety to collections

* minor fix
2023-08-11 11:19:26 +08:00
Zaida Zhou
f4c70ba6c3
[Feature] Support filtering specified levels message (#187)
* Support filtering message

* minor fix
2023-08-11 10:46:46 +08:00
Songyang Zhang
99ae786598
[Feature] update news (#186)
* update news

* update

---------

Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2023-08-10 18:52:09 +08:00
Zaida Zhou
f256abffd3
[Enhancement] Skip invalid keys to avoid requesting API (#184)
* Skip invalid keys to avoid requesting API

* get expected key

* print warning info
2023-08-10 18:41:43 +08:00
Tong Gao
0406e4e7ed
[Docs] Enhance issue template (#183) 2023-08-10 17:02:58 +08:00
Ma Zerun
59bf56349c
[Feature] Support CUDA_VISIBLE_DEVICES and multiple tasks on one GPU (#148)
* [Feature] Support CUDA_VISIBLE_DEVICES and multiple tasks on one GPU

* Fix UT

* Update according to comments
2023-08-10 16:53:03 +08:00
Tong Gao
312095de9d
[Fix] meta template & unit tests (#170) 2023-08-10 16:49:13 +08:00
liushz
ed248af136
[Fix] Fix some sc errors (#177)
* Update sc

* Update sc doc

* Apply suggestions from code review

Co-authored-by: Hubert <42952108+yingfhu@users.noreply.github.com>

---------

Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>
Co-authored-by: Hubert <42952108+yingfhu@users.noreply.github.com>
2023-08-10 16:40:32 +08:00
Tong Gao
2931f3dcb8
[Enhancement] Add humaneval postprocessor for GPT models & eval config for GPT4, enhance the original humaneval postprocessor (#129)
* [Enhancement] Enhance humaneval postprocessor

* add human-eval testcase

* update

* update

---------

Co-authored-by: Leymore <zfz-960727@163.com>
2023-08-10 16:31:12 +08:00
Songyang Zhang
3f36db3b06
[Feature] Support turbomind (#166)
* support turbomind

* update doc

* Update docs/en/advanced_guides/evaluation_turbomind.md

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* Update docs/zh_cn/advanced_guides/evaluation_turbomind.md

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* Update docs/zh_cn/advanced_guides/evaluation_turbomind.md

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* Update docs/en/advanced_guides/evaluation_turbomind.md

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* update

---------

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
2023-08-10 16:25:11 +08:00
Leymore
e7fc54baf1
[Feature] Add Xiezhi SQuAD2.0 ANLI (#101)
* add Xiezhi SQuAD2.0 ANLI; update WSC

* update

* update

* update doc string
2023-08-10 14:04:18 +08:00
Yuan Liu
a205629ff3
[Feature]: Refactor input and output (#176)
* [Feature]: Refactor input and output

* [Feature]: Update tasks
2023-08-10 14:01:28 +08:00
Leymore
876ade71a5
[Fix] Fix AGIEval multiple choice (#137)
* update agieval data

* rename variables
2023-08-10 11:38:24 +08:00
dependabot[bot]
0555d59a6a
Bump requests from 2.28.1 to 2.31.0 (#178)
Bumps [requests](https://github.com/psf/requests) from 2.28.1 to 2.31.0.
- [Release notes](https://github.com/psf/requests/releases)
- [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md)
- [Commits](https://github.com/psf/requests/compare/v2.28.1...v2.31.0)

---
updated-dependencies:
- dependency-name: requests
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-08-09 19:41:09 +08:00
Tong Gao
e6194df29e
[Fix] Use a copy of the config object in Task (#174) 2023-08-09 15:24:49 +08:00
Haodong Duan
d5d4f47371
[API] Refine OpenAI (#175) 2023-08-09 12:38:57 +08:00
Zaida Zhou
af436f5951
[Feature] Calculate max_out_len without hard code for OpenAI model (#158)
* calculate max_out_len without hard code

* set default value

* update configs

* Update configs/eval_gpt3.5.py

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

---------

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
2023-08-08 15:16:56 +08:00
Yuan Liu
2f1949e7a1
[Feature]: Add mm support for local (#169) 2023-08-08 14:21:58 +08:00
Songyang Zhang
5b80d83866
[Docs] update readme (#165) 2023-08-08 12:49:04 +08:00
Haodong Duan
6ca2be6626
[Script] Add scripts to evaluate MMBench (#161)
* update

* update

* Update README.md

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>

* refine

* update default

* update CN README

---------

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
2023-08-07 16:53:36 +08:00
Tong Gao
1bab316624
update internal readme (#162) 2023-08-07 14:27:15 +08:00
Tong Gao
bbdedc6c95
[Enhancement] Optimize OpenAI models (#128)
* [Feature] Enhance OpenAI API, add example config for GPT evaluation
2023-08-03 14:55:16 +08:00
Haodong Duan
d17a5b94fa
[Refine] Refine PR #122 (#123)
* update

* update
2023-08-03 14:54:38 +08:00
Yuan Liu
191a3f6f9d
[Feature]: Use multimodal (#73)
* [Feature]: Add minigpt-4

* [Feature]: Add mm local runner

* [Feature]: Add instructblip

* [Feature]: Delete redundant file

* [Feature]: Delete redundant file

* [Feature]: Add README to InstructBLIP

* [Feature]: Update MiniGPT-4

* [Fix]: Fix lint

* [Feature] add omnibenchmark readme (#49)

* add omnibenchmark readme

* fix

* Update OmniMMBench.md

* Update OmniMMBench.md

* Update OmniMMBench.md

* [Fix]: Refine name (#54)

* [Feature]: Unify out and err

* [Fix]: Fix lint

* [Feature]: Rename to mmbench and change weight path

* [Feature]: Delete Omni in instructblip

* [Feature]: Check the availability of lavis

* [Fix]: Fix lint

* [Feature]: Refactor MM

* [Refactor]: Refactor path

* [Feature]: Delete redundant files

* [Refactor]: Delete redundant files

---------

Co-authored-by: Wangbo Zhao(黑色枷锁) <56866854+wangbo-zhao@users.noreply.github.com>
2023-08-03 11:07:50 +08:00
Zaida Zhou
289e0567bd
Fix typo in readme (#152) 2023-08-02 19:01:39 +08:00
Leymore
bbe45c68a3
[Doc] update acknowledgements (#147) 2023-08-02 10:16:53 +08:00
Tong Gao
8b163bd8e9
[Feature] Several enhancements (#142) 2023-08-01 18:19:49 +08:00
Tong Gao
c00179d46b
[Feature] Evaluating acc based on minimum edit distance, update SIQA (#130)
* [Feature] Support evaluating acc based on minimum edit distance, update SIQA

* update
2023-08-01 14:24:27 +08:00
Ezra-Yu
e9b7b8ab02
[DOC] Add metric doc (#118)
* update

* update

* update metric docs

* update index.rst

* update metrics
2023-08-01 11:47:04 +08:00
Songyang Zhang
d860b61d04
[Enhancement] Update README.md (#119)
* Update README.md

* update README_zh-CN.md

* update get_started

---------

Co-authored-by: Leymore <zfz-960727@163.com>
2023-07-31 18:26:46 +08:00
Leymore
262ab794fb
[Docs] Update prompt docs (#46)
* [Docs] Update prompt docs

* update

* [Docs] Prompt docs (#112)

* update docs

* update

* update

* Update en prompt template

* Update en prompt doc

* fix

* fix

---------

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
2023-07-29 00:46:13 +08:00
Anakin Skywalker
e04f88424d
edit doc (#125) 2023-07-28 17:33:51 +08:00
Leymore
d862f570aa
[Feature] Add SC (#126)
* add self-consistency

* add CoT method Self-Consistency

* fix typo error and update openicl_eval

* add tydiQA-GoldP task

* fix sc

* rename gsm8k_sc

* fix sc

* add self-consistency doc

* refine sc

---------

Authored-by: liushz <qq1791167085@163.com>
2023-07-28 17:29:37 +08:00