Commit Graph

144 Commits

Author SHA1 Message Date
Leymore
a1782f9a08
[Fix] triviaqa & nq postprocess (#350) 2023-09-04 15:24:52 +08:00
Tong Gao
ce65d3393b
[Sync] Use finally to clean up temp files (#337) 2023-09-04 15:20:16 +08:00
Yixiao Fang
2cd994c3d1
[Fix] add import check of multimodal (#352) 2023-09-04 14:41:07 +08:00
Leymore
8774465a8f
[Enhancement] ignore ZeroRetriever error when id_list provided (#340) 2023-09-04 11:12:16 +08:00
Yuanhan Zhang
f2dd98ca7a
[Feat] Support LLaVA and mPLUG-Owl (#331)
* refine gitignore

* [Feature]: Add minigpt-4

* [Feature]: Add mm local runner

* [Feature]: Add instructblip

* add otter and llama-adapter

* add owl

* add llama2-adapter and owl

* lint

* [Feature]: Add minigpt-4

* [Feature]: Add instructblip

* add otter and llama-adapter

* add owl

* add llama2-adapter and owl

* lint

* lint

* update

* lint

* lint

* add __init__.py

* update

* update

* update

---------

Co-authored-by: liuyuan <3463423099@qq.com>
2023-09-01 23:32:05 +08:00
Leymore
e810974068
[Fix] Fix when missing both pad and eos token (#287)
* fix when missing both pad and eos token

* update pad_token_id impl
2023-08-31 16:53:39 +08:00
Li Bo
a4d6840739
[Feat] Add Otter to OpenCompass MMBench Evaluation (#232)
* add otter model for opencompass mmbench

* add docs

* add readme docs

* debug for otter opencompass eval

* delete unused folders

* change to default data path

* remove unused files

* remove unused files

* update

* update config file

* flake8 lint formatted and add prompt generator

* add prompt generator to config

* add a specific postprocess

* add post processor

* add post processor

* add post processor

* update according to suggestions

* remove unused redefinition
2023-08-31 12:55:53 +08:00
Leymore
7ca6ba625e
[Feature] Add qwen & qwen-chat support (#286)
* add and apply update suffix tool

* add tool doc

* add qwen configs

* add cmmlu

* rename bbh

* update datasets

* delete

* update hf_qwen_7b.py
2023-08-31 11:29:05 +08:00
Hubert
fd389e2d78
[Feat] support codellama and preds collection tools (#335) 2023-08-31 11:14:42 +08:00
Tong Gao
9058be07b8
[Feature] Simplify entry script (#204)
* [Feature] Simplify entry script

* update
2023-08-25 17:36:30 +08:00
Tong Gao
f480b72703
[Feature] Support model-bound prediction postprocessor, use it in Claude (#268)
* [Feature] Support model-bound text postprocessor, add claude as an example

* update

* update

* minor fix

---------

Co-authored-by: zhoufengzhe <zhoufengzhe@pjlab.org.cn>
2023-08-25 16:12:21 +08:00
Yike Yuan
3f601f420b
[Feat] Support public dataset of visualglm and llava. (#265)
* [Feat] Add public dataset support of VisualGLM.

* [Feat] Refactor LLaVA.

* [Feat] Add public dataset support of LLaVA.

* [Fix] Add  arg.
2023-08-25 15:44:32 +08:00
Yuan Liu
dc6e54f6f4
[Feature]: Verify the acc of these public datasets (#269)
* [Feature]: Refactor public dataset eval

* [Feature]: Verify public dataset acc
2023-08-25 15:01:58 +08:00
philipwangOvO
3f37c40aa3
[Dataset] Refactor LEval 2023-08-25 11:46:23 +08:00
Tong Gao
60c2d3d76b
[Feature] Add Claude support (#253)
* [Feature] Add Claude support

* [Feature] Add Claude support

* Update opencompass/models/claude_api.py

Co-authored-by: Hubert <42952108+yingfhu@users.noreply.github.com>

* raise import error

---------

Co-authored-by: Hubert <42952108+yingfhu@users.noreply.github.com>
2023-08-24 14:29:45 +08:00
Yuan Liu
343f785b07
[Feature]: Add Flamingo (#258)
* [Feature]: Add Openflamingo MMBench

* [Fix]: Fix import error

* [Fix]: Revert task config

* [Fix]: Fix path bug
2023-08-24 14:11:29 +08:00
LZHgrla
77745a84ea
[Fix] Fix bugs for PeftModel generate (#252)
* fix bugs

* fix typo
2023-08-24 14:07:33 +08:00
Tong Gao
bd47a00f27
[Fix] use sympy only when necessary (#255) 2023-08-24 10:15:20 +08:00
Tong Gao
01372a4806
update (#251) 2023-08-23 16:25:23 +08:00
Yixiao Fang
1034c487ef
[Refactor] Refactor instructblip (#227)
* refactor instructblip

* add post processor

* add forward

* fix lint

* update

* update
2023-08-23 15:33:59 +08:00
liushz
02ce139bc6
[Feature] Add Tree-of-Thought method (#173)
* Add ToT method

* Update ToT

* Update ToT

* Update ToT

* Update ToT

* Update ToT

* Update ToT

* Update ToT

* Update chain_of_thought.md

* Update icl_tot_inferencer.py

---------

Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>
2023-08-23 12:23:05 +08:00
Leymore
ff5ab92331
[Feature] Add llama2 native implements (#235)
* add llama2 native implements

* rename configs/eval_llama_7b.py

---------

Co-authored-by: zhoufengzhe <zhoufengzhe@pjlab.org.cn>
2023-08-23 11:33:25 +08:00
Leymore
fdc69f9d58
[Fix] local runner debug (#238) 2023-08-21 16:58:36 +08:00
Yike Yuan
8d368d1cd6
[Feat] Support visualglm and llava for MMBench evaluation. (#211)
* [Feat] Support visualglm inference on MMBench.

* [Feat] Support llava inference on MMBench.

* [Fix] Fix pre-commit format.

* [Fix] Add docstring for llava

* [Fix] Fix multi-process inference error of LLaVA and add comments.
1. Set `low_cpu_mem_usage` to False to address device issue.
2. Add docstring and type hints.
3. Rename class and remove registry.

* [Fix] Pre-commit fix.

* [Fix] add forward entry, add dynamic import to seedbench

* [Fix] Fix pre-commit.

* [Fix] Fix missing context.

* [Fix] Fix docstring.
2023-08-21 15:57:30 +08:00
Yike Yuan
a6552224cb
[Feat] Support multi-modal evaluation on MME benchmark. (#197)
* [Feat] Support multi-modal evaluation on MME benchmark.

* [Fix] Remove debug code.

* [Fix] Remove redundant codes and add type hints.

* [Fix] Rename in config.

* [Fix] Rebase main.

* [Fix] Fix isort and yapf conflict.
2023-08-21 15:53:20 +08:00
philipwangOvO
3b29aaee2b
[Fix] bin_trim (#237)
Co-authored-by: wangchonghua <wangchonghua@pjlab.org.cn>
2023-08-21 15:44:49 +08:00
philipwangOvO
655a807f4b
[Dataset] LongBench (#236)
Co-authored-by: wangchonghua <wangchonghua@pjlab.org.cn>
2023-08-21 14:15:20 +08:00
Yixiao Fang
0fa2482661
[Feature] Support SEED-Bench (#203)
* support seedbench

* update docstrings

* update

* update

* update

* update according to review

* rebase

* fix lint

* update
2023-08-17 17:24:02 +08:00
Ezra-Yu
17ccaa5980
[Feat] Add codegeex2 and Humanevalx (#210)
* add codegeex2

* add humanevalx dataset

* add evaluator

* update evaluator

* update configs

* update clean code

* update configs

* fix lint

* remove sleep

* fix lint

* update docs

* fix lint
2023-08-17 11:03:16 +08:00
Hubert
0fe2366a72
[Feat] support adv_glue dataset for adversarial robustness (#205)
* [Feat] support adv_glue dataset for adversarial robustness

* reorg files

* minor fix

* minor fix
2023-08-16 18:42:06 +08:00
Yuan Liu
78df9bd0cb
[Feature]: Add other public datasets (#206)
* [Feature]: Refactor class name

* [Feature]: Add minigpt-4 coco caption

* [Feature]: Update minigpt-4 coco caption

* [Feature]: Add MiniGPT-4 ScienceQA

* [Feature]: Add minigpt-4 vqav2

* [Feature]: Add VSR

* [Feature]: Revert task to previous version
2023-08-16 11:37:26 +08:00
Yike Yuan
3a46b6c64f
[Fix] Fix bugs of multiple rounds of inference when using mm_eval (#201) 2023-08-16 11:15:11 +08:00
Hubert
7c393192af
[Fix] fix bug for postprocessor (#195)
* [Fix] fix bug for postprocessor

* minor fix
2023-08-11 18:41:12 +08:00
Tong Gao
10cbc2b175
Bump version to 0.1.2 (#190) 2023-08-11 17:43:14 +08:00
Tong Gao
bf79ff1c6d
[Feature] Add LEval datasets
Co-authored-by: kennymckormick <dhd@pku.edu.cn>
2023-08-11 17:38:31 +08:00
Hubert
8d9cee060f
[Feat] update postprocessor to get first option more accurately (#193)
* [Feat] update postprocessor to get first option

* minor fix

* minor fix
2023-08-11 17:33:00 +08:00
Leymore
14332e08fd
[Feature] add llama-oriented dataset configs (#82)
* add llama-oriented dataset configs

* update

* revert cvalues & update llama_example
2023-08-11 12:48:05 +08:00
Hubert
5a9539f375
[Feat] add safety to collections (#185)
* [Feat] add safety to collections

* minor fix
2023-08-11 11:19:26 +08:00
Zaida Zhou
f4c70ba6c3
[Feature] Support filtering specified levels message (#187)
* Support filtering message

* minor fix
2023-08-11 10:46:46 +08:00
Zaida Zhou
f256abffd3
[Enhancement] Skip invalid keys to avoid requesting API (#184)
* Skip invalid keys to avoid requesting API

* get expected key

* print warning info
2023-08-10 18:41:43 +08:00
Ma Zerun
59bf56349c
[Feature] Support CUDA_VISIBLE_DEVICES and multiple tasks on one GPU (#148)
* [Feature] Support CUDA_VISIBLE_DEVICES and multiple tasks on one GPU

* Fix UT

* Update according to comments
2023-08-10 16:53:03 +08:00
Tong Gao
312095de9d
[Fix] meta template & unit tests (#170) 2023-08-10 16:49:13 +08:00
liushz
ed248af136
[Fix] Fix some sc errors (#177)
* Update sc

* Update sc doc

* Apply suggestions from code review

Co-authored-by: Hubert <42952108+yingfhu@users.noreply.github.com>

---------

Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>
Co-authored-by: Hubert <42952108+yingfhu@users.noreply.github.com>
2023-08-10 16:40:32 +08:00
Tong Gao
2931f3dcb8
[Enhancement] Add humaneval postprocessor for GPT models & eval config for GPT4, enhance the original humaneval postprocessor (#129)
* [Enhancement] Enhance humaneval postprocessor

* add human-eval testcase

* update

* update

---------

Co-authored-by: Leymore <zfz-960727@163.com>
2023-08-10 16:31:12 +08:00
Songyang Zhang
3f36db3b06
[Feature] Support turbomind (#166)
* support turbomind

* update doc

* Update docs/en/advanced_guides/evaluation_turbomind.md

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* Update docs/zh_cn/advanced_guides/evaluation_turbomind.md

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* Update docs/zh_cn/advanced_guides/evaluation_turbomind.md

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* Update docs/en/advanced_guides/evaluation_turbomind.md

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* update

---------

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
2023-08-10 16:25:11 +08:00
Leymore
e7fc54baf1
[Feature] Add Xiezhi SQuAD2.0 ANLI (#101)
* add Xiezhi SQuAD2.0 ANLI; update WSC

* update

* update

* update doc string
2023-08-10 14:04:18 +08:00
Yuan Liu
a205629ff3
[Feature]: Refactor input and output (#176)
* [Feature]: Refactor input and output

* [Feature]: Update tasks
2023-08-10 14:01:28 +08:00
Leymore
876ade71a5
[Fix] Fix AGIEval multiple choice (#137)
* update agieval data

* rename variables
2023-08-10 11:38:24 +08:00
Tong Gao
e6194df29e
[Fix] Use a copy of the config object in Task (#174) 2023-08-09 15:24:49 +08:00
Haodong Duan
d5d4f47371
[API] Refine OpenAI (#175) 2023-08-09 12:38:57 +08:00
Zaida Zhou
af436f5951
[Feature] Calculate max_out_len without hard code for OpenAI model (#158)
* calculate max_out_len without hard code

* set default value

* update configs

* Update configs/eval_gpt3.5.py

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

---------

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
2023-08-08 15:16:56 +08:00
Yuan Liu
2f1949e7a1
[Feature]: Add mm support for local (#169) 2023-08-08 14:21:58 +08:00
Tong Gao
bbdedc6c95
[Enhancement] Optimize OpenAI models (#128)
* [Feature] Enhance OpenAI API, add example config for GPT evaluation
2023-08-03 14:55:16 +08:00
Haodong Duan
d17a5b94fa
[Refine] Refine PR #122 (#123)
* update

* update
2023-08-03 14:54:38 +08:00
Yuan Liu
191a3f6f9d
[Feature]: Use multimodal (#73)
* [Feature]: Add minigpt-4

* [Feature]: Add mm local runner

* [Feature]: Add instructblip

* [Feature]: Delete redundant file

* [Feature]: Delete redundant file

* [Feature]: Add README to InstructBLIP

* [Feature]: Update MiniGPT-4

* [Fix]: Fix lint

* [Feature] add omnibenchmark readme (#49)

* add omnibenchmark readme

* fix

* Update OmniMMBench.md

* Update OmniMMBench.md

* Update OmniMMBench.md

* [Fix]: Refine name (#54)

* [Feature]: Unify out and err

* [Fix]: Fix lint

* [Feature]: Rename to mmbench and change weight path

* [Feature]: Delete Omni in instructblip

* [Feature]: Check the availability of lavis

* [Fix]: Fix lint

* [Feature]: Refactor MM

* [Refactor]: Refactor path

* [Feature]: Delete redundant files

* [Refactor]: Delete redundant files

---------

Co-authored-by: Wangbo Zhao(黑色枷锁) <56866854+wangbo-zhao@users.noreply.github.com>
2023-08-03 11:07:50 +08:00
Tong Gao
8b163bd8e9
[Feature] Several enhancements (#142) 2023-08-01 18:19:49 +08:00
Tong Gao
c00179d46b
[Feature] Evaluating acc based on minimum edit distance, update SIQA (#130)
* [Feature] Support evaluating acc based on minimum edit distance, update SIQA

* update
2023-08-01 14:24:27 +08:00
Leymore
d862f570aa
[Feature] Add SC (#126)
* add self-consistency

* add CoT method Self-Consistency

* fix typo error and update openicl_eval

* add tydiQA-GoldP task

* fix sc

* rename gsm8k_sc

* fix sc

* add self-consistency doc

* refine sc

---------

Authored-by: liushz <qq1791167085@163.com>
2023-07-28 17:29:37 +08:00
Haodong Duan
538b439302
[Fix] Fix seed in HFEvaluator (#122) 2023-07-28 11:29:01 +08:00
Haodong Duan
46c9645753
[Feature] Allow explicitly setting the temperature for API model (#121)
* allow explicitly setting the temperature

* update
2023-07-28 11:28:15 +08:00
gowithme
57fcfc975a
[Feature] Support intern language model (#51)
* support internLM

* support internLM

* simplify intern model files

* update storage_manager

* support internLM

* Modify the file organization structure

* support internLM

* support internLM

* support internLM

* support internLM

* change some details
2023-07-27 18:49:36 +08:00
Hubert
b7184e9db5
[Refactor] Update crows-pairs evaluation (#98)
* [Refactor] Update crows-pairs evaluation

* [Refactor] Update crows-pairs evaluation

* minor
2023-07-26 11:21:32 +08:00
Haonan Li
e9cdb24ddd
[Feature] Add CMMLU dataset (#91)
* add CMMLU

* debug cmmlu

* add slurm args `qos`

* fix format: space before comment

* remove unused variable

* change the location of `answer is`

---------

Co-authored-by: 李浩楠 <lihaonan@lihaonandeMacBook-Air.local>
Co-authored-by: 李浩楠 <haonan.li>
Co-authored-by: Leymore <zfz-960727@163.com>
2023-07-25 10:14:27 +08:00
Haodong Duan
6e885d668b
force utf-8 encoding for all non-dataset fileios (#97) 2023-07-25 10:06:01 +08:00
Leymore
3fe5ee096c
[Feature] Add heuristic size partitioner (#63)
* [Feature] Add heuristic size partitioner

* update
2023-07-20 11:53:24 +08:00
Leymore
eea8b04417
[Feature] Add llama-2 models (#81)
* add llama-2 models

* update docs

---------

Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2023-07-19 19:51:29 +08:00
Hubert
f83e125e5a
[Feat] Support CValues Responsibility dataset (#78)
* [Feat] support CValues

* minor fix
2023-07-18 18:45:15 +08:00
LZH
26e2f171f4
[Feature] Support load PEFT adapter for HuggingFace model (#74)
* support peft for HuggingFace model

* add docstring
2023-07-18 16:21:43 +08:00
liushz
f36c0496f3
[Feature] Add tydiqa-goldp (#75)
Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>
2023-07-18 14:54:35 +08:00
Tong Gao
311bf0daa7
[Fix] Fix CI (#70)
* [Fix] Fix CI

* [Fix] Fix CI

* [Fix] Fix CI

* update
2023-07-17 19:10:59 +08:00
Tong Gao
29006e39c0
[Fix] Fix circular import of PromptTemplate (#71) 2023-07-17 19:09:38 +08:00
Tong Gao
1e44541730
[Enhancement] Test linting in CI and fix existing linting errors (#69)
* [Enhancement] Test linting in CI

* fix linting
2023-07-17 15:59:10 +08:00
Leymore
1326aff77e
[Feature] Add logger info and remove dataset bugs (#61)
* Add logger info and remove dataset bugs

* fix typo
2023-07-17 14:26:30 +08:00
Tong Gao
7ee5a86fee
[Feature] Enhance OpenAI API, add example config for GPT evaluation (#53)
* [Feature] Enhance OpenAI API, add example config for GPT evaluation

* fix
2023-07-12 16:43:46 +08:00
Hubert
f5103f93dd
[Feat] add bs for perspective api eval (#50)
* [Feat] add bs for perspective api eval

* fix according to comments

* fix according to comments
2023-07-12 16:26:01 +08:00
Hubert
c8f1d513b2
[Fix] fix clp inferencer (#44) 2023-07-11 14:54:39 +08:00
Tong Gao
0625294e5f
[Fix] Fix OpenICLInferTask (#41) 2023-07-10 16:12:01 +08:00
Ma Zerun
805293a9f2
Auto re-generate port number during retry (#24)
* Auto re-generate port number during retry

* Fix slurm command
2023-07-07 17:25:56 +08:00
Hubert
7f8eee4725
[Docs] add en docs (#15)
* add en docs

* update

---------

Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2023-07-06 12:58:44 +08:00
Leymore
86d5ec3d0f
Update configs (#9)
* Update implements

* Update
2023-07-06 12:27:41 +08:00
Hubert
5c19c8c5fc
[Docs] add issue and pr template (#12)
* [Feat] add issue and pr template

* minor add utils

* minor fix
2023-07-06 11:55:01 +08:00
Tong Gao
986a44cedd
Translate lark messages (#5) 2023-07-05 18:40:05 +08:00
Tong Gao
719ba34d1b
[Enhancement] Update prompt hash computation (#2) 2023-07-05 18:29:07 +08:00
Ma Zerun
5840c7655c
Update start guide (#4) 2023-07-05 18:26:26 +08:00
yuzhaohui
dcf11cf8fd
New logo and update setup.py 2023-07-05 06:54:06 +00:00
mzr1996
04dd01a235
Update configs and code 2023-07-05 11:45:08 +08:00
Leymore
c94cc94348
Add release contribution 2023-07-05 03:15:31 +00:00
tonysy
e6b5bdcb87
OpenCompass Public MR 2023-07-05 03:15:21 +00:00
Ezra-Yu
cbe9fe2cdb
Add Release Contribution 2023-07-05 02:22:40 +00:00
cky
36f111100f
update datasets 2023-07-05 01:45:26 +00:00
mzr1996
3cfe73de3f
Support a batch of datasets. 2023-07-05 01:30:27 +00:00
kennymckormick
78478e961e
[Code] Update opencompass/datasets/agieval/__init__.py 2023-07-05 00:28:07 +00:00
yingfhu
fb11108723
[Feat] support opencompass 2023-07-04 22:11:33 +08:00
gaotongxiao
7d346000bb
initial commit 2023-07-04 21:34:55 +08:00