Jingming
7cb53a95fa
[Fix] fix bug on standart_deviation summarizer (#675)
2023-12-08 13:38:07 +08:00
liyucheng09
05bbce8b08
[Feature] Add Data Contamination Analysis (#639)
* add contamination analysis to ceval
* fix bugs
* add contamination docs
* to pass CI check
* update
---------
Co-authored-by: zhangyifan1 <zhangyifan1@pjlab.org.cn>
Co-authored-by: Leymore <zfz-960727@163.com>
2023-12-08 10:00:11 +08:00
bittersweet1999
1c95790fdd
New subjective judgement (#660)
* TabMWP
* TabMWP
* fixed
* fixed
* fixed
* done
* done
* done
* add new subjective judgement
* add new subjective judgement
* add new subjective judgement
* add new subjective judgement
* add new subjective judgement
* modified to a more general way
* modified to a more general way
* final
* final
* add summarizer
* add new summarize
* fixed
* fixed
* fixed
---------
Co-authored-by: caomaosong <caomaosong@pjlab.org.cn>
2023-12-06 13:28:33 +08:00
Fengzhe Zhou
9083dea683
[Sync] some renaming (#641)
2023-11-27 16:06:49 +08:00
Fengzhe Zhou
79f6449d85
[Doc] Update FAQ (#628)
* update faq
* Update docs/zh_cn/get_started/faq.md
* Update docs/en/get_started/faq.md
* Update docs/zh_cn/get_started/faq.md
---------
Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
2023-11-23 18:19:17 +08:00
Fengzhe Zhou
d949e3c003
[Feature] Add circular eval (#610)
* refactor default, add circular summarizer
* add circular
* update impl
* update doc
* minor update
* no more to be added
2023-11-23 16:45:47 +08:00
Jingming
5e75e29711
[Feature] Add multi-prompt generation demo (#568)
* [Feature] Add multi-prompt generation demo
* [Fix] change form in winogrande_gen_XXX.py
* [Fix] make multi-prompt demo more direct
* [Fix] fix bug
* [Fix] minor fix
---------
Co-authored-by: yingfhu <yingfhu@gmail.com>
2023-11-20 16:16:37 +08:00
Fengzhe Zhou
d3de5c41fb
[Sync] update model configs (#574)
2023-11-13 15:15:34 +08:00
Qing
e2355a2ede
[Feature] Add multi model viz (#509)
* add viz_multi_model.py tool
* Modify the viz_multi_model.py script according to the review
* highlight multiple optimal scores
---------
Co-authored-by: wq.chu <wq.chu@tianrang-inc.com>
Co-authored-by: Leymore <zfz-960727@163.com>
2023-10-30 12:11:33 +08:00
Fengzhe Zhou
dbb20b8270
[Sync] update (#517)
2023-10-27 20:31:22 +08:00
Leymore
fbf5089c40
[Sync] update github token (#475)
2023-10-13 06:50:54 -05:00