OpenCompass/tools
Mo Li acae560911
Added support for multi-needle testing in needle-in-a-haystack test (#802)
* Add NeedleInAHaystack Test

* Apply pre-commit formatting

* Update configs/eval_hf_internlm_chat_20b_cdme.py

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>

* add needle in haystack test

* update needle in haystack test

* update plot function in tools_needleinahaystack.py

* optimizing needleinahaystack dataset generation strategy

* modify minor formatting issues

* add English version support

* change NeedleInAHaystackDataset to dynamic loading

* fix needleinahaystack test eval bug

* fix needleinahaystack config bug

* Added support for multi-needle testing in needle-in-a-haystack test

* Optimize the code for plotting in the needle-in-a-haystack test.

* Correct the typo in the dataset parameters.

* update needleinahaystack test docs

---------

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
2024-01-17 13:47:34 +08:00
File                        Last commit                                                                 Date
case_analyzer.py            [Docs] update descriptions for tools (#270)                                 2023-08-25 16:00:26 +08:00
collect_code_preds.py       [Feat] support wizardcoder series (#344)                                    2023-09-06 17:52:35 +08:00
convert_alignmentbench.py   [Fix] Fix small bug in alignbench (#764)                                    2024-01-03 07:44:53 +00:00
eval_mmbench.py             [Script] Add scripts to evaluate MMBench (#161)                             2023-08-07 16:53:36 +08:00
list_configs.py             [Feature] Simplify entry script (#204)                                      2023-08-25 17:36:30 +08:00
prediction_merger.py        [Docs] update descriptions for tools (#270)                                 2023-08-25 16:00:26 +08:00
prompt_viewer.py            [Sync] some renaming (#641)                                                 2023-11-27 16:06:49 +08:00
test_api_model.py           initial commit                                                              2023-07-04 21:34:55 +08:00
tools_needleinahaystack.py  Added support for multi-needle testing in needle-in-a-haystack test (#802)  2024-01-17 13:47:34 +08:00
update_dataset_suffix.py    [Sync] Sync with internal codes 2023.01.08 (#777)                           2024-01-08 14:07:24 +00:00
viz_multi_model.py          [Feature] Add multi model viz (#509)                                        2023-10-30 12:11:33 +08:00