OpenCompass/docs/en/advanced_guides
Mo Li acae560911
Added support for multi-needle testing in needle-in-a-haystack test (#802)
* Add NeedleInAHaystack Test

* Apply pre-commit formatting

* Update configs/eval_hf_internlm_chat_20b_cdme.py

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>

* add needle in haystack test

* update needle in haystack test

* update plot function in tools_needleinahaystack.py

* optimizing needleinahaystack dataset generation strategy

* modify minor formatting issues

* add English version support

* change NeedleInAHaystackDataset to dynamic loading

* fix needleinahaystack test eval bug

* fix needleinahaystack config bug

* Added support for multi-needle testing in needle-in-a-haystack test

* Optimize the code for plotting in the needle-in-a-haystack test.

* Correct the typo in the dataset parameters.

* update needleinahaystack test docs

---------

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
2024-01-17 13:47:34 +08:00
circular_eval.md [Feature] Add circular eval (#610) 2023-11-23 16:45:47 +08:00
code_eval_service.md [Docs] Update Docker docs (#718) 2023-12-19 23:29:43 +08:00
code_eval.md [Feat] support humaneval and mbpp pass@k (#598) 2023-11-16 21:22:06 +08:00
contamination_eval.md [Docs] Update contamination docs (#775) 2024-01-08 16:37:28 +08:00
custom_dataset.md [Sync] update configs (#734) 2023-12-25 21:59:16 +08:00
evaluation_lightllm.md [Feature] Support Lightllm API (#613) 2023-11-21 19:18:40 +08:00
evaluation_turbomind.md [Feature] Update evaluate turbomind (#804) 2024-01-17 11:09:50 +08:00
longeval.md [Docs] Readme in longeval (#389) 2023-09-18 17:06:00 +08:00
multimodal_eval.md [Docs] Add multimodal docs (#334) 2023-09-22 18:58:29 +08:00
needleinahaystack_eval.md Added support for multi-needle testing in needle-in-a-haystack test (#802) 2024-01-17 13:47:34 +08:00
new_dataset.md [Docs] update invalid link in docs (#499) 2023-10-25 13:15:42 +08:00
new_model.md [Docs] add en docs (#15) 2023-07-06 12:58:44 +08:00
prompt_attack.md [Feat] implementation for support promptbench (#239) 2023-09-15 15:06:53 +08:00
subjective_evaluation.md [Feature] Add JudgeLLMs (#710) 2023-12-19 18:40:25 +08:00