OpenCompass

mirror of https://github.com/open-compass/opencompass.git synced 2025-05-30 16:03:24 +08:00

Mo Li acae560911 Added support for multi-needle testing in needle-in-a-haystack test (#802)	2024-01-17 13:47:34 +08:00

* Add NeedleInAHaystack Test
* Apply pre-commit formatting
* Update configs/eval_hf_internlm_chat_20b_cdme.py
  Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
* add needle in haystack test
* update needle in haystack test
* update plot function in tools_needleinahaystack.py
* optimizing needleinahaystack dataset generation strategy
* modify minor formatting issues
* add English version support
* change NeedleInAHaystackDataset to dynamic loading
* change NeedleInAHaystackDataset to dynamic loading
* fix needleinahaystack test eval bug
* fix needleinahaystack config bug
* Added support for multi-needle testing in needle-in-a-haystack test
* Optimize the code for plotting in the needle-in-a-haystack test.
* Correct the typo in the dataset parameters.
* update needleinahaystack test docs

---------

Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
circular_eval.md	[Feature] Add circular eval (#610)	2023-11-23 16:45:47 +08:00
code_eval_service.md	[Docs] Update Docker docs (#718)	2023-12-19 23:29:43 +08:00
code_eval.md	[Feat] support humaneval and mbpp pass@k (#598)	2023-11-16 21:22:06 +08:00
contamination_eval.md	[Docs] Update contamination docs (#775)	2024-01-08 16:37:28 +08:00
custom_dataset.md	[Sync] update configs (#734)	2023-12-25 21:59:16 +08:00
evaluation_lightllm.md	[Feature] Support Lightllm API (#613)	2023-11-21 19:18:40 +08:00
evaluation_turbomind.md	[Feature] Update evaluate turbomind (#804)	2024-01-17 11:09:50 +08:00
longeval.md	[Docs] Readme in longeval (#389)	2023-09-18 17:06:00 +08:00
multimodal_eval.md	[Docs] Add multimodal docs (#334)	2023-09-22 18:58:29 +08:00
needleinahaystack_eval.md	Added support for multi-needle testing in needle-in-a-haystack test (#802)	2024-01-17 13:47:34 +08:00
new_dataset.md	[Docs] update invalid link in docs (#499)	2023-10-25 13:15:42 +08:00
new_model.md	[Docs] add en docs (#15)	2023-07-06 12:58:44 +08:00
prompt_attack.md	[Feat] implementation for support promptbench (#239)	2023-09-15 15:06:53 +08:00
subjective_evaluation.md	[Feature] Add JudgeLLMs (#710)	2023-12-19 18:40:25 +08:00