LongSafety: Enhance Safety for Long-Context LLMs
- URL: http://arxiv.org/abs/2411.06899v2
- Date: Thu, 27 Feb 2025 13:08:46 GMT
- Title: LongSafety: Enhance Safety for Long-Context LLMs
- Authors: Mianqiu Huang, Xiaoran Liu, Shaojun Zhou, Mozhi Zhang, Qipeng Guo, Linyang Li, Chenkun Tan, Yang Gao, Pengyu Wang, Linlin Li, Qun Liu, Yaqian Zhou, Xipeng Qiu, Xuanjing Huang
- Abstract summary: We introduce LongSafety, a safety alignment dataset for long-context large language models (LLMs). Our experiments demonstrate that training with LongSafety improves long-context safety performance while also enhancing short-context safety and preserving general capabilities.
- Score: 85.52121220707822
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advancements in model architectures and length extrapolation techniques have significantly extended the context length of large language models (LLMs), paving the way for their application in increasingly complex tasks. However, despite the growing capabilities of long-context LLMs, safety in long-context scenarios remains underexplored: while safety alignment in short contexts has been widely studied, the safety concerns of long-context LLMs have not been adequately addressed. In this work, we introduce LongSafety, a comprehensive safety alignment dataset for long-context LLMs, containing 10 tasks and 17k samples with an average length of 40.9k tokens. Our experiments demonstrate that training with LongSafety improves long-context safety performance while also enhancing short-context safety and preserving general capabilities. Furthermore, we show that aligning a model with short-context safety data alone does not yield long-context safety, and that LongSafety generalizes across context lengths and long-context safety scenarios.
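To make the dataset's scale concrete, the sketch below shows one way such long-context alignment samples might be loaded and filtered by length. The JSONL layout, field names, and crude whitespace token count are illustrative assumptions, not details from the paper.

```python
import json

# Hypothetical record layout for a long-context safety alignment sample;
# the field names are assumptions, not the paper's actual schema:
# {"task": "...", "context": "...", "prompt": "...", "response": "..."}

def load_long_samples(path, min_tokens=32_000):
    """Yield samples whose context meets a minimum length, using a crude
    whitespace split as a stand-in for a real tokenizer."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            sample = json.loads(line)
            if len(sample["context"].split()) >= min_tokens:
                yield sample
```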
Related papers
- Flora: Effortless Context Construction to Arbitrary Length and Scale [71.12886910497284]
We introduce Flora, an effortless (human/LLM-free) long-context construction strategy. Experiments on Llama3-8B-Instruct and QwQ-32B show that Flora excels on three long-context benchmarks while maintaining strong performance on short-context tasks.
arXiv Detail & Related papers (2025-07-26T04:21:21Z) - What Really Matters in Many-Shot Attacks? An Empirical Study of Long-Context Vulnerabilities in LLMs [19.604065692511416]
We investigate long-context vulnerabilities in Large Language Models (LLMs) through Many-Shot Jailbreaking (MSJ). Our experiments utilize context lengths of up to 128K tokens. We find that successful attacks do not require carefully crafted harmful content.
arXiv Detail & Related papers (2025-05-26T09:57:25Z) - Scaling Instruction-Tuned LLMs to Million-Token Contexts via Hierarchical Synthetic Data Generation [15.975325252309554]
We introduce a novel post-training synthetic data generation strategy designed to efficiently extend the context window of Large Language Models.
Our approach scalably extends to arbitrarily long context lengths, unconstrained by the length of available real-world data.
We demonstrate that our model, with a context length of up to 1M tokens, performs well on the RULER benchmark and InfiniteBench.
arXiv Detail & Related papers (2025-04-17T04:46:57Z) - InfiniteICL: Breaking the Limit of Context Window Size via Long Short-term Memory Transformation [57.310236384112834]
In-context learning (ICL) is critical for large language models (LLMs), but its effectiveness is constrained by finite context windows.
We introduce InfiniteICL, a framework that parallels context and parameters in LLMs with short- and long-term memory.
We demonstrate that our method reduces context length by 90% while achieving 103% of the average performance of full-context prompting.
arXiv Detail & Related papers (2025-04-02T13:15:44Z) - LongSafety: Evaluating Long-Context Safety of Large Language Models [95.2469116388522]
LongSafety is the first benchmark designed to evaluate safety in open-ended long-context tasks.
Our evaluation reveals significant safety vulnerabilities, with most models achieving safety rates below 55%.
Our findings emphasize the unique challenges and urgency of improving long-context safety.
arXiv Detail & Related papers (2025-02-24T08:54:39Z) - LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization [49.37607974207405]
LongPO harnesses short-to-long preference data to transfer short-context capabilities to long-context tasks.
LongPO fully retains short-context performance and largely outperforms naive SFT and DPO in both long- and short-context tasks.
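To picture what short-to-long preference data could look like, here is a hedged sketch of a single preference record in a DPO-style layout; the field names and construction are assumptions, not LongPO's exact recipe.

```python
def make_short_to_long_pair(long_context, question, chosen, rejected):
    """Build a DPO-style preference record for long-context training.

    `chosen` is assumed to be a response distilled from the model's own
    short-context behavior (its strength); `rejected` is its weaker
    direct long-context response. Layout is illustrative only.
    """
    return {
        "prompt": f"{long_context}\n\nQuestion: {question}",
        "chosen": chosen,
        "rejected": rejected,
    }
```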
arXiv Detail & Related papers (2025-02-19T17:59:03Z) - LIFT: Improving Long Context Understanding Through Long Input Fine-Tuning [35.31849814789343]
This paper introduces Long Input Fine-Tuning (LIFT) for long context modeling.
LIFT enables efficient processing of lengthy inputs without the computational burden of offline long-context adaptation.
The framework is further enhanced by integrating in-context learning and pre-LIFT supervised fine-tuning.
arXiv Detail & Related papers (2024-12-18T09:04:55Z) - How Effective Is Self-Consistency for Long-Context Problems? [18.633918831942434]
Self-consistency (SC) has been demonstrated to enhance the performance of large language models (LLMs).
This study examines the role of SC in long-context scenarios, where LLMs often struggle with position bias.
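For reference, self-consistency itself is a simple sampling-and-voting procedure. Below is a minimal sketch assuming a hypothetical `generate` callable standing in for any sampling-enabled LLM API; it illustrates the general technique, not this paper's experimental setup.

```python
from collections import Counter

def self_consistency(generate, prompt, n_samples=8, temperature=0.7):
    """Majority-vote over several independently sampled answers.

    `generate` is a hypothetical callable (prompt, temperature=...) ->
    final-answer string; wrap any sampling-enabled LLM client to fit.
    """
    answers = [generate(prompt, temperature=temperature) for _ in range(n_samples)]
    # The most frequent final answer across sampled reasoning paths wins.
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes / n_samples
```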
arXiv Detail & Related papers (2024-11-02T01:52:42Z) - LongReward: Improving Long-context Large Language Models with AI Feedback [54.3321542678909]
LongReward is a novel method that provides rewards for long-context model responses from four human-valued dimensions.
Our experiments indicate that LongReward not only significantly improves models' long-context performance but also enhances their ability to follow short instructions.
arXiv Detail & Related papers (2024-10-28T17:50:42Z) - Multimodal Situational Safety [73.63981779844916]
We present the first evaluation and analysis of a novel safety challenge termed Multimodal Situational Safety.
For an MLLM to respond safely, whether through language or action, it often needs to assess the safety implications of a language query within its corresponding visual context.
We develop the Multimodal Situational Safety benchmark (MSSBench) to assess the situational safety performance of current MLLMs.
arXiv Detail & Related papers (2024-10-08T16:16:07Z) - Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA [71.04146366608904]
Long-context modeling capabilities have garnered widespread attention, leading to the emergence of Large Language Models (LLMs) with ultra-long context windows.
We propose a novel long-context benchmark, Loong, aligning with realistic scenarios through extended multi-document question answering (QA).
Loong introduces four types of tasks with a range of context lengths: Spotlight Locating, Comparison, Clustering, and Chain of Reasoning.
arXiv Detail & Related papers (2024-06-25T09:42:56Z) - Exploring Advanced Methodologies in Security Evaluation for LLMs [16.753146059652877]
Large Language Models (LLMs) represent an advanced evolution of earlier, simpler language models.
They boast enhanced abilities to handle complex language patterns and generate coherent text, images, audio, and video.
The rapid expansion of LLMs has raised security and ethical concerns within the academic community.
arXiv Detail & Related papers (2024-02-28T01:32:58Z) - Training With "Paraphrasing the Original Text" Improves Long-Context Performance [19.48556587305737]
As Large Language Models (LLMs) continue to evolve, more are being designed to handle long-context inputs.
We propose a novel approach to design training data for long-context tasks, aiming at augmenting LLMs' proficiency in extracting key information from long context.
In experiments on LongBench and the NaturalQuestions Multi-document-QA dataset with models from the Llama and Qwen series, our method achieves improvements of up to 8.48% and 4.48% in average scores, respectively.
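As an illustration of the data-design idea this summary describes, here is a minimal sketch of a training sample whose target paraphrases the relevant passage before answering; the template and field names are assumptions, not the paper's actual format.

```python
def build_paraphrase_sample(documents, question, paraphrase, answer):
    """Assemble a multi-document QA training sample whose target first
    restates the relevant passage, then answers (illustrative template)."""
    context = "\n\n".join(documents)
    prompt = f"{context}\n\nQuestion: {question}"
    # Paraphrasing the evidence before answering pushes the model to
    # locate and restate key information buried in the long context.
    target = f"Paraphrased evidence: {paraphrase}\nAnswer: {answer}"
    return {"prompt": prompt, "target": target}
```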
arXiv Detail & Related papers (2023-12-18T13:40:16Z) - BAMBOO: A Comprehensive Benchmark for Evaluating Long Text Modeling Capacities of Large Language Models [141.21603469555225]
Large language models (LLMs) have achieved dramatic proficiency on NLP tasks of normal length.
We propose BAMBOO, a multi-task long context benchmark.
It consists of 10 datasets from 5 different long text understanding tasks.
arXiv Detail & Related papers (2023-09-23T11:36:15Z) - SafetyBench: Evaluating the Safety of Large Language Models [54.878612385780805]
SafetyBench is a comprehensive benchmark for evaluating the safety of Large Language Models (LLMs).
It comprises 11,435 diverse multiple choice questions spanning 7 distinct categories of safety concerns.
Our tests over 25 popular Chinese and English LLMs in both zero-shot and few-shot settings reveal a substantial performance advantage for GPT-4 over its counterparts.
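For context, scoring a SafetyBench-style multiple-choice evaluation reduces to per-category accuracy. The sketch below assumes a hypothetical item schema and `predict` callable; it is not SafetyBench's actual interface.

```python
from collections import defaultdict

def score_by_category(items, predict):
    """items: dicts with 'question', 'options', 'answer', 'category';
    predict: hypothetical callable returning one option label."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for item in items:
        total[item["category"]] += 1
        if predict(item["question"], item["options"]) == item["answer"]:
            correct[item["category"]] += 1
    # Per-category accuracy, e.g. across 7 safety concern categories.
    return {cat: correct[cat] / total[cat] for cat in total}
```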
arXiv Detail & Related papers (2023-09-13T15:56:50Z) - LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding [58.20031627237889]
LongBench is the first bilingual, multi-task benchmark for long context understanding.
It comprises 21 datasets across 6 task categories in both English and Chinese, with an average length of 6,711 words (English) and 13,386 characters (Chinese).
arXiv Detail & Related papers (2023-08-28T11:53:40Z)