Regular Expression Denial of Service Induced by Backreferences
- URL: http://arxiv.org/abs/2602.21459v1
- Date: Wed, 25 Feb 2026 00:23:50 GMT
- Title: Regular Expression Denial of Service Induced by Backreferences
- Authors: Yichen Liu, Berk Çakar, Aman Agrawal, Minseok Seo, James C. Davis, Dongyoon Lee,
- Abstract summary: This paper presents the first systematic study of denial-of-service vulnerabilities in Regular Expressions with Backreferences (REwB)<n>Using the Two-Phase Memory Automaton (2PMFA), we derive necessary conditions under which backreferences induce super-linear backtracking runtime.<n>We identify three vulnerability patterns, develop detection and attack-construction algorithms, and validate them in practice.
- Score: 13.04731556594332
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents the first systematic study of denial-of-service vulnerabilities in Regular Expressions with Backreferences (REwB). We introduce the Two-Phase Memory Automaton (2PMFA), an automaton model that precisely captures REwB semantics. Using this model, we derive necessary conditions under which backreferences induce super-linear backtracking runtime, even when sink ambiguity is linear -- a regime where existing detectors report no vulnerability. Based on these conditions, we identify three vulnerability patterns, develop detection and attack-construction algorithms, and validate them in practice. Using the Snort intrusion detection ruleset, our evaluation identifies 45 previously unknown REwB vulnerabilities with quadratic or worse runtime. We further demonstrate practical exploits against Snort, including slowing rule evaluation by 0.6-1.2 seconds and bypassing alerts by triggering PCRE's matching limit.
Related papers
- The Compliance Paradox: Semantic-Instruction Decoupling in Automated Academic Code Evaluation [11.984098021215878]
We introduce the Semantic-Preserving Adrial Code Injection (SPACI) Framework and the Abstract Syntax Tree-Aware Semantic Injection Protocol (AST-ASIP)<n>These methods exploit the Syntax-Semantics Gap by embedding adversarial directives into syntactically inert regions (trivia nodes) of the Abstract Syntax Tree.<n>Through a large-scale evaluation of 9 SOTA models across 25,000 submissions in Python, C, C++, and Java, we reveal catastrophic failure rates (>95%) in high-capacity open-weights models like DeepSeek-V3.
arXiv Detail & Related papers (2026-01-29T07:40:58Z) - Defenses Against Prompt Attacks Learn Surface Heuristics [40.392588465939106]
Large language models (LLMs) are increasingly deployed in security-sensitive applications.<n>LLMs may override intended logic when adversarial instructions appear in user queries or retrieved content.<n>Recent defenses rely on supervised fine-tuning with benign and malicious labels.
arXiv Detail & Related papers (2026-01-12T04:12:48Z) - The Trojan Knowledge: Bypassing Commercial LLM Guardrails via Harmless Prompt Weaving and Adaptive Tree Search [58.8834056209347]
Large language models (LLMs) remain vulnerable to jailbreak attacks that bypass safety guardrails to elicit harmful outputs.<n>We introduce the Correlated Knowledge Attack Agent (CKA-Agent), a dynamic framework that reframes jailbreaking as an adaptive, tree-structured exploration of the target model's knowledge base.
arXiv Detail & Related papers (2025-12-01T07:05:23Z) - RoguePrompt: Dual-Layer Ciphering for Self-Reconstruction to Circumvent LLM Moderation [0.0]
This paper presents an automated jailbreak attack that converts a disallowed user query into a self reconstructing prompt.<n>We instantiate RoguePrompt against GPT 4o and evaluate it on 2 448 prompts that a production moderation system previously marked as strongly rejected.<n>Under an evaluation protocol that separates three security relevant outcomes bypass, reconstruction, and execution the attack attains 84.7 percent bypass, 80.2 percent reconstruction, and 71.5 percent full execution.
arXiv Detail & Related papers (2025-11-24T05:42:54Z) - Backdoor Collapse: Eliminating Unknown Threats via Known Backdoor Aggregation in Language Models [75.29749026964154]
Ourmethod reduces the average Attack Success Rate to 4.41% across multiple benchmarks.<n>Clean accuracy and utility are preserved within 0.5% of the original model.<n>The defense generalizes across different types of backdoors, confirming its robustness in practical deployment scenarios.
arXiv Detail & Related papers (2025-10-11T15:47:35Z) - On the Adversarial Robustness of Learning-based Conformal Novelty Detection [10.58528988397402]
We study the adversarial robustness of conformal novelty detection using AdaDetect.<n>Our results show that adversarial perturbations can significantly increase the FDR while maintaining high detection power.
arXiv Detail & Related papers (2025-10-01T03:29:11Z) - DiffuGuard: How Intrinsic Safety is Lost and Found in Diffusion Large Language Models [50.21378052667732]
We conduct an in-depth analysis of dLLM vulnerabilities to jailbreak attacks across two distinct dimensions: intra-step and inter-step dynamics.<n>We propose DiffuGuard, a training-free defense framework that addresses vulnerabilities through a dual-stage approach.
arXiv Detail & Related papers (2025-09-29T05:17:10Z) - OVERT: A Benchmark for Over-Refusal Evaluation on Text-to-Image Models [91.55634905861827]
Over-refusal is a phenomenon known as $textitover-refusal$ that reduces the practical utility of T2I models.<n>We present OVERT ($textbfOVE$r-$textbfR$efusal evaluation on $textbfT$ext-to-image models), the first large-scale benchmark for assessing over-refusal behaviors.
arXiv Detail & Related papers (2025-05-27T15:42:46Z) - Towards Copyright Protection for Knowledge Bases of Retrieval-augmented Language Models via Reasoning [58.57194301645823]
Large language models (LLMs) are increasingly integrated into real-world personalized applications.<n>The valuable and often proprietary nature of the knowledge bases used in RAG introduces the risk of unauthorized usage by adversaries.<n>Existing methods that can be generalized as watermarking techniques to protect these knowledge bases typically involve poisoning or backdoor attacks.<n>We propose name for harmless' copyright protection of knowledge bases.
arXiv Detail & Related papers (2025-02-10T09:15:56Z) - AdvQDet: Detecting Query-Based Adversarial Attacks with Adversarial Contrastive Prompt Tuning [93.77763753231338]
Adversarial Contrastive Prompt Tuning (ACPT) is proposed to fine-tune the CLIP image encoder to extract similar embeddings for any two intermediate adversarial queries.
We show that ACPT can detect 7 state-of-the-art query-based attacks with $>99%$ detection rate within 5 shots.
We also show that ACPT is robust to 3 types of adaptive attacks.
arXiv Detail & Related papers (2024-08-04T09:53:50Z) - Spatial-Frequency Discriminability for Revealing Adversarial Perturbations [53.279716307171604]
Vulnerability of deep neural networks to adversarial perturbations has been widely perceived in the computer vision community.
Current algorithms typically detect adversarial patterns through discriminative decomposition for natural and adversarial data.
We propose a discriminative detector relying on a spatial-frequency Krawtchouk decomposition.
arXiv Detail & Related papers (2023-05-18T10:18:59Z) - AntidoteRT: Run-time Detection and Correction of Poison Attacks on
Neural Networks [18.461079157949698]
backdoor poisoning attacks against image classification networks.
We propose lightweight automated detection and correction techniques against poisoning attacks.
Our technique outperforms existing defenses such as NeuralCleanse and STRIP on popular benchmarks.
arXiv Detail & Related papers (2022-01-31T23:42:32Z) - No Need to Know Physics: Resilience of Process-based Model-free Anomaly
Detection for Industrial Control Systems [95.54151664013011]
We present a novel framework to generate adversarial spoofing signals that violate physical properties of the system.
We analyze four anomaly detectors published at top security conferences.
arXiv Detail & Related papers (2020-12-07T11:02:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.