Can Transformer Memory Be Corrupted? Investigating Cache-Side Vulnerabilities in Large Language Models
- URL: http://arxiv.org/abs/2510.17098v1
- Date: Mon, 20 Oct 2025 02:04:18 GMT
- Title: Can Transformer Memory Be Corrupted? Investigating Cache-Side Vulnerabilities in Large Language Models
- Authors: Elias Hossain, Swayamjit Saha, Somshubhra Roy, Ravi Prasad,
- Abstract summary: This paper introduces Malicious Token Injection (MTI), a modular framework that perturbs cached key vectors at selected layers and timesteps through controlled magnitude and frequency. Empirical results show that MTI significantly alters next-token distributions and downstream task performance across GPT-2 and LLaMA-2/7B, and destabilizes retrieval-augmented and agentic reasoning pipelines.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Even when prompts and parameters are secured, transformer language models remain vulnerable because their key-value (KV) cache during inference constitutes an overlooked attack surface. This paper introduces Malicious Token Injection (MTI), a modular framework that systematically perturbs cached key vectors at selected layers and timesteps through controlled magnitude and frequency, using additive Gaussian noise, zeroing, and orthogonal rotations. A theoretical analysis quantifies how these perturbations propagate through attention, linking logit deviations to the Frobenius norm of the corruption and to the Lipschitz dynamics of the softmax. Empirical results show that MTI significantly alters next-token distributions and downstream task performance across GPT-2 and LLaMA-2/7B, and destabilizes retrieval-augmented and agentic reasoning pipelines. These findings identify cache integrity as a critical yet underexplored vulnerability in current LLM deployments, positioning cache corruption as a reproducible and theoretically grounded threat model for future robustness and security research.
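The three perturbation families named in the abstract can be sketched on a toy single-head attention cache. The code below is a minimal illustration, not the paper's implementation: function names, shapes, and the `sigma` parameter are assumptions. It also checks the kind of bound the abstract alludes to, that the attention-logit deviation is controlled by the Frobenius norm of the key corruption (since the spectral norm is at most the Frobenius norm).

```python
import numpy as np

rng = np.random.default_rng(0)

def perturb_keys(K, mode="gaussian", sigma=0.1):
    """Corrupt a cached key matrix K (T x d).

    The three modes mirror the perturbations named in the abstract;
    the exact parameterization here is illustrative.
    """
    if mode == "gaussian":   # additive Gaussian noise
        return K + sigma * rng.standard_normal(K.shape)
    if mode == "zeroing":    # wipe the cached keys
        return np.zeros_like(K)
    if mode == "rotation":   # random orthogonal rotation of key space
        Q, _ = np.linalg.qr(rng.standard_normal((K.shape[1], K.shape[1])))
        return K @ Q
    raise ValueError(f"unknown mode: {mode}")

def attn_logits(q, K):
    """Scaled dot-product attention logits for one query q."""
    return (K @ q) / np.sqrt(K.shape[1])

T, d = 8, 16
K = rng.standard_normal((T, d))
q = rng.standard_normal(d)
clean = attn_logits(q, K)

for mode in ("gaussian", "zeroing", "rotation"):
    Kc = perturb_keys(K, mode)
    frob = np.linalg.norm(Kc - K)  # Frobenius norm of the corruption
    dev = np.linalg.norm(attn_logits(q, Kc) - clean)
    # ||dK q|| <= ||dK||_2 ||q|| <= ||dK||_F ||q||, so:
    assert dev <= np.linalg.norm(q) * frob / np.sqrt(d) + 1e-9
    print(f"{mode:9s} corruption={frob:6.2f}  logit deviation={dev:6.2f}")
```

Passing the perturbed logits through softmax then bounds the shift in attention weights via the softmax's Lipschitz constant, which is the propagation step the theoretical analysis formalizes.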
Related papers
- Structure and Redundancy in Large Language Models: A Spectral Study via Random Matrix Theory [0.0]
This thesis addresses two persistent and closely related challenges in modern deep learning: reliability and efficiency. By analyzing the eigenvalue dynamics of hidden activations across layers and inputs, this work shows that spectral statistics provide a compact, stable, and interpretable lens on model behavior. Within this framework, the first contribution, EigenTrack, introduces a real-time method for detecting hallucinations and out-of-distribution behavior in large language and vision-language models. The second contribution, RMT-KD, presents a principled approach to compressing deep networks via random matrix theoretic knowledge distillation.
arXiv Detail & Related papers (2026-02-25T19:11:56Z) - From Similarity to Vulnerability: Key Collision Attack on LLM Semantic Caching [7.164841206695704]
We present the first systematic study of integrity risks arising from cache collisions. We introduce CacheAttack, an automated framework for launching black-box collision attacks. A case study on a financial agent illustrates the real-world impact of these vulnerabilities.
arXiv Detail & Related papers (2026-01-30T15:37:00Z) - A Statistical Side-Channel Risk Model for Timing Variability in Lattice-Based Post-Quantum Cryptography [0.0]
Timing side-channels are an important threat to cryptography that still needs to be addressed in implementations. Lattice-based schemes may produce secret-dependent timing variability owing to their complex arithmetic and control flow. A scenario-based statistical risk model is proposed that frames timing leakage as a problem of distributional distinguishability under controlled execution conditions.
arXiv Detail & Related papers (2025-12-26T03:12:33Z) - DiffuGuard: How Intrinsic Safety is Lost and Found in Diffusion Large Language Models [50.21378052667732]
We conduct an in-depth analysis of dLLM vulnerabilities to jailbreak attacks across two distinct dimensions: intra-step and inter-step dynamics. We propose DiffuGuard, a training-free defense framework that addresses vulnerabilities through a dual-stage approach.
arXiv Detail & Related papers (2025-09-29T05:17:10Z) - Revisiting Multivariate Time Series Forecasting with Missing Values [74.56971641937771]
Missing values are common in real-world time series. Current approaches have developed an imputation-then-prediction framework that uses imputation modules to fill in missing values, followed by forecasting on the imputed data. This framework overlooks a critical issue: there is no ground truth for the missing values, making the imputation process susceptible to errors that can degrade prediction accuracy. We introduce Consistency-Regularized Information Bottleneck (CRIB), a novel framework built on the Information Bottleneck principle.
arXiv Detail & Related papers (2025-09-27T20:57:48Z) - Advancing Reliable Test-Time Adaptation of Vision-Language Models under Visual Variations [67.35596444651037]
Vision-language models (VLMs) exhibit remarkable zero-shot capabilities but struggle with distribution shifts in downstream tasks when labeled data is unavailable. We propose a Reliable Test-time Adaptation (ReTA) method that enhances reliability from two perspectives.
arXiv Detail & Related papers (2025-07-13T05:37:33Z) - Dynamic Sparse Causal-Attention Temporal Networks for Interpretable Causality Discovery in Multivariate Time Series [0.4369550829556578]
We introduce Dynamic Sparse Causal-Attention Temporal Networks for Interpretable Causality Discovery in MTS (DyCAST-Net). DyCAST-Net is a novel architecture designed to enhance causal discovery by integrating dilated temporal convolutions and dynamic sparse attention mechanisms. We show that DyCAST-Net consistently outperforms existing models such as TCDF, GCFormer, and CausalFormer.
arXiv Detail & Related papers (2025-07-13T01:03:27Z) - Backdoor Cleaning without External Guidance in MLLM Fine-tuning [76.82121084745785]
Believe Your Eyes (BYE) is a data filtering framework that leverages attention entropy patterns as self-supervised signals to identify and filter backdoor samples. It achieves near-zero attack success rates while maintaining clean-task performance.
arXiv Detail & Related papers (2025-05-22T17:11:58Z) - Robustifying Vision-Language Models via Dynamic Token Reweighting [28.675118345987887]
Large vision-language models (VLMs) are highly vulnerable to jailbreak attacks. We present a novel inference-time defense that mitigates multimodal jailbreak attacks. We introduce a new formulation of the safety-relevant distributional shift induced by the visual modality.
arXiv Detail & Related papers (2025-05-22T03:00:39Z) - Model Hemorrhage and the Robustness Limits of Large Language Models [119.46442117681147]
Large language models (LLMs) demonstrate strong performance across natural language processing tasks, yet undergo significant performance degradation when modified for deployment. We define this phenomenon as model hemorrhage: performance decline caused by parameter alterations and architectural changes.
arXiv Detail & Related papers (2025-03-31T10:16:03Z) - Advancing Vulnerability Classification with BERT: A Multi-Objective Learning Model [0.0]
This paper presents a novel vulnerability report classification system that leverages the BERT (Bidirectional Encoder Representations from Transformers) model to perform multi-label classification. The system is deployed via a REST API and a Streamlit UI, enabling real-time vulnerability analysis.
arXiv Detail & Related papers (2025-03-26T06:04:45Z) - Mitigating Cache Noise in Test-Time Adaptation for Large Vision-Language Models [13.157596316463621]
Test-time adaptation (TTA) of visual language models has attracted significant attention as a solution to the performance degradation caused by distribution shifts in downstream tasks. We introduce a comprehensive and reliable caching mechanism and propose a novel zero-shot TTA method called "Cache, Residual, Gaussian" (CRG). Experimental results on 13 benchmarks demonstrate that CRG outperforms state-of-the-art TTA methods, showcasing exceptional robustness and adaptability.
arXiv Detail & Related papers (2025-03-24T04:32:35Z) - CC-Cert: A Probabilistic Approach to Certify General Robustness of Neural Networks [58.29502185344086]
In safety-critical machine learning applications, it is crucial to defend models against adversarial attacks.
It is important to provide provable guarantees for deep learning models against semantically meaningful input transformations.
We propose a new universal probabilistic certification approach based on Chernoff-Cramer bounds.
arXiv Detail & Related papers (2021-09-22T12:46:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences.