Related papers: Ghosting Your LLM: Without The Knowledge of Your Gradient and Data

Ghosting Your LLM: Without The Knowledge of Your Gradient and Data

URL: http://arxiv.org/abs/2511.22700v1
Date: Thu, 27 Nov 2025 18:52:51 GMT
Title: Ghosting Your LLM: Without The Knowledge of Your Gradient and Data
Authors: Abeer Matar A. Almalky, Ziyan Wang, Mohaiminul Al Nahian, Li Yang, Adnan Siraj Rakin,
Abstract summary: We focus on the impact of Bit Flip Attacks on large language models (LLMs)<n>BFAs exploits hardware faults to corrupt model parameters, posing a significant threat to model integrity and performance.<n>We propose novel vulnerability index metrics that can identify vulnerable weight bits in LLMs independent of any gradient or data knowledge.
Score: 23.256253959062544
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In recent years, large language models (LLMs) have achieved substantial advancements and are increasingly integrated into critical applications across various domains. This growing adoption underscores the need to ensure their security and robustness. In this work, we focus on the impact of Bit Flip Attacks (BFAs) on LLMs, which exploits hardware faults to corrupt model parameters, posing a significant threat to model integrity and performance. Existing studies on BFA against LLMs adopt a progressive bit-search strategy that predominantly relies on gradient-based techniques to identify sensitive layers or weights. However, computing gradients comes with two specific challenges: First, in the context of LLMs, it increases computational and memory costs exponentially, and Second, it requires access to a sample victim dataset or knowledge of the victim domain to compute the gradient. In this work, we investigate beyond the scope of attack efficacy and aim to develop an efficient, practical Gradient-Data-free Bit-Flip Attack. The challenge lies in the core principle of adversarial attacks, which relies heavily on computing gradients from sample test/train data and manipulating model weights based on gradient information. To overcome this, we propose novel vulnerability index metrics that can identify vulnerable weight bits in LLMs independent of any gradient or data knowledge. By removing the dependency on gradient computation, our approach drastically reduces memory requirements and scales efficiently across multiple tasks with constant complexity. Experimental results demonstrate the efficiency of our method, requiring as few as a single bit flip to achieve adversarial objectives for five open-source LLMs.

Related papers

Learn More, Forget Less: A Gradient-Aware Data Selection Approach for LLM [51.21051698747157]
We propose a self-adaptive gradient-aware data selection approach (GrADS) for supervised fine-tuning of large language models (LLMs)<n>Specifically, we design self-guided criteria that leverage the magnitude and statistical distribution of gradients to prioritize examples that contribute the most to the model's learning process.<n>Through extensive experimentation with various LLMs across diverse domains such as medicine, law, and finance, GrADS has demonstrated significant efficiency and cost-effectiveness.
arXiv Detail & Related papers (2025-11-07T08:34:50Z)
Stable Forgetting: Bounded Parameter-Efficient Unlearning in LLMs [30.089412595436585]
We provide a theoretical framework that explains how ascent on the forget set destabilizes optimization in the feedforward layers of large language models (LLMs)<n>We propose Bounded Bounded Unlearning, a parameter-efficient approach that stabilizes fine-tuning by applying bounded functions to adapters.<n>Our method achieves substantial improvements in forgetting while preserving retention, establishing a novel theoretically grounded and practically scalable framework for unlearning in LLMs.
arXiv Detail & Related papers (2025-09-29T01:30:15Z)
TreeLoRA: Efficient Continual Learning via Layer-Wise LoRAs Guided by a Hierarchical Gradient-Similarity Tree [52.44403214958304]
In this paper, we introduce TreeLoRA, a novel approach that constructs layer-wise adapters by leveraging hierarchical gradient similarity.<n>To reduce the computational burden of task similarity estimation, we employ bandit techniques to develop an algorithm based on lower confidence bounds.<n> experiments on both vision transformers (ViTs) and large language models (LLMs) demonstrate the effectiveness and efficiency of our approach.
arXiv Detail & Related papers (2025-06-12T05:25:35Z)
SHA256 at SemEval-2025 Task 4: Selective Amnesia -- Constrained Unlearning for Large Language Models via Knowledge Isolation [12.838593066237452]
Large language models (LLMs) memorize frequently sensitive information during training, posing risks when deploying publicly accessible models.<n>This paper presents our solution to SemEval-2025 Task 4 on targeted unlearning, which combines causal mediation analysis with layer-specific optimization.
arXiv Detail & Related papers (2025-04-17T15:05:40Z)
Towards LLM Unlearning Resilient to Relearning Attacks: A Sharpness-Aware Minimization Perspective and Beyond [41.3029262040131]
We investigate how to make unlearned models robust against relearning attacks.<n>Our analysis reveals that smoothness optimization plays a pivotal role in mitigating relearning attacks.
arXiv Detail & Related papers (2025-02-07T23:03:55Z)
Mitigating Forgetting in LLM Fine-Tuning via Low-Perplexity Token Learning [65.23593936798662]
We show that fine-tuning with LLM-generated data improves target task performance and reduces non-target task degradation.<n>This is the first work to provide an empirical explanation based on token perplexity reduction to mitigate catastrophic forgetting in LLMs after fine-tuning.
arXiv Detail & Related papers (2025-01-24T08:18:56Z)
Curriculum-style Data Augmentation for LLM-based Metaphor Detection [7.4594050203808395]
We propose a method for metaphor detection by fine-tuning open-source LLMs.<n>Our method achieves state-of-the-art performance across all baselines.
arXiv Detail & Related papers (2024-12-04T02:05:21Z)
Language Models as Zero-shot Lossless Gradient Compressors: Towards General Neural Parameter Prior Models [56.00251589760559]
Large language models (LLMs) can act as gradient priors in a zero-shot setting.<n>We introduce LM-GC, a novel method that integrates LLMs with arithmetic coding.<n>Experiments indicate that LM-GC surpasses existing state-of-the-art lossless compression methods.
arXiv Detail & Related papers (2024-09-26T13:38:33Z)
FFN-SkipLLM: A Hidden Gem for Autoregressive Decoding with Adaptive Feed Forward Skipping [49.66872823080736]
Autoregressive Large Language Models (e.g., LLaMa, GPTs) are omnipresent achieving remarkable success in language understanding and generation. To mitigate overload incurred during generation, several early-exit and layer-dropping strategies have been proposed. We propose FFN-SkipLLM, which is an input-adaptive feed-forward skipping strategy.
arXiv Detail & Related papers (2024-04-05T02:35:43Z)
To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis [50.31589712761807]
Large language models (LLMs) are notoriously token-hungry during pre-training, and high-quality text data on the web is approaching its scaling limit for LLMs. We investigate the consequences of repeating pre-training data, revealing that the model is susceptible to overfitting. Second, we examine the key factors contributing to multi-epoch degradation, finding that significant factors include dataset size, model parameters, and training objectives.
arXiv Detail & Related papers (2023-05-22T17:02:15Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.