GenBFA: An Evolutionary Optimization Approach to Bit-Flip Attacks on LLMs
- URL: http://arxiv.org/abs/2411.13757v2
- Date: Fri, 07 Feb 2025 16:24:17 GMT
- Title: GenBFA: An Evolutionary Optimization Approach to Bit-Flip Attacks on LLMs
- Authors: Sanjay Das, Swastik Bhattacharya, Souvik Kundu, Shamik Kundu, Anand Menon, Arnab Raha, Kanad Basu
- Abstract summary: Large Language Models (LLMs) have revolutionized natural language processing (NLP).
Their increasing adoption in mission-critical applications raises concerns about hardware-based threats, particularly bit-flip attacks (BFAs).
- Score: 3.967858172081495
- Abstract: Large Language Models (LLMs) have revolutionized natural language processing (NLP), excelling in tasks like text generation and summarization. However, their increasing adoption in mission-critical applications raises concerns about hardware-based threats, particularly bit-flip attacks (BFAs). BFAs, enabled by fault injection methods such as Rowhammer, target model parameters in memory, compromising both integrity and performance. Identifying critical parameters for BFAs in the vast parameter space of LLMs poses significant challenges. While prior research suggests transformer-based architectures are inherently more robust to BFAs compared to traditional deep neural networks, we challenge this assumption. For the first time, we demonstrate that as few as three bit-flips can cause catastrophic performance degradation in an LLM with billions of parameters. Current BFA techniques are inadequate for exploiting this vulnerability due to the difficulty of efficiently identifying critical parameters within the immense parameter space. To address this, we propose AttentionBreaker, a novel framework tailored for LLMs that enables efficient traversal of the parameter space to identify critical parameters. Additionally, we introduce GenBFA, an evolutionary optimization strategy designed to refine the search further, isolating the most critical bits for an efficient and effective attack. Empirical results reveal the profound vulnerability of LLMs to AttentionBreaker. For example, merely three bit-flips (4.129 x 10^-9% of total parameters) in the LLaMA3-8B-Instruct 8-bit quantized (W8) model result in a complete performance collapse: accuracy on MMLU tasks drops from 67.3% to 0%, and Wikitext perplexity skyrockets from 12.6 to 4.72 x 10^5. These findings underscore the effectiveness of AttentionBreaker in uncovering and exploiting critical vulnerabilities within LLM architectures.
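To make the vulnerability concrete, the sketch below (not the authors' AttentionBreaker/GenBFA code) shows (i) how flipping the sign bit of a single int8-quantized weight changes its dequantized value, and (ii) a toy evolutionary search over sets of three bit-flips in the spirit of GenBFA. The `damage` fitness is a hypothetical proxy; a real attack would evaluate the victim model's task loss instead.

```python
import numpy as np

def flip_bit(w: np.int8, bit: int) -> np.int8:
    """Flip one bit (0 = LSB .. 7 = sign bit) of an int8 weight."""
    raw = (int(w) & 0xFF) ^ (1 << bit)               # operate on the raw byte
    return np.int8(raw - 256 if raw > 127 else raw)  # fold back to two's complement

# An int8 weight of 42 with scale 0.01 dequantizes to 0.42;
# flipping its sign bit turns it into -0.86.
scale = 0.01
w = np.int8(42)
print(float(w) * scale, float(flip_bit(w, 7)) * scale)  # 0.42 -0.86

# Toy evolutionary search: evolve sets of three (weight index, bit) flips
# toward maximal damage, mirroring a GenBFA-style selection/mutation loop.
rng = np.random.default_rng(0)
weights = rng.integers(-128, 128, size=1024).astype(np.int8)

def damage(candidate) -> float:
    """Hypothetical fitness: total magnitude change the flips cause."""
    corrupted = weights.copy()
    for idx, bit in candidate:
        corrupted[idx] = flip_bit(corrupted[idx], bit)
    return float(np.abs(corrupted.astype(np.int32) - weights.astype(np.int32)).sum())

def random_flip():
    return (int(rng.integers(0, weights.size)), int(rng.integers(0, 8)))

population = [[random_flip() for _ in range(3)] for _ in range(32)]
for _ in range(20):
    population.sort(key=damage, reverse=True)  # fittest = most damaging
    survivors = population[:8]                 # elitist selection
    population = survivors + [                 # refill by mutating survivors
        [flip if rng.random() < 0.7 else random_flip() for flip in parent]
        for parent in survivors for _ in range(3)
    ]
population.sort(key=damage, reverse=True)
print("best 3-flip set:", population[0], "damage:", damage(population[0]))
```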
Related papers
- Adaptive Pruning for Large Language Models with Structural Importance Awareness [66.2690963378878]
Large language models (LLMs) have significantly improved language understanding and generation capabilities.
LLMs are difficult to deploy on resource-constrained edge devices due to their high computational and storage resource demands.
We propose structurally-aware adaptive pruning (SAAP) to significantly reduce the computational and memory costs while maintaining model performance.
arXiv Detail & Related papers (2024-12-19T18:08:04Z)
- Mitigating Adversarial Attacks in LLMs through Defensive Suffix Generation [2.3080718283523827]
Large language models (LLMs) have exhibited outstanding performance in natural language processing tasks.
However, they are vulnerable to adversarial attacks, in which slight input perturbations can lead to harmful or misleading outputs.
A gradient-based defensive suffix generation algorithm is designed to bolster the robustness of LLMs.
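The gradient-guided idea can be sketched with a toy stand-in for the LLM: a linear "harmfulness" scorer supplies gradients with respect to one-hot suffix tokens, and each suffix slot greedily swaps in the token whose gradient promises the largest loss reduction. Everything here (the scorer, the loop, the names) is illustrative, not the paper's algorithm.

```python
import torch

torch.manual_seed(0)
vocab, dim, suffix_len = 100, 16, 4
embed = torch.nn.Embedding(vocab, dim)
scorer = torch.nn.Linear(dim, 1)           # toy stand-in for the victim LLM's loss
prompt_ids = torch.tensor([5, 17, 42])     # hypothetical user prompt tokens
suffix_ids = torch.randint(0, vocab, (suffix_len,))

for step in range(10):
    one_hot = torch.nn.functional.one_hot(suffix_ids, vocab).float().requires_grad_()
    suffix_emb = one_hot @ embed.weight                # differentiable token lookup
    seq = torch.cat([embed(prompt_ids), suffix_emb])   # prompt + defensive suffix
    adv_loss = scorer(seq.mean(0)).squeeze()           # proxy for "harmfulness"
    adv_loss.backward()
    # Greedy coordinate step: per slot, pick the token with the most
    # negative gradient, i.e. the swap expected to lower the loss most.
    suffix_ids = one_hot.grad.argmin(dim=-1)

print("optimized defensive suffix token ids:", suffix_ids.tolist())
```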
arXiv Detail & Related papers (2024-12-18T10:49:41Z)
- MOFHEI: Model Optimizing Framework for Fast and Efficient Homomorphically Encrypted Neural Network Inference [0.8388591755871735]
Homomorphic Encryption (HE) enables us to perform machine learning tasks over encrypted data.
We propose MOFHEI, a framework that optimizes models to make HE-based neural network inference fast and efficient.
Our framework achieves up to a 98% pruning ratio on LeNet, eliminating up to 93% of the HE operations required for private inference (PI).
arXiv Detail & Related papers (2024-12-10T22:44:54Z)
- LLM-BIP: Structured Pruning for Large Language Models with Block-Wise Forward Importance Propagation [0.0]
We propose a more accurate pruning metric based on block-wise importance score propagation.
We evaluate the proposed method using LLaMA-7B, Vicuna-7B, and LLaMA-13B across common zero-shot tasks.
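A minimal sketch of the general idea, assuming a first-order Taylor criterion (|activation x gradient|) over the channels of a toy FFN block; LLM-BIP's exact block-wise propagation rule may differ:

```python
import torch

torch.manual_seed(0)
dim, hidden, batch = 32, 64, 8
up = torch.nn.Linear(dim, hidden)      # block's up projection
down = torch.nn.Linear(hidden, dim)    # block's down projection

x = torch.randn(batch, dim)            # calibration batch
h = torch.nn.functional.gelu(up(x))    # the block's hidden activations
h.retain_grad()                        # keep gradients on this intermediate
loss = down(h).pow(2).mean()           # stand-in for the model's task loss
loss.backward()

# Taylor importance per hidden channel: |activation * gradient| estimates
# how much the loss changes if the channel is zeroed (structurally pruned).
importance = (h * h.grad).abs().sum(dim=0)
keep = importance.argsort(descending=True)[: hidden // 2]  # 50% pruning
print("kept channels:", sorted(keep.tolist()))
```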
arXiv Detail & Related papers (2024-12-09T11:57:16Z)
- Iterative Self-Tuning LLMs for Enhanced Jailbreaking Capabilities [63.603861880022954]
We introduce ADV-LLM, an iterative self-tuning process that crafts adversarial LLMs with enhanced jailbreak ability.
Our framework significantly reduces the computational cost of generating adversarial suffixes while achieving a nearly 100% attack success rate (ASR) on various open-source LLMs.
It exhibits strong attack transferability to closed-source models, achieving 99% ASR on GPT-3.5 and 49% ASR on GPT-4, despite being optimized solely on Llama3.
arXiv Detail & Related papers (2024-10-24T06:36:12Z)
- Semantic-guided Search for Efficient Program Repair with Large Language Models [0.9319432628663639]
FLAMES employs semantic-guided patch generation to enhance repair effectiveness and memory efficiency.
FLAMES substantially reduces memory consumption by up to 83% compared to conventional LLM-based APR.
Remarkably, FLAMES successfully generated 133 and 103 correct fixes for 333 and 163 bugs in the Defects4J and HumanEval-Java datasets, respectively.
arXiv Detail & Related papers (2024-10-22T02:59:47Z)
- Search for Efficient Large Language Models [52.98684997131108]
Large Language Models (LLMs) have long held sway in the realms of artificial intelligence research.
Weight pruning, quantization, and distillation have been embraced to compress LLMs, targeting memory reduction and inference acceleration.
Most model compression techniques concentrate on weight optimization, overlooking the exploration of optimal architectures.
arXiv Detail & Related papers (2024-09-25T21:32:12Z)
- AutoDetect: Towards a Unified Framework for Automated Weakness Detection in Large Language Models [95.09157454599605]
Large Language Models (LLMs) are becoming increasingly powerful, but they still exhibit significant but subtle weaknesses.
Traditional benchmarking approaches cannot thoroughly pinpoint specific model deficiencies.
We introduce a unified framework, AutoDetect, to automatically expose weaknesses in LLMs across various tasks.
arXiv Detail & Related papers (2024-06-24T15:16:45Z)
- Bypass Back-propagation: Optimization-based Structural Pruning for Large Language Models via Policy Gradient [57.9629676017527]
We propose optimization-based structural pruning for Large Language Models.
We learn the pruning masks in a probabilistic space directly by optimizing the loss of the pruned model.
Our method operates for 2.7 hours with around 35GB memory for the 13B models on a single A100 GPU.
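A minimal sketch of back-propagation-free mask learning with REINFORCE: per-group Bernoulli "keep" probabilities are the policy, sampled masks are scored by a black-box loss, and the policy gradient updates the probabilities. The proxy loss and constants are assumptions, not the paper's setup.

```python
import torch

torch.manual_seed(0)
groups = 16
logits = torch.zeros(groups, requires_grad=True)  # mask policy parameters
opt = torch.optim.Adam([logits], lr=0.1)
true_importance = torch.rand(groups)              # hidden per-group usefulness

def pruned_model_loss(mask: torch.Tensor) -> float:
    # Black-box proxy for evaluating the pruned model: dropping important
    # groups hurts accuracy; keeping every group costs memory.
    return ((true_importance * (1 - mask)).sum() + 0.3 * mask.sum()).item()

baseline = 0.0
for step in range(200):
    dist = torch.distributions.Bernoulli(logits=logits)
    mask = dist.sample()                    # sample a binary pruning mask
    loss = pruned_model_loss(mask)          # no gradient flows through the model
    baseline = 0.9 * baseline + 0.1 * loss  # moving-average baseline
    surrogate = (loss - baseline) * dist.log_prob(mask).sum()  # REINFORCE
    opt.zero_grad()
    surrogate.backward()
    opt.step()

print("keep probabilities:", [round(p, 2) for p in torch.sigmoid(logits).tolist()])
```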
arXiv Detail & Related papers (2024-06-15T09:31:03Z)
- FFN-SkipLLM: A Hidden Gem for Autoregressive Decoding with Adaptive Feed Forward Skipping [49.66872823080736]
Autoregressive Large Language Models (e.g., LLaMa, GPTs) are omnipresent, achieving remarkable success in language understanding and generation.
To mitigate the overhead incurred during generation, several early-exit and layer-dropping strategies have been proposed.
We propose FFN-SkipLLM, an input-adaptive feed-forward skipping strategy.
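One plausible rendering of input-adaptive skipping, sketched below: execute a block's FFN only when a cheap saturation test says the hidden state is still changing. The cosine-similarity test and the 0.99 threshold are assumptions for illustration, not FFN-SkipLLM's published criterion.

```python
import torch

def maybe_skip_ffn(h_in: torch.Tensor, ffn: torch.nn.Module,
                   prev: torch.Tensor, threshold: float = 0.99) -> torch.Tensor:
    """Apply the residual FFN unless the state has saturated relative to prev."""
    sim = torch.nn.functional.cosine_similarity(h_in, prev, dim=-1).mean()
    if sim > threshold:       # representation barely changing: skip the FFN
        return h_in
    return h_in + ffn(h_in)   # standard residual FFN path

dim = 64
ffn = torch.nn.Sequential(torch.nn.Linear(dim, 4 * dim), torch.nn.GELU(),
                          torch.nn.Linear(4 * dim, dim))
h_prev = torch.randn(2, 8, dim)                    # state after the previous layer
h_curr = h_prev + 1e-4 * torch.randn_like(h_prev)  # nearly saturated state
out = maybe_skip_ffn(h_curr, ffn, h_prev)
print("FFN skipped:", torch.equal(out, h_curr))    # True: the block was bypassed
```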
arXiv Detail & Related papers (2024-04-05T02:35:43Z)
- Evaluation of Parameter-based Attacks against Embedded Neural Networks with Laser Injection [1.2499537119440245]
This work reports, for the first time, a practical and successful variant of the Bit-Flip Attack (BFA) on a 32-bit Cortex-M microcontroller using laser fault injection.
To avoid unrealistic brute-force strategies, we show how simulations help select the most sensitive set of bits among the parameters, taking the laser fault model into account.
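A minimal sketch of simulation-guided bit selection under a laser-style bit-set fault model (faults can only flip 0 -> 1); the toy dot-product layer stands in for the embedded network:

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.integers(-128, 128, size=16).astype(np.int8)  # toy int8 layer
x = rng.standard_normal(16).astype(np.float32)              # fixed test input
clean = float(weights.astype(np.float32) @ x)               # fault-free output

def bit_set(w: np.int8, bit: int) -> np.int8:
    """Laser fault model: a hit can set a bit, never clear it."""
    raw = (int(w) & 0xFF) | (1 << bit)
    return np.int8(raw - 256 if raw > 127 else raw)

# Exhaustively simulate every single-bit fault and rank by induced error.
scores = []
for i in range(weights.size):
    for bit in range(8):
        faulty = weights.copy()
        faulty[i] = bit_set(faulty[i], bit)
        err = abs(float(faulty.astype(np.float32) @ x) - clean)
        scores.append((err, i, bit))

# The top entries are the bits a laser campaign would target first.
for err, i, bit in sorted(scores, reverse=True)[:3]:
    print(f"weight {i}, bit {bit}: |delta output| = {err:.2f}")
```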
arXiv Detail & Related papers (2023-04-25T14:48:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.