A Survey of Attacks on Large Language Models
- URL: http://arxiv.org/abs/2505.12567v1
- Date: Sun, 18 May 2025 22:55:16 GMT
- Title: A Survey of Attacks on Large Language Models
- Authors: Wenrui Xu, Keshab K. Parhi,
- Abstract summary: Large language models (LLMs) and LLM-based agents have been widely deployed in a wide range of applications in the real world.<n>This paper provides a systematic overview of the details of adversarial attacks targeting both LLMs and LLM-based agents.
- Score: 5.845689496906739
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) and LLM-based agents have been widely deployed in a wide range of applications in the real world, including healthcare diagnostics, financial analysis, customer support, robotics, and autonomous driving, expanding their powerful capability of understanding, reasoning, and generating natural languages. However, the wide deployment of LLM-based applications exposes critical security and reliability risks, such as the potential for malicious misuse, privacy leakage, and service disruption that weaken user trust and undermine societal safety. This paper provides a systematic overview of the details of adversarial attacks targeting both LLMs and LLM-based agents. These attacks are organized into three phases in LLMs: Training-Phase Attacks, Inference-Phase Attacks, and Availability & Integrity Attacks. For each phase, we analyze the details of representative and recently introduced attack methods along with their corresponding defenses. We hope our survey will provide a good tutorial and a comprehensive understanding of LLM security, especially for attacks on LLMs. We desire to raise attention to the risks inherent in widely deployed LLM-based applications and highlight the urgent need for robust mitigation strategies for evolving threats.
Related papers
- Large Language Models in Cybersecurity: Applications, Vulnerabilities, and Defense Techniques [11.217261201018815]
Large Language Models (LLMs) are transforming cybersecurity by enabling intelligent, adaptive, and automated approaches to threat detection, vulnerability assessment, and incident response.<n>With their advanced language understanding and contextual reasoning, LLMs surpass traditional methods in tackling challenges across domains such as IoT, blockchain, and hardware security.
arXiv Detail & Related papers (2025-07-18T03:41:18Z) - Security Concerns for Large Language Models: A Survey [3.175227858236288]
Large Language Models (LLMs) have caused a revolution in natural language processing, but their capabilities also introduce new security vulnerabilities.<n>We provide a comprehensive overview of the emerging security concerns around LLMs, categorizing threats into prompt injection and jailbreaking, adversarial attacks such as input perturbations and data poisoning, and worrisome risks inherent in autonomous LLM agents.<n>We conclude by emphasizing the importance of advancing robust, multi-layered security strategies to ensure LLMs are safe and beneficial.
arXiv Detail & Related papers (2025-05-24T22:22:43Z) - Commercial LLM Agents Are Already Vulnerable to Simple Yet Dangerous Attacks [88.84977282952602]
A high volume of recent ML security literature focuses on attacks against aligned large language models (LLMs)<n>In this paper, we analyze security and privacy vulnerabilities that are unique to LLM agents.<n>We conduct a series of illustrative attacks on popular open-source and commercial agents, demonstrating the immediate practical implications of their vulnerabilities.
arXiv Detail & Related papers (2025-02-12T17:19:36Z) - Emerging Security Challenges of Large Language Models [6.151633954305939]
Large language models (LLMs) have achieved record adoption in a short period of time across many different sectors.<n>They are open-ended models trained on diverse data without being tailored for specific downstream tasks.<n>Traditional Machine Learning (ML) models are vulnerable to adversarial attacks.
arXiv Detail & Related papers (2024-12-23T14:36:37Z) - Targeting the Core: A Simple and Effective Method to Attack RAG-based Agents via Direct LLM Manipulation [4.241100280846233]
AI agents, powered by large language models (LLMs), have transformed human-computer interactions by enabling seamless, natural, and context-aware communication.<n>This paper investigates a critical vulnerability: adversarial attacks targeting the LLM core within AI agents.
arXiv Detail & Related papers (2024-12-05T18:38:30Z) - Global Challenge for Safe and Secure LLMs Track 1 [57.08717321907755]
The Global Challenge for Safe and Secure Large Language Models (LLMs) is a pioneering initiative organized by AI Singapore (AISG) and the CyberSG R&D Programme Office (CRPO)
This paper introduces the Global Challenge for Safe and Secure Large Language Models (LLMs), a pioneering initiative organized by AI Singapore (AISG) and the CyberSG R&D Programme Office (CRPO) to foster the development of advanced defense mechanisms against automated jailbreaking attacks.
arXiv Detail & Related papers (2024-11-21T08:20:31Z) - A Survey of Attacks on Large Vision-Language Models: Resources, Advances, and Future Trends [78.3201480023907]
Large Vision-Language Models (LVLMs) have demonstrated remarkable capabilities across a wide range of multimodal understanding and reasoning tasks.
The vulnerability of LVLMs is relatively underexplored, posing potential security risks in daily usage.
In this paper, we provide a comprehensive review of the various forms of existing LVLM attacks.
arXiv Detail & Related papers (2024-07-10T06:57:58Z) - Purple-teaming LLMs with Adversarial Defender Training [57.535241000787416]
We present Purple-teaming LLMs with Adversarial Defender training (PAD)
PAD is a pipeline designed to safeguard LLMs by novelly incorporating the red-teaming (attack) and blue-teaming (safety training) techniques.
PAD significantly outperforms existing baselines in both finding effective attacks and establishing a robust safe guardrail.
arXiv Detail & Related papers (2024-07-01T23:25:30Z) - Generative AI in Cybersecurity: A Comprehensive Review of LLM Applications and Vulnerabilities [1.0974825157329373]
This paper provides a comprehensive review of the future of cybersecurity through Generative AI and Large Language Models (LLMs)<n>We explore LLM applications across various domains, including hardware design security, intrusion detection, software engineering, design verification, cyber threat intelligence, malware detection, and phishing detection.<n>We present an overview of LLM evolution and its current state, focusing on advancements in models such as GPT-4, GPT-3.5, Mixtral-8x7B, BERT, Falcon2, and LLaMA.
arXiv Detail & Related papers (2024-05-21T13:02:27Z) - A Security Risk Taxonomy for Prompt-Based Interaction With Large Language Models [5.077431021127288]
This paper addresses a gap in current research by focusing on security risks posed by large language models (LLMs)
Our work proposes a taxonomy of security risks along the user-model communication pipeline and categorizes the attacks by target and attack type alongside the commonly used confidentiality, integrity, and availability (CIA) triad.
arXiv Detail & Related papers (2023-11-19T20:22:05Z) - Privacy in Large Language Models: Attacks, Defenses and Future Directions [84.73301039987128]
We analyze the current privacy attacks targeting large language models (LLMs) and categorize them according to the adversary's assumed capabilities.
We present a detailed overview of prominent defense strategies that have been developed to counter these privacy attacks.
arXiv Detail & Related papers (2023-10-16T13:23:54Z) - Safety Assessment of Chinese Large Language Models [51.83369778259149]
Large language models (LLMs) may generate insulting and discriminatory content, reflect incorrect social values, and may be used for malicious purposes.
To promote the deployment of safe, responsible, and ethical AI, we release SafetyPrompts including 100k augmented prompts and responses by LLMs.
arXiv Detail & Related papers (2023-04-20T16:27:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.