MultiPhishGuard: An LLM-based Multi-Agent System for Phishing Email Detection
- URL: http://arxiv.org/abs/2505.23803v1
- Date: Mon, 26 May 2025 23:27:15 GMT
- Title: MultiPhishGuard: An LLM-based Multi-Agent System for Phishing Email Detection
- Authors: Yinuo Xue, Eric Spero, Yun Sing Koh, Giovanni Russello,
- Abstract summary: MultiPhishGuard is a dynamic multi-agent detection system that synergizes specialized expertise with adversarial-aware reinforcement learning.<n>Our framework employs five cooperative agents with automatically adjusted decision weights powered by a Proximal Policy Optimization reinforcement learning algorithm.<n>Experiments demonstrate that MultiPhishGuard achieves high accuracy (97.89%) with low false positive (2.73%) and false negative rates (0.20%)
- Score: 3.187381965457262
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Phishing email detection faces critical challenges from evolving adversarial tactics and heterogeneous attack patterns. Traditional detection methods, such as rule-based filters and denylists, often struggle to keep pace with these evolving tactics, leading to false negatives and compromised security. While machine learning approaches have improved detection accuracy, they still face challenges adapting to novel phishing strategies. We present MultiPhishGuard, a dynamic LLM-based multi-agent detection system that synergizes specialized expertise with adversarial-aware reinforcement learning. Our framework employs five cooperative agents (text, URL, metadata, explanation simplifier, and adversarial agents) with automatically adjusted decision weights powered by a Proximal Policy Optimization reinforcement learning algorithm. To address emerging threats, we introduce an adversarial training loop featuring an adversarial agent that generates subtle context-aware email variants, creating a self-improving defense ecosystem and enhancing system robustness. Experimental evaluations on public datasets demonstrate that MultiPhishGuard significantly outperforms Chain-of-Thoughts, single-agent baselines and state-of-the-art detectors, as validated by ablation studies and comparative analyses. Experiments demonstrate that MultiPhishGuard achieves high accuracy (97.89\%) with low false positive (2.73\%) and false negative rates (0.20\%). Additionally, we incorporate an explanation simplifier agent, which provides users with clear and easily understandable explanations for why an email is classified as phishing or legitimate. This work advances phishing defense through dynamic multi-agent collaboration and generative adversarial resilience.
Related papers
- Debate-Driven Multi-Agent LLMs for Phishing Email Detection [0.0]
We propose a multi-agent large language model (LLM) prompting technique that simulates deceptive debates among agents to detect phishing emails.<n>Our approach uses two LLM agents to present arguments for or against the classification task, with a judge agent adjudicating the final verdict.<n>Results show that the debate structure itself is sufficient to yield accurate decisions without extra prompting strategies.
arXiv Detail & Related papers (2025-03-27T23:18:14Z) - Efficient Phishing URL Detection Using Graph-based Machine Learning and Loopy Belief Propagation [12.89058029173131]
We propose a graph-based machine learning model for phishing URL detection.<n>We integrate URL structure and network-level features such as IP addresses and authoritative name servers.<n>Experiments on real-world datasets demonstrate our model's effectiveness by achieving F1 score of up to 98.77%.
arXiv Detail & Related papers (2025-01-12T19:49:00Z) - PhishAgent: A Robust Multimodal Agent for Phishing Webpage Detection [26.106113544525545]
Phishing attacks are a major threat to online security, exploiting user vulnerabilities to steal sensitive information.<n>Various methods have been developed to counteract phishing, each with varying levels of accuracy, but they also face notable limitations.<n>In this study, we introduce PhishAgent, a multimodal agent that combines a wide range of tools, integrating both online and offline knowledge bases with Multimodal Large Language Models (MLLMs)<n>This combination leads to broader brand coverage, which enhances brand recognition and recall.
arXiv Detail & Related papers (2024-08-20T11:14:21Z) - Efficient Adversarial Training in LLMs with Continuous Attacks [99.5882845458567]
Large language models (LLMs) are vulnerable to adversarial attacks that can bypass their safety guardrails.
We propose a fast adversarial training algorithm (C-AdvUL) composed of two losses.
C-AdvIPO is an adversarial variant of IPO that does not require utility data for adversarially robust alignment.
arXiv Detail & Related papers (2024-05-24T14:20:09Z) - Deep Learning-Based Speech and Vision Synthesis to Improve Phishing
Attack Detection through a Multi-layer Adaptive Framework [1.3353802999735709]
Current anti-phishing methods remain vulnerable to complex phishing because of the increasingly sophistication tactics adopted by attacker.
In this research, we proposed a framework that combines Deep learning and Randon Forest to read images, synthesize speech from deep-fake videos, and natural language processing.
arXiv Detail & Related papers (2024-02-27T06:47:52Z) - AntiPhishStack: LSTM-based Stacked Generalization Model for Optimized
Phishing URL Detection [0.32141666878560626]
This paper introduces a two-phase stack generalized model named AntiPhishStack, designed to detect phishing sites.
The model leverages the learning of URLs and character-level TF-IDF features symmetrically, enhancing its ability to combat emerging phishing threats.
Experimental validation on two benchmark datasets, comprising benign and phishing or malicious URLs, demonstrates the model's exceptional performance, achieving a notable 96.04% accuracy compared to existing studies.
arXiv Detail & Related papers (2024-01-17T03:44:27Z) - BadCLIP: Dual-Embedding Guided Backdoor Attack on Multimodal Contrastive
Learning [85.2564206440109]
This paper reveals the threats in this practical scenario that backdoor attacks can remain effective even after defenses.
We introduce the emphtoolns attack, which is resistant to backdoor detection and model fine-tuning defenses.
arXiv Detail & Related papers (2023-11-20T02:21:49Z) - Malicious Agent Detection for Robust Multi-Agent Collaborative Perception [52.261231738242266]
Multi-agent collaborative (MAC) perception is more vulnerable to adversarial attacks than single-agent perception.
We propose Malicious Agent Detection (MADE), a reactive defense specific to MAC perception.
We conduct comprehensive evaluations on a benchmark 3D dataset V2X-sim and a real-road dataset DAIR-V2X.
arXiv Detail & Related papers (2023-10-18T11:36:42Z) - Avoid Adversarial Adaption in Federated Learning by Multi-Metric
Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources.
FL faces vulnerabilities such as poisoning attacks, undermining model integrity with both untargeted performance degradation and targeted backdoor attacks.
We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously.
MESAS is the first defense robust against strong adaptive adversaries, effective in real-world data scenarios, with an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z) - Illusory Attacks: Information-Theoretic Detectability Matters in Adversarial Attacks [76.35478518372692]
We introduce epsilon-illusory, a novel form of adversarial attack on sequential decision-makers.
Compared to existing attacks, we empirically find epsilon-illusory to be significantly harder to detect with automated methods.
Our findings suggest the need for better anomaly detectors, as well as effective hardware- and system-level defenses.
arXiv Detail & Related papers (2022-07-20T19:49:09Z) - Model-Agnostic Meta-Attack: Towards Reliable Evaluation of Adversarial
Robustness [53.094682754683255]
We propose a Model-Agnostic Meta-Attack (MAMA) approach to discover stronger attack algorithms automatically.
Our method learns the in adversarial attacks parameterized by a recurrent neural network.
We develop a model-agnostic training algorithm to improve the ability of the learned when attacking unseen defenses.
arXiv Detail & Related papers (2021-10-13T13:54:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.