MemoPhishAgent: Memory-Augmented Multi-Modal LLM Agent for Phishing URL Detection
- URL: http://arxiv.org/abs/2602.21394v2
- Date: Thu, 26 Feb 2026 02:32:50 GMT
- Title: MemoPhishAgent: Memory-Augmented Multi-Modal LLM Agent for Phishing URL Detection
- Authors: Xuan Chen, Hao Liu, Tao Yuan, Mehran Kafai, Piotr Habas, Xiangyu Zhang,
- Abstract summary: MemoPhishAgent (MPA) is a memory-augmented multi-modal LLM agent that orchestrates phishing-specific tools.<n>MPA outperforms three state-of-the-art (SOTA) baselines, improving recall by 13.6%.<n>MPA is deployed in production, processing 60K targeted high-risk URLs weekly, and achieving 91.44% recall.
- Score: 15.810091292280584
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Traditional phishing website detection relies on static heuristics or reference lists, which lag behind rapidly evolving attacks. While recent systems incorporate large language models (LLMs), they are still prompt-based, deterministic pipelines that underutilize reasoning capability. We present MemoPhishAgent (MPA), a memory-augmented multi-modal LLM agent that dynamically orchestrates phishing-specific tools and leverages episodic memories of past reasoning trajectories to guide decisions on recurring and novel threats. On two public datasets, MPA outperforms three state-of-the-art (SOTA) baselines, improving recall by 13.6%. To better reflect realistic, user-facing phishing detection performance, we further evaluate MPA on a benchmark of real-world suspicious URLs actively crawled from five social media platforms, where it improves recall by 20%. Detailed analysis shows episodic memory contributes up to 27% recall gain without introducing additional computational overhead. The ablation study confirms the necessity of the agent-based approach compared to prompt-based baselines and validates the effectiveness of our tool design. Finally, MPA is deployed in production, processing 60K targeted high-risk URLs weekly, and achieving 91.44% recall, providing proactive protection for millions of customers. Together, our results show that combining multi-modal reasoning with episodic memory yields robust phishing detection in realistic user-exposure settings.
Related papers
- Clouding the Mirror: Stealthy Prompt Injection Attacks Targeting LLM-based Phishing Detection [0.7391823486666542]
Large language models (LLMs) are vulnerable to prompt injection (PI)<n>This paper presents the first comprehensive evaluation of PI against multimodal LLM-based phishing detection.<n>We propose InjectDefuser, a defense framework that combines prompt hardening, allowlist-based retrieval augmentation, and output validation.
arXiv Detail & Related papers (2026-02-05T09:44:20Z) - ReasoningBomb: A Stealthy Denial-of-Service Attack by Inducing Pathologically Long Reasoning in Large Reasoning Models [67.15960154375131]
Large reasoning models (LRMs) extend large language models with explicit multi-step reasoning traces.<n>This capability introduces a new class of prompt-induced inference-time denial-of-service (PI-DoS) attacks that exploit the high computational cost of reasoning.<n>We present ReasoningBomb, a reinforcement-learning-based PI-DoS framework that is guided by a constant-time surrogate reward.
arXiv Detail & Related papers (2026-01-29T18:53:01Z) - The Trojan Knowledge: Bypassing Commercial LLM Guardrails via Harmless Prompt Weaving and Adaptive Tree Search [58.8834056209347]
Large language models (LLMs) remain vulnerable to jailbreak attacks that bypass safety guardrails to elicit harmful outputs.<n>We introduce the Correlated Knowledge Attack Agent (CKA-Agent), a dynamic framework that reframes jailbreaking as an adaptive, tree-structured exploration of the target model's knowledge base.
arXiv Detail & Related papers (2025-12-01T07:05:23Z) - CLASP: Cost-Optimized LLM-based Agentic System for Phishing Detection [0.8737375836744933]
We present CLASP, a novel system that effectively identifies phishing websites by leveraging multiple intelligent agents.<n>The system processes URLs or QR codes, employing specialized LLM-based agents that evaluate the URL structure, webpage screenshot, and HTML content.<n>CLASP surpasses leading previous solutions, achieving over 40% higher recall and a 20% improvement in F1 score for phishing detection on the collected dataset.
arXiv Detail & Related papers (2025-10-21T12:38:52Z) - PhishIntentionLLM: Uncovering Phishing Website Intentions through Multi-Agent Retrieval-Augmented Generation [13.177607247367211]
We propose PhishIntentionLLM, a framework that uncovers phishing intentions from website screenshots.<n>Our framework identifies four key phishing objectives: Credential Theft, Financial Fraud, Malware Distribution, and Personal Information Harvesting.<n>We generate a larger dataset of 9K samples for large-scale phishing intention profiling across sectors.
arXiv Detail & Related papers (2025-07-21T09:20:43Z) - MultiPhishGuard: An LLM-based Multi-Agent System for Phishing Email Detection [3.187381965457262]
MultiPhishGuard is a dynamic multi-agent detection system that synergizes specialized expertise with adversarial-aware reinforcement learning.<n>Our framework employs five cooperative agents with automatically adjusted decision weights powered by a Proximal Policy Optimization reinforcement learning algorithm.<n>Experiments demonstrate that MultiPhishGuard achieves high accuracy (97.89%) with low false positive (2.73%) and false negative rates (0.20%)
arXiv Detail & Related papers (2025-05-26T23:27:15Z) - AegisLLM: Scaling Agentic Systems for Self-Reflective Defense in LLM Security [74.22452069013289]
AegisLLM is a cooperative multi-agent defense against adversarial attacks and information leakage.<n>We show that scaling agentic reasoning system at test-time substantially enhances robustness without compromising model utility.<n> Comprehensive evaluations across key threat scenarios, including unlearning and jailbreaking, demonstrate the effectiveness of AegisLLM.
arXiv Detail & Related papers (2025-04-29T17:36:05Z) - Neural Antidote: Class-Wise Prompt Tuning for Purifying Backdoors in CLIP [51.04452017089568]
Class-wise Backdoor Prompt Tuning (CBPT) is an efficient and effective defense mechanism that operates on text prompts to indirectly purify CLIP.<n>CBPT significantly mitigates backdoor threats while preserving model utility.
arXiv Detail & Related papers (2025-02-26T16:25:15Z) - PhishAgent: A Robust Multimodal Agent for Phishing Webpage Detection [26.106113544525545]
Phishing attacks are a major threat to online security, exploiting user vulnerabilities to steal sensitive information.<n>Various methods have been developed to counteract phishing, each with varying levels of accuracy, but they also face notable limitations.<n>In this study, we introduce PhishAgent, a multimodal agent that combines a wide range of tools, integrating both online and offline knowledge bases with Multimodal Large Language Models (MLLMs)<n>This combination leads to broader brand coverage, which enhances brand recognition and recall.
arXiv Detail & Related papers (2024-08-20T11:14:21Z) - AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases [73.04652687616286]
We propose AgentPoison, the first backdoor attack targeting generic and RAG-based LLM agents by poisoning their long-term memory or RAG knowledge base.
Unlike conventional backdoor attacks, AgentPoison requires no additional model training or fine-tuning.
On each agent, AgentPoison achieves an average attack success rate higher than 80% with minimal impact on benign performance.
arXiv Detail & Related papers (2024-07-17T17:59:47Z) - Dissecting Adversarial Robustness of Multimodal LM Agents [70.2077308846307]
We manually create 200 targeted adversarial tasks and evaluation scripts in a realistic threat model on top of VisualWebArena.<n>We find that we can successfully break latest agents that use black-box frontier LMs, including those that perform reflection and tree search.<n>We also use ARE to rigorously evaluate how the robustness changes as new components are added.
arXiv Detail & Related papers (2024-06-18T17:32:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.