Related papers: MemoPhishAgent: Memory-Augmented Multi-Modal LLM Agent for Phishing URL Detection

URL: http://arxiv.org/abs/2602.21394v2
Date: Thu, 26 Feb 2026 02:32:50 GMT
Title: MemoPhishAgent: Memory-Augmented Multi-Modal LLM Agent for Phishing URL Detection
Authors: Xuan Chen, Hao Liu, Tao Yuan, Mehran Kafai, Piotr Habas, Xiangyu Zhang
Abstract summary: MemoPhishAgent (MPA) is a memory-augmented multi-modal LLM agent that orchestrates phishing-specific tools. MPA outperforms three state-of-the-art (SOTA) baselines, improving recall by 13.6%. MPA is deployed in production, processing 60K targeted high-risk URLs weekly, and achieving 91.44% recall.
Score: 15.810091292280584
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Traditional phishing website detection relies on static heuristics or reference lists, which lag behind rapidly evolving attacks. While recent systems incorporate large language models (LLMs), they are still prompt-based, deterministic pipelines that underutilize reasoning capability. We present MemoPhishAgent (MPA), a memory-augmented multi-modal LLM agent that dynamically orchestrates phishing-specific tools and leverages episodic memories of past reasoning trajectories to guide decisions on recurring and novel threats. On two public datasets, MPA outperforms three state-of-the-art (SOTA) baselines, improving recall by 13.6%. To better reflect realistic, user-facing phishing detection performance, we further evaluate MPA on a benchmark of real-world suspicious URLs actively crawled from five social media platforms, where it improves recall by 20%. Detailed analysis shows episodic memory contributes up to 27% recall gain without introducing additional computational overhead. The ablation study confirms the necessity of the agent-based approach compared to prompt-based baselines and validates the effectiveness of our tool design. Finally, MPA is deployed in production, processing 60K targeted high-risk URLs weekly, and achieving 91.44% recall, providing proactive protection for millions of customers. Together, our results show that combining multi-modal reasoning with episodic memory yields robust phishing detection in realistic user-exposure settings.

Related papers

Clouding the Mirror: Stealthy Prompt Injection Attacks Targeting LLM-based Phishing Detection [0.7391823486666542]
Large language models (LLMs) are vulnerable to prompt injection (PI). This paper presents the first comprehensive evaluation of PI against multimodal LLM-based phishing detection. We propose InjectDefuser, a defense framework that combines prompt hardening, allowlist-based retrieval augmentation, and output validation.
arXiv Detail & Related papers (2026-02-05T09:44:20Z)
ReasoningBomb: A Stealthy Denial-of-Service Attack by Inducing Pathologically Long Reasoning in Large Reasoning Models [67.15960154375131]
Large reasoning models (LRMs) extend large language models with explicit multi-step reasoning traces. This capability introduces a new class of prompt-induced inference-time denial-of-service (PI-DoS) attacks that exploit the high computational cost of reasoning. We present ReasoningBomb, a reinforcement-learning-based PI-DoS framework that is guided by a constant-time surrogate reward.
arXiv Detail & Related papers (2026-01-29T18:53:01Z)
The Trojan Knowledge: Bypassing Commercial LLM Guardrails via Harmless Prompt Weaving and Adaptive Tree Search [58.8834056209347]
Large language models (LLMs) remain vulnerable to jailbreak attacks that bypass safety guardrails to elicit harmful outputs. We introduce the Correlated Knowledge Attack Agent (CKA-Agent), a dynamic framework that reframes jailbreaking as an adaptive, tree-structured exploration of the target model's knowledge base.
arXiv Detail & Related papers (2025-12-01T07:05:23Z)
CLASP: Cost-Optimized LLM-based Agentic System for Phishing Detection [0.8737375836744933]
We present CLASP, a novel system that effectively identifies phishing websites by leveraging multiple intelligent agents. The system processes URLs or QR codes, employing specialized LLM-based agents that evaluate the URL structure, webpage screenshot, and HTML content. CLASP surpasses leading previous solutions, achieving over 40% higher recall and a 20% improvement in F1 score for phishing detection on the collected dataset.
arXiv Detail & Related papers (2025-10-21T12:38:52Z)
PhishIntentionLLM: Uncovering Phishing Website Intentions through Multi-Agent Retrieval-Augmented Generation [13.177607247367211]
We propose PhishIntentionLLM, a framework that uncovers phishing intentions from website screenshots. Our framework identifies four key phishing objectives: Credential Theft, Financial Fraud, Malware Distribution, and Personal Information Harvesting. We generate a larger dataset of 9K samples for large-scale phishing intention profiling across sectors.
arXiv Detail & Related papers (2025-07-21T09:20:43Z)
MultiPhishGuard: An LLM-based Multi-Agent System for Phishing Email Detection [3.187381965457262]
MultiPhishGuard is a dynamic multi-agent detection system that synergizes specialized expertise with adversarial-aware reinforcement learning. Our framework employs five cooperative agents with automatically adjusted decision weights powered by a Proximal Policy Optimization reinforcement learning algorithm. Experiments demonstrate that MultiPhishGuard achieves high accuracy (97.89%) with low false positive (2.73%) and false negative rates (0.20%).
arXiv Detail & Related papers (2025-05-26T23:27:15Z)
AegisLLM: Scaling Agentic Systems for Self-Reflective Defense in LLM Security [74.22452069013289]
AegisLLM is a cooperative multi-agent defense against adversarial attacks and information leakage. We show that scaling the agentic reasoning system at test time substantially enhances robustness without compromising model utility. Comprehensive evaluations across key threat scenarios, including unlearning and jailbreaking, demonstrate the effectiveness of AegisLLM.
arXiv Detail & Related papers (2025-04-29T17:36:05Z)
Neural Antidote: Class-Wise Prompt Tuning for Purifying Backdoors in CLIP [51.04452017089568]
Class-wise Backdoor Prompt Tuning (CBPT) is an efficient and effective defense mechanism that operates on text prompts to indirectly purify CLIP. CBPT significantly mitigates backdoor threats while preserving model utility.
arXiv Detail & Related papers (2025-02-26T16:25:15Z)
PhishAgent: A Robust Multimodal Agent for Phishing Webpage Detection [26.106113544525545]
Phishing attacks are a major threat to online security, exploiting user vulnerabilities to steal sensitive information. Various methods have been developed to counteract phishing, each with varying levels of accuracy, but they also face notable limitations. In this study, we introduce PhishAgent, a multimodal agent that combines a wide range of tools, integrating both online and offline knowledge bases with Multimodal Large Language Models (MLLMs). This combination leads to broader brand coverage, which enhances brand recognition and recall.
arXiv Detail & Related papers (2024-08-20T11:14:21Z)
AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases [73.04652687616286]
We propose AgentPoison, the first backdoor attack targeting generic and RAG-based LLM agents by poisoning their long-term memory or RAG knowledge base. Unlike conventional backdoor attacks, AgentPoison requires no additional model training or fine-tuning. On each agent, AgentPoison achieves an average attack success rate higher than 80% with minimal impact on benign performance.
arXiv Detail & Related papers (2024-07-17T17:59:47Z)
Dissecting Adversarial Robustness of Multimodal LM Agents [70.2077308846307]
We manually create 200 targeted adversarial tasks and evaluation scripts in a realistic threat model on top of VisualWebArena. We find that we can successfully break latest agents that use black-box frontier LMs, including those that perform reflection and tree search. We also use ARE to rigorously evaluate how the robustness changes as new components are added.
arXiv Detail & Related papers (2024-06-18T17:32:48Z)

This list is automatically generated from the titles and abstracts of the papers in this site.