PhishIntentionLLM: Uncovering Phishing Website Intentions through Multi-Agent Retrieval-Augmented Generation
- URL: http://arxiv.org/abs/2507.15419v1
- Date: Mon, 21 Jul 2025 09:20:43 GMT
- Title: PhishIntentionLLM: Uncovering Phishing Website Intentions through Multi-Agent Retrieval-Augmented Generation
- Authors: Wenhao Li, Selvakumar Manickam, Yung-wey Chong, Shankar Karuppayah,
- Abstract summary: We propose PhishIntentionLLM, a framework that uncovers phishing intentions from website screenshots. Our framework identifies four key phishing objectives: Credential Theft, Financial Fraud, Malware Distribution, and Personal Information Harvesting. We generate a larger dataset of 9K samples for large-scale phishing intention profiling across sectors.
- Score: 13.177607247367211
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Phishing websites remain a major cybersecurity threat, yet existing methods primarily focus on detection, while the recognition of underlying malicious intentions remains largely unexplored. To address this gap, we propose PhishIntentionLLM, a multi-agent retrieval-augmented generation (RAG) framework that uncovers phishing intentions from website screenshots. Leveraging the visual-language capabilities of large language models (LLMs), our framework identifies four key phishing objectives: Credential Theft, Financial Fraud, Malware Distribution, and Personal Information Harvesting. We construct and release the first phishing intention ground truth dataset (~2K samples) and evaluate the framework using four commercial LLMs. Experimental results show that PhishIntentionLLM achieves a micro-precision of 0.7895 with GPT-4o and significantly outperforms the single-agent baseline with a ~95% improvement in micro-precision. Compared to the previous work, it achieves 0.8545 precision for credential theft, marking a ~4% improvement. Additionally, we generate a larger dataset of ~9K samples for large-scale phishing intention profiling across sectors. This work provides a scalable and interpretable solution for intention-aware phishing analysis.
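As a minimal illustration of the micro-precision metric reported above, the following sketch (not the authors' code) treats intention recognition as a multi-label task over the four stated objectives; the toy prediction arrays are invented for demonstration.

```python
# Hedged sketch: micro-precision over the four phishing-intention labels.
# The label set comes from the abstract; the toy data below is made up.
import numpy as np
from sklearn.metrics import precision_score

LABELS = ["Credential Theft", "Financial Fraud",
          "Malware Distribution", "Personal Information Harvesting"]

# Multi-label indicator matrices: rows = screenshots, columns = intentions.
y_true = np.array([[1, 0, 0, 1],
                   [0, 1, 0, 0],
                   [1, 0, 1, 0]])
y_pred = np.array([[1, 0, 0, 0],
                   [0, 1, 0, 1],
                   [1, 0, 1, 0]])

# Micro-precision pools true/false positives across all four labels,
# which is the aggregation the abstract reports (e.g., 0.7895 with GPT-4o).
micro_p = precision_score(y_true, y_pred, average="micro")
per_label = precision_score(y_true, y_pred, average=None)
print(f"micro-precision: {micro_p:.4f}")
print(dict(zip(LABELS, per_label.round(4))))
```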
Related papers
- CyberGym: Evaluating AI Agents' Cybersecurity Capabilities with Real-World Vulnerabilities at Scale [46.76144797837242]
Large language model (LLM) agents are becoming increasingly skilled at handling cybersecurity tasks autonomously. Existing benchmarks fall short, often failing to capture real-world scenarios or being limited in scope. We introduce CyberGym, a large-scale and high-quality cybersecurity evaluation framework featuring 1,507 real-world vulnerabilities.
arXiv Detail & Related papers (2025-06-03T07:35:14Z)
- Phishing URL Detection using Bi-LSTM [0.0]
This paper proposes a deep learning-based approach to classify URLs into four categories: benign, phishing, defacement, and malware. Experimental results on a dataset comprising over 650,000 URLs demonstrate the model's effectiveness, achieving 97% accuracy and significant improvements over traditional techniques.
arXiv Detail & Related papers (2025-04-29T00:55:01Z)
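The entry above names only the architecture and the four URL classes, so the following is a hedged sketch of a character-level Bi-LSTM classifier in Keras; vocabulary size, sequence length, and layer widths are assumptions rather than the paper's settings.

```python
# Hedged sketch of a character-level Bi-LSTM URL classifier over the four
# classes named above (benign, phishing, defacement, malware).
# All hyperparameters are illustrative assumptions, not the paper's values.
import tensorflow as tf

VOCAB_SIZE = 128   # assumed: printable ASCII characters
MAX_LEN = 200      # assumed: URLs padded/truncated to 200 characters
NUM_CLASSES = 4    # benign, phishing, defacement, malware

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(MAX_LEN,)),
    tf.keras.layers.Embedding(VOCAB_SIZE, 32),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
# Training would call model.fit(X, y) on integer-encoded URL characters.
```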
- EXPLICATE: Enhancing Phishing Detection through Explainable AI and LLM-Powered Interpretability [44.2907457629342]
EXPLICATE is a framework that enhances phishing detection through a three-component architecture. It is on par with existing deep learning techniques but offers better explainability. It addresses the critical divide between automated AI and user trust in phishing detection systems.
arXiv Detail & Related papers (2025-03-22T23:37:35Z)
- Benchmarking Reasoning Robustness in Large Language Models [76.79744000300363]
This paper introduces a novel benchmark, termed Math-RoB, that exploits hallucinations triggered by missing information to expose reasoning gaps. We find significant performance degradation on novel or incomplete data. These findings highlight the reliance on recall over rigorous logical inference.
arXiv Detail & Related papers (2025-03-06T15:36:06Z)
- PEEK: Phishing Evolution Framework for Phishing Generation and Evolving Pattern Analysis using Large Language Models [10.455333111937598]
Phishing remains a pervasive cyber threat, as attackers craft deceptive emails to lure victims into revealing sensitive information. Deep learning has become a key component in defending against phishing attacks, but these approaches face critical limitations. We propose the first Phishing Evolution FramEworK (PEEK) for augmenting phishing email datasets with respect to quality and diversity.
arXiv Detail & Related papers (2024-11-18T09:03:51Z)
- PhishAgent: A Robust Multimodal Agent for Phishing Webpage Detection [26.106113544525545]
Phishing attacks are a major threat to online security, exploiting user vulnerabilities to steal sensitive information. Various methods have been developed to counteract phishing, each with varying levels of accuracy, but they also face notable limitations. In this study, we introduce PhishAgent, a multimodal agent that combines a wide range of tools, integrating both online and offline knowledge bases with Multimodal Large Language Models (MLLMs). This combination leads to broader brand coverage, which enhances brand recognition and recall.
arXiv Detail & Related papers (2024-08-20T11:14:21Z)
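To illustrate the offline brand knowledge base idea mentioned in the PhishAgent entry, here is a toy sketch that matches text scraped from a suspect page against known brand descriptions via TF-IDF cosine similarity; the brand list and descriptions are hypothetical, and this is not PhishAgent's actual retrieval stack.

```python
# Toy illustration (not PhishAgent's implementation) of an offline brand
# knowledge base: match text from a suspect page against known brands
# with TF-IDF cosine similarity; the brand entries are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

BRAND_KB = {
    "PayPal": "paypal secure payment login account wallet",
    "Microsoft": "microsoft office 365 outlook onedrive sign in",
    "DHL": "dhl parcel tracking shipment delivery notification",
}

def match_brand(page_text: str):
    names = list(BRAND_KB)
    vec = TfidfVectorizer().fit(list(BRAND_KB.values()) + [page_text])
    kb_mat = vec.transform(BRAND_KB.values())
    sims = cosine_similarity(vec.transform([page_text]), kb_mat)[0]
    best = sims.argmax()
    return names[best], float(sims[best])

brand, score = match_brand(
    "Sign in to your PayPal wallet account to confirm a secure payment")
# A recognized brand that does not match the page's domain is a phishing cue.
print(brand, round(score, 3))
```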
- Automated Phishing Detection Using URLs and Webpages [35.66275851732625]
This project addresses the constraints of traditional reference-based phishing detection by developing an LLM agent framework.
The agent harnesses Large Language Models to actively fetch and utilize online information.
Our approach achieves an accuracy of 0.945, significantly outperforming the existing solution (DynaPhish) by 0.445.
arXiv Detail & Related papers (2024-08-03T05:08:27Z)
- From ML to LLM: Evaluating the Robustness of Phishing Webpage Detection Models against Adversarial Attacks [0.8050163120218178]
Phishing attacks attempt to deceive users in order to steal sensitive information, posing a significant cybersecurity threat. We develop PhishOracle, a tool that generates adversarial phishing webpages by embedding diverse phishing features into legitimate webpages. Our findings highlight the vulnerability of phishing detection models to adversarial attacks, emphasizing the need for more robust detection approaches.
arXiv Detail & Related papers (2024-07-29T18:21:34Z)
- G$^2$uardFL: Safeguarding Federated Learning Against Backdoor Attacks through Attributed Client Graph Clustering [116.4277292854053]
Federated Learning (FL) offers collaborative model training without data sharing.
FL is vulnerable to backdoor attacks, where poisoned model weights lead to compromised system integrity.
We present G$^2$uardFL, a protective framework that reinterprets the identification of malicious clients as an attributed graph clustering problem.
arXiv Detail & Related papers (2023-06-08T07:15:04Z)
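The G$^2$uardFL entry frames malicious-client identification as attributed graph clustering; the sketch below illustrates that general idea on synthetic client updates using a cosine-similarity graph and spectral clustering, and is not the paper's algorithm.

```python
# Toy sketch of treating malicious-client identification as a clustering
# problem over client model updates; NOT G$^2$uardFL's algorithm, just an
# illustration on synthetic update vectors.
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.metrics.pairwise import cosine_similarity

rng = np.random.default_rng(0)
benign = rng.normal(1.0, 0.1, size=(8, 16))     # 8 benign clients, similar direction
poisoned = rng.normal(-1.0, 0.1, size=(2, 16))  # 2 backdoored clients, drifted updates
updates = np.vstack([benign, poisoned])

# Build a client-similarity graph from pairwise cosine similarity of the
# updates and cluster it into two groups.
affinity = (cosine_similarity(updates) + 1.0) / 2.0  # map [-1, 1] -> [0, 1]
labels = SpectralClustering(n_clusters=2, affinity="precomputed",
                            random_state=0).fit_predict(affinity)

# Flag the smaller cluster as suspicious (minority of clients).
minority = np.argmin(np.bincount(labels))
print("suspected malicious clients:", np.where(labels == minority)[0])
```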
- DRSM: De-Randomized Smoothing on Malware Classifier Providing Certified Robustness [58.23214712926585]
We develop a certified defense, DRSM (De-Randomized Smoothed MalConv), by redesigning the de-randomized smoothing technique for the domain of malware detection.
Specifically, we propose a window ablation scheme to provably limit the impact of adversarial bytes while maximally preserving local structures of the executables.
We are the first to offer certified robustness in the realm of static detection of malware executables.
arXiv Detail & Related papers (2023-03-20T17:25:22Z)
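To make the window ablation scheme in the DRSM entry concrete, the following hedged sketch classifies many ablated copies of a byte sequence, each retaining only one contiguous window, and majority-votes the results; the base classifier is a placeholder rather than MalConv, and the window size and padding token are assumptions.

```python
# Hedged sketch of window ablation + majority voting in the spirit of
# de-randomized smoothing for byte sequences. `toy_classifier` is a
# placeholder, not MalConv; window size and padding value are assumptions.
import numpy as np

WINDOW = 512      # assumed ablation window size (bytes)
PAD_BYTE = 256    # assumed "ablated" token outside the byte range 0-255

def ablate(x: np.ndarray, start: int, window: int) -> np.ndarray:
    """Keep only x[start:start+window]; replace everything else with PAD_BYTE."""
    out = np.full_like(x, PAD_BYTE)
    out[start:start + window] = x[start:start + window]
    return out

def toy_classifier(x: np.ndarray) -> int:
    # Placeholder base classifier: flags a window containing a "suspicious"
    # byte value. A real DRSM-style defense would use a trained model here.
    return int(np.any(x[x != PAD_BYTE] == 0x90))

def smoothed_predict(x: np.ndarray) -> int:
    votes = [toy_classifier(ablate(x, s, WINDOW))
             for s in range(0, len(x), WINDOW)]
    # Majority vote over per-window predictions; adversarial bytes can only
    # influence the windows they fall into, which bounds their effect.
    return int(sum(votes) > len(votes) / 2)

sample = np.random.default_rng(1).integers(0, 256, size=4096)
print("smoothed prediction:", smoothed_predict(sample))
```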
- Towards Web Phishing Detection Limitations and Mitigation [21.738240693843295]
We show how phishing sites bypass Machine Learning-based detection.
We propose Anti-SubtlePhish, a more resilient model based on logistic regression.
Experiments with 100K phishing/benign sites show promising accuracy (98.8%).
arXiv Detail & Related papers (2022-04-03T04:26:04Z)
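Since the last entry only states that Anti-SubtlePhish is based on logistic regression, here is a toy sketch of training such a classifier on numeric page/URL features with scikit-learn; the feature names and synthetic data are hypothetical, not the paper's feature set.

```python
# Toy sketch of a logistic-regression phishing classifier in the spirit of
# the entry above; the features and synthetic data are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(42)
n = 1000
# Hypothetical features: URL length, count of '-' in domain, has_https flag.
X = np.column_stack([
    rng.normal(60, 20, n),    # url_length
    rng.poisson(1.0, n),      # hyphen_count
    rng.integers(0, 2, n),    # has_https
])
# Synthetic labels loosely correlated with the features (illustration only).
logits = 0.03 * (X[:, 0] - 60) + 0.8 * X[:, 1] - 1.2 * X[:, 2]
y = (rng.random(n) < 1 / (1 + np.exp(-logits))).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("test accuracy:", round(accuracy_score(y_te, clf.predict(X_te)), 3))
```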
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.