Related papers: CLASP: Cost-Optimized LLM-based Agentic System for Phishing Detection

CLASP: Cost-Optimized LLM-based Agentic System for Phishing Detection

URL: http://arxiv.org/abs/2510.18585v1
Date: Tue, 21 Oct 2025 12:38:52 GMT
Title: CLASP: Cost-Optimized LLM-based Agentic System for Phishing Detection
Authors: Fouad Trad, Ali Chehab,
Abstract summary: We present CLASP, a novel system that effectively identifies phishing websites by leveraging multiple intelligent agents.<n>The system processes URLs or QR codes, employing specialized LLM-based agents that evaluate the URL structure, webpage screenshot, and HTML content.<n>CLASP surpasses leading previous solutions, achieving over 40% higher recall and a 20% improvement in F1 score for phishing detection on the collected dataset.
Score: 0.8737375836744933
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Phishing websites remain a significant cybersecurity threat, necessitating accurate and cost-effective detection mechanisms. In this paper, we present CLASP, a novel system that effectively identifies phishing websites by leveraging multiple intelligent agents, built using large language models (LLMs), to analyze different aspects of a web resource. The system processes URLs or QR codes, employing specialized LLM-based agents that evaluate the URL structure, webpage screenshot, and HTML content to predict potential phishing threats. To optimize performance while minimizing operational costs, we experimented with multiple combination strategies for agent-based analysis, ultimately designing a strategic combination that ensures the per-website evaluation expense remains minimal without compromising detection accuracy. We tested various LLMs, including Gemini 1.5 Flash and GPT-4o mini, to build these agents and found that Gemini 1.5 Flash achieved the best performance with an F1 score of 83.01% on a newly curated dataset. Also, the system maintained an average processing time of 2.78 seconds per website and an API cost of around $3.18 per 1,000 websites. Moreover, CLASP surpasses leading previous solutions, achieving over 40% higher recall and a 20% improvement in F1 score for phishing detection on the collected dataset. To support further research, we have made our dataset publicly available, supporting the development of more advanced phishing detection systems.

Related papers

MemoPhishAgent: Memory-Augmented Multi-Modal LLM Agent for Phishing URL Detection [15.810091292280584]
MemoPhishAgent (MPA) is a memory-augmented multi-modal LLM agent that orchestrates phishing-specific tools.<n>MPA outperforms three state-of-the-art (SOTA) baselines, improving recall by 13.6%.<n>MPA is deployed in production, processing 60K targeted high-risk URLs weekly, and achieving 91.44% recall.
arXiv Detail & Related papers (2026-02-24T21:50:46Z)
Benchmarking Large Language Models for Zero-shot and Few-shot Phishing URL Detection [0.0]
Deceptive URLs have reached unprecedented sophistication due to the widespread use of generative AI by cybercriminals.<n> phishing volume has escalated over 4,000% since 2022, with nearly 50% more attacks evading detection.<n>We present a benchmark of LLMs under a unified zero-shot and few-shot prompting framework.
arXiv Detail & Related papers (2026-02-02T18:56:06Z)
FocusAgent: Simple Yet Effective Ways of Trimming the Large Context of Web Agents [76.12500510390439]
Web agents powered by large language models (LLMs) must process lengthy web page observations to complete user goals.<n>Existing pruning strategies either discard relevant content or retain irrelevant context, leading to suboptimal action prediction.<n>We introduce FocusAgent, a simple yet effective approach that leverages a lightweight LLM retriever to extract the most relevant lines from accessibility tree (AxTree) observations.
arXiv Detail & Related papers (2025-10-03T17:41:30Z)
Characterizing Phishing Pages by JavaScript Capabilities [77.64740286751834]
This paper aims to aid researchers and analysts by automatically differentiating groups of phishing pages based on the underlying kit.<n>For kit detection, our system has an accuracy of 97% on a ground-truth dataset of 548 kit families deployed across 4,562 phishing URLs.<n>We find that UI interactivity and basic fingerprinting are universal techniques, present in 90% and 80% of the clusters.
arXiv Detail & Related papers (2025-09-16T15:39:23Z)
VulAgent: Hypothesis-Validation based Multi-Agent Vulnerability Detection [55.957275374847484]
VulAgent is a multi-agent vulnerability detection framework based on hypothesis validation.<n>It implements a semantics-sensitive, multi-view detection pipeline, each aligned to a specific analysis perspective.<n>On average, VulAgent improves overall accuracy by 6.6%, increases the correct identification rate of vulnerable--fixed code pairs by up to 450%, and reduces the false positive rate by about 36%.
arXiv Detail & Related papers (2025-09-15T02:25:38Z)
MCP-Orchestrated Multi-Agent System for Automated Disinformation Detection [84.75972919995398]
This paper presents a multi-agent system that uses relation extraction to detect disinformation in news articles.<n>The proposed Agentic AI system combines four agents: (i) a machine learning agent (logistic regression), (ii) a Wikipedia knowledge check agent, and (iv) a web-scraped data analyzer.<n>Results demonstrate that the multi-agent ensemble achieves 95.3% accuracy with an F1 score of 0.964, significantly outperforming individual agents and traditional approaches.
arXiv Detail & Related papers (2025-08-13T19:14:48Z)
Which Agent Causes Task Failures and When? On Automated Failure Attribution of LLM Multi-Agent Systems [50.29939179830491]
Failure attribution in LLM multi-agent systems remains underexplored and labor-intensive.<n>We develop and evaluate three automated failure attribution methods, summarizing their corresponding pros and cons.<n>The best method achieves 53.5% accuracy in identifying failure-responsible agents but only 14.2% in pinpointing failure steps.
arXiv Detail & Related papers (2025-04-30T23:09:44Z)
Large Multimodal Agents for Accurate Phishing Detection with Enhanced Token Optimization and Cost Reduction [2.8161155726745237]
This paper explores the use of large multimodal agents, specifically Gemini 1.5 Flash and GPT-4o mini, to analyze both URLs and webpage screenshots via APIs.<n>We propose a two-tiered agentic approach: one agent assesses the URL, and if inconclusive, a second agent evaluates both the URL and the screenshot.<n>Cost analysis shows that with the agentic approach, GPT-4o mini can process about 4.2 times as many websites per $100 compared to the multimodal approach.
arXiv Detail & Related papers (2024-12-03T09:13:52Z)
Dissecting Adversarial Robustness of Multimodal LM Agents [70.2077308846307]
We manually create 200 targeted adversarial tasks and evaluation scripts in a realistic threat model on top of VisualWebArena.<n>We find that we can successfully break latest agents that use black-box frontier LMs, including those that perform reflection and tree search.<n>We also use ARE to rigorously evaluate how the robustness changes as new components are added.
arXiv Detail & Related papers (2024-06-18T17:32:48Z)
Position Paper: Think Globally, React Locally -- Bringing Real-time Reference-based Website Phishing Detection on macOS [0.4962561299282114]
The recent surge in phishing attacks keeps undermining the effectiveness of the traditional anti-phishing blacklist approaches. On-device anti-phishing solutions are gaining popularity as they offer faster phishing detection locally. We propose a phishing detection solution that uses a combination of computer vision and on-device machine learning models to analyze websites in real time.
arXiv Detail & Related papers (2024-05-28T14:46:03Z)
A Sophisticated Framework for the Accurate Detection of Phishing Websites [0.0]
Phishing is an increasingly sophisticated form of cyberattack that is inflicting huge financial damage to corporations throughout the globe. This paper proposes a comprehensive methodology for detecting phishing websites. A combination of feature selection, greedy algorithm, cross-validation, and deep learning methods have been utilized to construct a sophisticated stacking ensemble.
arXiv Detail & Related papers (2024-03-13T14:26:25Z)
Reliability and Robustness analysis of Machine Learning based Phishing URL Detectors [0.0]
ML-based Phishing URL (MLPU) detectors serve as the first level of defence to protect users and organisations from being victims of phishing attacks. We have proposed a methodology to investigate the reliability and robustness of 50 representative state-of-the-artU models. We analyzed their robustness and reliability using box plots and heat maps.
arXiv Detail & Related papers (2020-05-18T04:24:42Z)

This list is automatically generated from the titles and abstracts of the papers in this site.