"Explain, Don't Just Warn!" -- A Real-Time Framework for Generating Phishing Warnings with Contextual Cues
- URL: http://arxiv.org/abs/2505.06836v2
- Date: Fri, 16 May 2025 21:06:08 GMT
- Title: "Explain, Don't Just Warn!" -- A Real-Time Framework for Generating Phishing Warnings with Contextual Cues
- Authors: Sayak Saha Roy, Cesar Torres, Shirin Nilizadeh,
- Abstract summary: Anti-phishing tools typically display generic warnings that offer users limited explanation on why a website is considered malicious.<n>We present PhishXplain, a real-time explainable phishing warning system designed to augment existing detection mechanisms.
- Score: 2.6818118216403497
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Anti-phishing tools typically display generic warnings that offer users limited explanation on why a website is considered malicious, which can prevent end-users from developing the mental models needed to recognize phishing cues on their own. This becomes especially problematic when these tools inevitably fail - particularly against evasive threats, and users are found to be ill-equipped to identify and avoid them independently. To address these limitations, we present PhishXplain (PXP), a real-time explainable phishing warning system designed to augment existing detection mechanisms. PXP empowers users by clearly articulating why a site is flagged as malicious, highlighting suspicious elements using a memory-efficient implementation of LLaMA 3.2. It utilizes a structured two-step prompt architecture to identify phishing features, generate contextual explanations, and render annotated screenshots that visually reinforce the warning. Longitudinally implementing PhishXplain over a month on 7,091 live phishing websites, we found that it can generate warnings for 94% of the sites, with a correctness of 96%. We also evaluated PhishXplain through a user study with 150 participants split into two groups: one received conventional, generic warnings, while the other interacted with PXP's explainable alerts. Participants who received the explainable warnings not only demonstrated a significantly better understanding of phishing indicators but also achieved higher accuracy in identifying phishing threats, even without any warning. Moreover, they reported greater satisfaction and trust in the warnings themselves. These improvements were especially pronounced among users with lower initial levels of cybersecurity proficiency and awareness. To encourage the adoption of this framework, we release PhishXplain as a browser extension.
Related papers
- Can Large Language Models Improve Phishing Defense? A Large-Scale Controlled Experiment on Warning Dialogue Explanations [2.854118480747787]
Phishing is a prominent risk in modern cybersecurity, often used to bypass technological defences by exploiting predictable human behaviour.<n>Warning dialogues are a standard mitigation measure, but the lack of explanatory clarity and static content limits their effectiveness.<n>We report on our research to assess the capacity of Large Language Models to generate clear, concise, and scalable explanations for phishing warnings.
arXiv Detail & Related papers (2025-07-10T16:54:05Z) - Attention Slipping: A Mechanistic Understanding of Jailbreak Attacks and Defenses in LLMs [61.916827858666906]
We reveal a universal phenomenon that occurs during jailbreak attacks: Attention Slipping.<n>We show Attention Slipping is consistent across various jailbreak methods, including gradient-based token replacement, prompt-level template refinement, and in-context learning.<n>We propose Attention Sharpening, a new defense that directly counters Attention Slipping by sharpening the attention score distribution using temperature scaling.
arXiv Detail & Related papers (2025-07-06T12:19:04Z) - EXPLICATE: Enhancing Phishing Detection through Explainable AI and LLM-Powered Interpretability [44.2907457629342]
EXPLICATE is a framework that enhances phishing detection through a three-component architecture.<n>It is on par with existing deep learning techniques but has better explainability.<n>It addresses the critical divide between automated AI and user trust in phishing detection systems.
arXiv Detail & Related papers (2025-03-22T23:37:35Z) - Layer-Level Self-Exposure and Patch: Affirmative Token Mitigation for Jailbreak Attack Defense [55.77152277982117]
We introduce Layer-AdvPatcher, a methodology designed to defend against jailbreak attacks.<n>We use an unlearning strategy to patch specific layers within large language models through self-augmented datasets.<n>Our framework reduces the harmfulness and attack success rate of jailbreak attacks.
arXiv Detail & Related papers (2025-01-05T19:06:03Z) - MASKDROID: Robust Android Malware Detection with Masked Graph Representations [56.09270390096083]
We propose MASKDROID, a powerful detector with a strong discriminative ability to identify malware.
We introduce a masking mechanism into the Graph Neural Network based framework, forcing MASKDROID to recover the whole input graph.
This strategy enables the model to understand the malicious semantics and learn more stable representations, enhancing its robustness against adversarial attacks.
arXiv Detail & Related papers (2024-09-29T07:22:47Z) - Eyes on the Phish(er): Towards Understanding Users' Email Processing Pattern and Mental Models in Phishing Detection [0.4543820534430522]
This study examines how workload affects susceptibility to phishing.
We use eye-tracking technology to observe participants' reading patterns and interactions with phishing emails.
Our results provide concrete evidence that attention to the email sender can reduce phishing susceptibility.
arXiv Detail & Related papers (2024-09-12T02:57:49Z) - Protecting Onion Service Users Against Phishing [1.6435014180036467]
Phishing websites are a common phenomenon among Tor onion services.
phishers exploit that it is tremendously difficult to distinguish phishing from authentic onion domain names.
Operators of onion services devised several strategies to protect their users against phishing.
None protect users against phishing without producing traces about visited services.
arXiv Detail & Related papers (2024-08-14T19:51:30Z) - PhishLang: A Real-Time, Fully Client-Side Phishing Detection Framework Using MobileBERT [3.014087730099599]
We introduce PhishLang, the first fully client-side anti-phishing framework built on a lightweight ensemble framework.<n> PhishLang employs a multi-modal ensemble approach, combining both the URL and Source detection models.<n>We release PhishLang as a Chromium browser extension and also open-source the framework to aid the research community.
arXiv Detail & Related papers (2024-08-11T01:14:13Z) - From ML to LLM: Evaluating the Robustness of Phishing Webpage Detection Models against Adversarial Attacks [0.8050163120218178]
Phishing attacks attempt to deceive users into stealing sensitive information.<n>Current detection solutions remain vulnerable to adversarial attacks.<n>We develop a tool that generates adversarial phishing webpages by embedding diverse phishing features into legitimate webpages.
arXiv Detail & Related papers (2024-07-29T18:21:34Z) - "Are Adversarial Phishing Webpages a Threat in Reality?" Understanding the Users' Perception of Adversarial Webpages [21.474375992224633]
Machine learning based phishing website detectors (ML-PWD) are a critical part of today's anti-phishing solutions in operation.
We show that adversarial phishing is a threat to both users and ML-PWD.
We also show that users' self-reported frequency of visiting a brand's website has a statistically negative correlation with their phishing detection accuracy.
arXiv Detail & Related papers (2024-04-03T16:10:17Z) - Certifying LLM Safety against Adversarial Prompting [70.96868018621167]
Large language models (LLMs) are vulnerable to adversarial attacks that add malicious tokens to an input prompt.<n>We introduce erase-and-check, the first framework for defending against adversarial prompts with certifiable safety guarantees.
arXiv Detail & Related papers (2023-09-06T04:37:20Z) - Backdoor Attack against Speaker Verification [86.43395230456339]
We show that it is possible to inject the hidden backdoor for infecting speaker verification models by poisoning the training data.
We also demonstrate that existing backdoor attacks cannot be directly adopted in attacking speaker verification.
arXiv Detail & Related papers (2020-10-22T11:10:08Z) - Phishing and Spear Phishing: examples in Cyber Espionage and techniques
to protect against them [91.3755431537592]
Phishing attacks have become the most used technique in the online scams, initiating more than 91% of cyberattacks, from 2012 onwards.
This study reviews how Phishing and Spear Phishing attacks are carried out by the phishers, through 5 steps which magnify the outcome.
arXiv Detail & Related papers (2020-05-31T18:10:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.