EXPLICATE: Enhancing Phishing Detection through Explainable AI and LLM-Powered Interpretability
- URL: http://arxiv.org/abs/2503.20796v1
- Date: Sat, 22 Mar 2025 23:37:35 GMT
- Title: EXPLICATE: Enhancing Phishing Detection through Explainable AI and LLM-Powered Interpretability
- Authors: Bryan Lim, Roman Huerta, Alejandro Sotelo, Anthonie Quintela, Priyanka Kumar
- Abstract summary: EXPLICATE is a framework that enhances phishing detection through a three-component architecture. It is on par with existing deep learning techniques but has better explainability. It addresses the critical divide between automated AI and user trust in phishing detection systems.
- Score: 44.2907457629342
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Sophisticated phishing attacks have emerged as a major cybersecurity threat, becoming more common and difficult to prevent. Though machine learning techniques have shown promise in detecting phishing attacks, they function mainly as "black boxes" without revealing their decision-making rationale. This lack of transparency erodes user trust and diminishes effective threat response. We present EXPLICATE: a framework that enhances phishing detection through a three-component architecture: an ML-based classifier using domain-specific features, a dual-explanation layer combining LIME and SHAP for complementary feature-level insights, and an LLM enhancement using DeepSeek v3 to translate technical explanations into accessible natural language. Our experiments show that EXPLICATE attains 98.4% accuracy across all metrics, on par with existing deep learning techniques but with better explainability. The framework generates high-quality explanations with 94.2% accuracy and 96.8% consistency between the LLM output and the model prediction. We deliver EXPLICATE as a fully usable GUI application and a lightweight Chrome extension, demonstrating its applicability across deployment scenarios. The research shows that high detection performance can go hand-in-hand with meaningful explainability in security applications. Most importantly, it addresses the critical divide between automated AI and user trust in phishing detection systems.
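As a rough illustration of the three-component architecture described above, the sketch below wires together a feature-based classifier, SHAP and LIME attributions, and the prompt an LLM such as DeepSeek v3 would rewrite into plain language. The feature names, synthetic data, toy labels, and prompt wording are illustrative assumptions, not the paper's actual feature set or implementation.

```python
# Minimal EXPLICATE-style sketch (assumptions: synthetic data, hypothetical
# feature names, a random-forest classifier; not the paper's implementation).
import numpy as np
import shap
from lime.lime_tabular import LimeTabularExplainer
from sklearn.ensemble import RandomForestClassifier

feature_names = ["url_length", "num_subdomains", "has_ip_host",
                 "suspicious_tld", "num_hyperlinks", "urgency_keywords"]

# 1) ML-based classifier on domain-specific features (synthetic stand-in data).
rng = np.random.default_rng(0)
X = rng.random((500, len(feature_names)))
y = (X[:, 0] + X[:, 3] + X[:, 5] > 1.5).astype(int)          # toy label rule
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

x_row = X[:1]                                    # one email/URL to explain
p_phish = clf.predict_proba(x_row)[0, 1]

# 2) Dual explanation layer: SHAP + LIME feature-level attributions.
sv = shap.TreeExplainer(clf).shap_values(x_row)  # return shape varies by SHAP version
if isinstance(sv, list):                         # older SHAP: list of per-class arrays
    shap_contrib = sv[1][0]
else:                                            # newer SHAP: (1, n_features, n_classes)
    shap_contrib = sv[0, :, 1]

lime_exp = LimeTabularExplainer(
    X, feature_names=feature_names, class_names=["legit", "phish"],
    mode="classification",
).explain_instance(x_row[0], clf.predict_proba, num_features=3)

# 3) LLM enhancement: turn technical attributions into an accessible explanation.
top = [(name, round(float(v), 3))
       for name, v in sorted(zip(feature_names, shap_contrib),
                             key=lambda t: -abs(t[1]))[:3]]
prompt = (
    f"A phishing classifier scored this email {p_phish:.0%} likely phishing.\n"
    f"Top SHAP features: {top}\nTop LIME features: {lime_exp.as_list()}\n"
    "Explain to a non-technical user, in two sentences, why it looks suspicious."
)
print(prompt)  # EXPLICATE would send a prompt like this to the LLM layer
```

Note the decoupling the abstract implies: the LLM layer only paraphrases attributions produced upstream, which is why the paper can report a separate consistency figure (96.8%) between LLM output and model prediction.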
Related papers
- A Gradient-Optimized TSK Fuzzy Framework for Explainable Phishing Detection [0.0]
Existing phishing detection methods struggle to simultaneously achieve high accuracy and explainability.
We propose a novel phishing URL detection system based on a first-order Takagi-Sugeno-Kang fuzzy inference model optimized through gradient-based techniques.
arXiv Detail & Related papers (2025-04-25T18:31:05Z)
- MOS: Towards Effective Smart Contract Vulnerability Detection through Mixture-of-Experts Tuning of Large Language Models [16.16186929130931]
Smart contract vulnerabilities pose significant security risks to blockchain systems.
We propose a smart contract vulnerability detection framework based on mixture-of-experts tuning (MOE-Tuning) of large language models.
Experiments show that MOS significantly outperforms existing methods with average improvements of 6.32% in F1 score and 4.80% in accuracy.
arXiv Detail & Related papers (2025-04-16T16:33:53Z)
- Knowledge Transfer from LLMs to Provenance Analysis: A Semantic-Augmented Method for APT Detection [1.2571354974258824]
We propose a new strategy for taking advantage of Large Language Models (LLMs) in provenance-based threat detection. LLMs offer additional details in provenance data interpretation, leveraging their knowledge of system calls, software identity, and high-level understanding of application execution context. In our evaluation, supervised threat detection achieves a precision of 99.0%, and semi-supervised anomaly detection attains a precision of 96.9%.
arXiv Detail & Related papers (2025-03-24T03:51:09Z)
- Reasoning-Augmented Conversation for Multi-Turn Jailbreak Attacks on Large Language Models [53.580928907886324]
Reasoning-Augmented Conversation (RACE) is a novel multi-turn jailbreak framework. It reformulates harmful queries into benign reasoning tasks. We show that RACE achieves state-of-the-art attack effectiveness in complex conversational scenarios.
arXiv Detail & Related papers (2025-02-16T09:27:44Z)
- Towards Copyright Protection for Knowledge Bases of Retrieval-augmented Language Models via Ownership Verification with Reasoning [58.57194301645823]
Large language models (LLMs) are increasingly integrated into real-world applications through retrieval-augmented generation (RAG) mechanisms. Existing methods that can be generalized as watermarking techniques to protect these knowledge bases typically involve poisoning attacks. We propose a method for harmless copyright protection of knowledge bases.
arXiv Detail & Related papers (2025-02-10T09:15:56Z)
- Can LLM Prompting Serve as a Proxy for Static Analysis in Vulnerability Detection [13.403316050809151]
Large language models (LLMs) have shown limited ability on applied tasks such as vulnerability detection. We propose a prompting strategy that integrates natural language descriptions of vulnerabilities with a contrastive chain-of-thought reasoning approach.
arXiv Detail & Related papers (2024-12-16T18:08:14Z)
- Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models [79.0183835295533]
We introduce the first benchmark for indirect prompt injection attacks, named BIPIA, to assess the risk of such vulnerabilities. Our analysis identifies two key factors contributing to their success: LLMs' inability to distinguish between informational context and actionable instructions, and their lack of awareness in avoiding the execution of instructions within external content. We propose two novel defense mechanisms, boundary awareness and explicit reminder, to address these vulnerabilities in both black-box and white-box settings.
arXiv Detail & Related papers (2023-12-21T01:08:39Z)
- How Far Have We Gone in Vulnerability Detection Using Large Language Models [15.09461331135668]
We introduce a comprehensive vulnerability benchmark VulBench.
This benchmark aggregates high-quality data from a wide range of CTF challenges and real-world applications.
We find that several LLMs outperform traditional deep learning approaches in vulnerability detection.
arXiv Detail & Related papers (2023-11-21T08:20:39Z)
- Token-Level Adversarial Prompt Detection Based on Perplexity Measures and Contextual Information [67.78183175605761]
Large Language Models are susceptible to adversarial prompt attacks.
This vulnerability underscores a significant concern regarding the robustness and reliability of LLMs.
We introduce a novel approach to detecting adversarial prompts at a token level (a minimal sketch of the perplexity idea follows this list).
arXiv Detail & Related papers (2023-11-20T03:17:21Z)
- Baseline Defenses for Adversarial Attacks Against Aligned Language Models [109.75753454188705]
Recent work shows that text optimizers can produce jailbreaking prompts that bypass defenses.
We look at three types of defenses: detection (perplexity-based), input preprocessing (paraphrase and retokenization), and adversarial training.
We find that the weakness of existing discrete optimizers for text, combined with the relatively high costs of optimization, makes standard adaptive attacks more challenging for LLMs.
arXiv Detail & Related papers (2023-09-01T17:59:44Z)
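To make the perplexity-based detection idea concrete (it appears both in the Token-Level Adversarial Prompt Detection entry and as a baseline defense in the last entry), here is a minimal sketch. The GPT-2 scoring model, window size, and threshold are illustrative assumptions, not the papers' actual configurations.

```python
# Minimal sketch of token-level perplexity screening for adversarial prompts.
# Assumptions: GPT-2 as the scoring model, an arbitrary window size and threshold.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def token_perplexities(text: str) -> torch.Tensor:
    """Per-token perplexity exp(-log p(token | prefix)) under the scoring model."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # Logits at position i predict token i+1, so shift targets by one.
    log_probs = torch.log_softmax(logits[:, :-1, :], dim=-1)
    targets = ids[:, 1:]
    token_log_p = log_probs.gather(2, targets.unsqueeze(-1)).squeeze(-1)
    return torch.exp(-token_log_p).squeeze(0)

def looks_adversarial(text: str, window: int = 8, threshold: float = 1e4) -> bool:
    """Flag the prompt if any contiguous token window has very high mean perplexity."""
    ppl = token_perplexities(text)
    if len(ppl) < window:
        return bool(ppl.mean() > threshold)
    return bool(ppl.unfold(0, window, 1).mean(dim=1).max() > threshold)

if __name__ == "__main__":
    benign = "Please summarize the attached quarterly report in three bullet points."
    print(looks_adversarial(benign))  # fluent text should stay below the threshold
```

Windowed scoring rather than whole-prompt perplexity is what makes the check token-level: a short gibberish suffix appended to an otherwise fluent prompt still produces one high-perplexity window.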