Related papers: Mitigating Bias in Machine Learning Models for Phishing Webpage Detection

URL: http://arxiv.org/abs/2401.08363v1
Date: Tue, 16 Jan 2024 13:45:54 GMT
Title: Mitigating Bias in Machine Learning Models for Phishing Webpage Detection
Authors: Aditya Kulkarni, Vivek Balachandran, Dinil Mon Divakaran, Tamal Das
Abstract summary: Phishing, a well-known cyberattack, revolves around the creation of phishing webpages and the dissemination of corresponding URLs. Various techniques are available for preemptively categorizing zero-day phishing URLs by distilling unique attributes and constructing predictive models. This proposal delves into persistent challenges within phishing detection solutions, particularly concentrated on the preliminary phase of assembling comprehensive datasets. We propose a potential solution in the form of a tool engineered to alleviate bias in ML models.
Score: 0.8050163120218178
License: http://creativecommons.org/publicdomain/zero/1.0/
Abstract: The widespread accessibility of the Internet has led to a surge in online fraudulent activities, underscoring the necessity of shielding users' sensitive information from cybercriminals. Phishing, a well-known cyberattack, revolves around the creation of phishing webpages and the dissemination of corresponding URLs, aiming to deceive users into sharing their sensitive information, often for identity theft or financial gain. Various techniques are available for preemptively categorizing zero-day phishing URLs by distilling unique attributes and constructing predictive models. However, these existing techniques encounter unresolved issues. This proposal delves into persistent challenges within phishing detection solutions, particularly concentrated on the preliminary phase of assembling comprehensive datasets, and proposes a potential solution in the form of a tool engineered to alleviate bias in ML models. Such a tool can generate phishing webpages for any given set of legitimate URLs, infusing randomly selected content and visual-based phishing features. Furthermore, we contend that the tool holds the potential to assess the efficacy of existing phishing detection solutions, especially those trained on confined datasets.

Related papers

EXPLICATE: Enhancing Phishing Detection through Explainable AI and LLM-Powered Interpretability [44.2907457629342]
EXPLICATE is a framework that enhances phishing detection through a three-component architecture. It is on par with existing deep learning techniques but has better explainability. It addresses the critical divide between automated AI and user trust in phishing detection systems.
arXiv Detail & Related papers (2025-03-22T23:37:35Z)
Web Phishing Net (WPN): A scalable machine learning approach for real-time phishing campaign detection [0.0]
Phishing is the most prevalent type of cyber-attack today and is recognized as the leading source of data breaches. In this paper, we propose an unsupervised learning approach that is both fast and scalable. It is able to detect entire campaigns at a time with a high detection rate while preserving user privacy.
arXiv Detail & Related papers (2025-02-17T15:06:56Z)
PhishIntel: Toward Practical Deployment of Reference-Based Phishing Detection [33.98293686647553]
PhishIntel is an end-to-end phishing detection system for real-world deployment. It segments the detection process into two distinct tasks: a fast task that checks against local blacklists and a result cache, and a slow task that conducts online blacklist verification, URL crawling, and webpage analysis. This fast-slow task architecture ensures low response latency while retaining the robust detection capabilities of reference-based phishing detectors.
arXiv Detail & Related papers (2024-12-12T08:33:39Z)
Adapting to Cyber Threats: A Phishing Evolution Network (PEN) Framework for Phishing Generation and Analyzing Evolution Patterns using Large Language Models [10.58220151364159]
Phishing remains a pervasive cyber threat, as attackers craft deceptive emails to lure victims into revealing sensitive information. While Artificial Intelligence (AI) has become a key component in defending against phishing attacks, these approaches face critical limitations. We propose the Phishing Evolution Network (PEN), a framework leveraging large language models (LLMs) and adversarial training mechanisms to continuously generate high-quality, realistic, and diverse phishing samples.
arXiv Detail & Related papers (2024-11-18T09:03:51Z)
Countering Autonomous Cyber Threats [40.00865970939829]
Foundation Models present dual-use concerns broadly and within the cyber domain specifically. Recent research has shown the potential for these advanced models to inform or independently execute offensive cyberspace operations. This work evaluates several state-of-the-art FMs on their ability to compromise machines in an isolated network and investigates defensive mechanisms to defeat such AI-powered attacks.
arXiv Detail & Related papers (2024-10-23T22:46:44Z)
On the Weaknesses of Backdoor-based Model Watermarking: An Information-theoretic Perspective [39.676548104635096]
Safeguarding the intellectual property of machine learning models has emerged as a pressing concern in AI security. Model watermarking is a powerful technique for protecting ownership of machine learning models. We propose a novel model watermarking scheme, In-distribution Watermark Embedding (IWE), to overcome the limitations of existing methods.
arXiv Detail & Related papers (2024-09-10T00:55:21Z)
Automated Phishing Detection Using URLs and Webpages [35.66275851732625]
This project addresses the constraints of traditional reference-based phishing detection by developing an LLM agent framework. This agent harnesses Large Language Models to actively fetch and utilize online information. Our approach achieves an accuracy of 0.945, outperforming the existing solution (DynaPhish) by 0.445.
arXiv Detail & Related papers (2024-08-03T05:08:27Z)
From ML to LLM: Evaluating the Robustness of Phishing Webpage Detection Models against Adversarial Attacks [0.8050163120218178]
Phishing attacks attempt to deceive users in order to steal their sensitive information. Current phishing webpage detection solutions are vulnerable to adversarial attacks. We develop a tool that generates adversarial phishing webpages by embedding diverse phishing features into legitimate webpages.
arXiv Detail & Related papers (2024-07-29T18:21:34Z)
Evolving from Single-modal to Multi-modal Facial Deepfake Detection: A Survey [40.11614155244292]
As AI-generated media become more realistic, the risk of misuse to spread misinformation and commit identity fraud increases. This work traces the evolution from traditional single-modality methods to sophisticated multi-modal approaches that handle audio-visual and text-visual scenarios. To our knowledge, this is the first survey of its kind.
arXiv Detail & Related papers (2024-06-11T05:48:04Z)
PhishGuard: A Convolutional Neural Network Based Model for Detecting Phishing URLs with Explainability Analysis [1.102674168371806]
Phishing URL identification is the best way to address the problem. Various machine learning and deep learning methods have been proposed to automate the detection of phishing URLs. We propose a 1D Convolutional Neural Network (CNN) and train the model with extensive features and a substantial amount of data.
arXiv Detail & Related papers (2024-04-27T17:13:49Z)
A Sophisticated Framework for the Accurate Detection of Phishing Websites [0.0]
Phishing is an increasingly sophisticated form of cyberattack that is inflicting huge financial damage on corporations throughout the globe. This paper proposes a comprehensive methodology for detecting phishing websites. A combination of feature selection, a greedy algorithm, cross-validation, and deep learning methods has been used to construct a sophisticated stacking ensemble.
arXiv Detail & Related papers (2024-03-13T14:26:25Z)
Crafter: Facial Feature Crafting against Inversion-based Identity Theft on Deep Models [45.398313126020284]
A typical application is to run machine learning services on facial images collected from different individuals. To prevent identity theft, conventional methods rely on an adversarial game-based approach to shed the identity information from the feature. We propose Crafter, a feature crafting mechanism deployed at the edge, to protect the identity information from adaptive model attacks.
arXiv Detail & Related papers (2024-01-14T05:06:42Z)
A new weighted ensemble model for phishing detection based on feature selection [0.0]
Phishing website identification can help visitors avoid becoming victims of these attacks. We propose an ensemble model that combines multiple base models with a weight-based voting technique.
arXiv Detail & Related papers (2022-12-15T23:15:36Z)
Deep convolutional forest: a dynamic deep ensemble approach for spam detection in text [219.15486286590016]
This paper introduces a dynamic deep ensemble model for spam detection that adjusts its complexity and extracts features automatically. As a result, the model achieved high precision, recall, and F1-score, with an accuracy of 98.38%.
arXiv Detail & Related papers (2021-10-10T17:19:37Z)
Phishing and Spear Phishing: examples in Cyber Espionage and techniques to protect against them [91.3755431537592]
Phishing attacks have become the most used technique in online scams, initiating more than 91% of cyberattacks from 2012 onwards. This study reviews how phishers carry out Phishing and Spear Phishing attacks through 5 steps which magnify the outcome.
arXiv Detail & Related papers (2020-05-31T18:10:09Z)
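
Illustrative note: the weight-based voting mentioned in the weighted ensemble entry above can be sketched roughly as follows. This is a minimal sketch of generic weighted soft voting, not the method from that paper. The function names, the example probabilities, the weights, and the 0.5 threshold are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def weighted_score(probs: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average of per-model phishing probabilities.

    probs[i] is the (hypothetical) phishing probability from base model i,
    weights[i] is the non-negative weight assigned to that model.
    """
    if not probs or len(probs) != len(weights):
        raise ValueError("probs and weights must be non-empty and equal in length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return sum(p * w for p, w in zip(probs, weights)) / total


def weighted_vote(probs: Sequence[float], weights: Sequence[float],
                  threshold: float = 0.5) -> bool:
    """Label a page as phishing when the weighted score reaches the threshold."""
    return weighted_score(probs, weights) >= threshold


if __name__ == "__main__":
    # Three hypothetical base models; the first is trusted twice as much.
    probs = [0.9, 0.2, 0.6]
    weights = [2.0, 1.0, 1.0]
    print(weighted_score(probs, weights), weighted_vote(probs, weights))
```

In practice the weights would typically come from each base model's validation performance; how they are derived here is left open on purpose.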

This list is automatically generated from the titles and abstracts of the papers on this site.