Efficient Phishing URL Detection Using Graph-based Machine Learning and Loopy Belief Propagation
- URL: http://arxiv.org/abs/2501.06912v1
- Date: Sun, 12 Jan 2025 19:49:00 GMT
- Title: Efficient Phishing URL Detection Using Graph-based Machine Learning and Loopy Belief Propagation
- Authors: Wenye Guo, Qun Wang, Hao Yue, Haijian Sun, Rose Qingyang Hu,
- Abstract summary: We propose a graph-based machine learning model for phishing URL detection.
We integrate URL structure and network-level features such as IP addresses and authoritative name servers.
Experiments on real-world datasets demonstrate our model's effectiveness by achieving F1 score of up to 98.77%.
- Score: 12.89058029173131
- License:
- Abstract: The proliferation of mobile devices and online interactions have been threatened by different cyberattacks, where phishing attacks and malicious Uniform Resource Locators (URLs) pose significant risks to user security. Traditional phishing URL detection methods primarily rely on URL string-based features, which attackers often manipulate to evade detection. To address these limitations, we propose a novel graph-based machine learning model for phishing URL detection, integrating both URL structure and network-level features such as IP addresses and authoritative name servers. Our approach leverages Loopy Belief Propagation (LBP) with an enhanced convergence strategy to enable effective message passing and stable classification in the presence of complex graph structures. Additionally, we introduce a refined edge potential mechanism that dynamically adapts based on entity similarity and label relationships to further improve classification accuracy. Comprehensive experiments on real-world datasets demonstrate our model's effectiveness by achieving F1 score of up to 98.77\%. This robust and reproducible method advances phishing detection capabilities, offering enhanced reliability and valuable insights in the field of cybersecurity.
Related papers
- A New Dataset and Methodology for Malicious URL Classification [2.835223467109843]
Malicious URL (Uniform Resource Locator) classification is a pivotal aspect of Cybersecurity, offering defense against web-based threats.
Despite deep learning's promise in this area, its advancement is hindered by two main challenges: the scarcity of comprehensive, open-source datasets and the limitations of existing models.
We introduce a novel, multi-class dataset for malicious URL classification, distinguishing between benign, phishing and malicious URLs, named DeepURLBench.
arXiv Detail & Related papers (2024-12-31T09:10:38Z) - PhishNet: A Phishing Website Detection Tool using XGBoost [1.777434178384403]
PhisNet is a cutting-edge web application designed to detect phishing websites using advanced machine learning.
It aims to help individuals and organizations identify and prevent phishing attacks through a robust AI framework.
arXiv Detail & Related papers (2024-06-29T21:31:13Z) - PhishGuard: A Convolutional Neural Network Based Model for Detecting Phishing URLs with Explainability Analysis [1.102674168371806]
Phishing URL identification is the best way to address the problem.
Various machine learning and deep learning methods have been proposed to automate the detection of phishing URLs.
We propose a 1D Convolutional Neural Network (CNN) and trained the model with extensive features and a substantial amount of data.
arXiv Detail & Related papers (2024-04-27T17:13:49Z) - Synthetic-To-Real Video Person Re-ID [57.937189569211505]
Person re-identification (Re-ID) is an important task and has significant applications for public security and information forensics.
We investigate a novel and challenging setting of Re-ID, i.e., cross-domain video-based person Re-ID.
We utilize synthetic video datasets as the source domain for training and real-world videos for testing.
arXiv Detail & Related papers (2024-02-03T10:19:21Z) - AntiPhishStack: LSTM-based Stacked Generalization Model for Optimized
Phishing URL Detection [0.32141666878560626]
This paper introduces a two-phase stack generalized model named AntiPhishStack, designed to detect phishing sites.
The model leverages the learning of URLs and character-level TF-IDF features symmetrically, enhancing its ability to combat emerging phishing threats.
Experimental validation on two benchmark datasets, comprising benign and phishing or malicious URLs, demonstrates the model's exceptional performance, achieving a notable 96.04% accuracy compared to existing studies.
arXiv Detail & Related papers (2024-01-17T03:44:27Z) - FLTracer: Accurate Poisoning Attack Provenance in Federated Learning [38.47921452675418]
Federated Learning (FL) is a promising distributed learning approach that enables multiple clients to collaboratively train a shared global model.
Recent studies show that FL is vulnerable to various poisoning attacks, which can degrade the performance of global models or introduce backdoors into them.
We propose FLTracer, the first FL attack framework to accurately detect various attacks and trace the attack time, objective, type, and poisoned location of updates.
arXiv Detail & Related papers (2023-10-20T11:24:38Z) - Streamlining Attack Tree Generation: A Fragment-Based Approach [39.157069600312774]
We present a novel fragment-based attack graph generation approach that utilizes information from publicly available information security databases.
We also propose a domain-specific language for attack modeling, which we employ in the proposed attack graph generation approach.
arXiv Detail & Related papers (2023-10-01T12:41:38Z) - Adaptive Attack Detection in Text Classification: Leveraging Space Exploration Features for Text Sentiment Classification [44.99833362998488]
Adversarial example detection plays a vital role in adaptive cyber defense, especially in the face of rapidly evolving attacks.
We propose a novel approach that leverages the power of BERT (Bidirectional Representations from Transformers) and introduces the concept of Space Exploration Features.
arXiv Detail & Related papers (2023-08-29T23:02:26Z) - An Adversarial Attack Analysis on Malicious Advertisement URL Detection
Framework [22.259444589459513]
Malicious advertisement URLs pose a security risk since they are the source of cyber-attacks.
Existing malicious URL detection techniques are limited and to handle unseen features as well as generalize to test data.
In this study, we extract a novel set of lexical and web-scrapped features and employ machine learning technique to set up system for fraudulent advertisement URLs detection.
arXiv Detail & Related papers (2022-04-27T20:06:22Z) - Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose the adaptive feature alignment (AFA) to generate features of arbitrary attacking strengths.
Our method is trained to automatically align features of arbitrary attacking strength.
arXiv Detail & Related papers (2021-05-31T17:01:05Z) - An Automated, End-to-End Framework for Modeling Attacks From
Vulnerability Descriptions [46.40410084504383]
In order to derive a relevant attack graph, up-to-date information on known attack techniques should be represented as interaction rules.
We present a novel, end-to-end, automated framework for modeling new attack techniques from textual description of a security vulnerability.
arXiv Detail & Related papers (2020-08-10T19:27:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.