Efficient Phishing URL Detection Using Graph-based Machine Learning and Loopy Belief Propagation
- URL: http://arxiv.org/abs/2501.06912v1
- Date: Sun, 12 Jan 2025 19:49:00 GMT
- Title: Efficient Phishing URL Detection Using Graph-based Machine Learning and Loopy Belief Propagation
- Authors: Wenye Guo, Qun Wang, Hao Yue, Haijian Sun, Rose Qingyang Hu
- Abstract summary: We propose a graph-based machine learning model for phishing URL detection. We integrate URL structure and network-level features such as IP addresses and authoritative name servers. Experiments on real-world datasets demonstrate our model's effectiveness by achieving an F1 score of up to 98.77%.
- Score: 12.89058029173131
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The proliferation of mobile devices and online interactions has been threatened by various cyberattacks, among which phishing attacks and malicious Uniform Resource Locators (URLs) pose significant risks to user security. Traditional phishing URL detection methods primarily rely on URL string-based features, which attackers often manipulate to evade detection. To address these limitations, we propose a novel graph-based machine learning model for phishing URL detection, integrating both URL structure and network-level features such as IP addresses and authoritative name servers. Our approach leverages Loopy Belief Propagation (LBP) with an enhanced convergence strategy to enable effective message passing and stable classification in the presence of complex graph structures. Additionally, we introduce a refined edge potential mechanism that dynamically adapts based on entity similarity and label relationships to further improve classification accuracy. Comprehensive experiments on real-world datasets demonstrate our model's effectiveness by achieving an F1 score of up to 98.77%. This robust and reproducible method advances phishing detection capabilities, offering enhanced reliability and valuable insights in the field of cybersecurity.
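The abstract does not give the update equations, so the following is only a minimal sketch of standard sum-product Loopy Belief Propagation on a small graph of URLs, IP addresses, and name servers. It is not the authors' implementation. The fixed homophily potential (`eps`) and the message damping (`damping`) are assumptions standing in for the paper's refined edge potential and enhanced convergence strategy, and all node names, priors, and the function `loopy_bp` are hypothetical.

```python
from collections import defaultdict


def loopy_bp(edges, priors, eps=0.2, damping=0.5, max_iters=100, tol=1e-6):
    """Return {node: P(phishing)} from sum-product LBP on a binary pairwise MRF."""
    # Assumed homophily edge potential: neighbours tend to share a label.
    psi = [[0.5 + eps, 0.5 - eps], [0.5 - eps, 0.5 + eps]]
    nbrs = defaultdict(set)
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    # Node potential [P(benign), P(phishing)]; unlabeled nodes default to 0.5.
    phi = {n: [1.0 - priors.get(n, 0.5), priors.get(n, 0.5)] for n in nbrs}
    # msgs[(i, j)] is the message from node i to node j, initialised uniform.
    msgs = {(i, j): [0.5, 0.5] for i in nbrs for j in nbrs[i]}

    for _ in range(max_iters):
        new_msgs = {}
        max_change = 0.0
        for (i, j), old in msgs.items():
            out = [0.0, 0.0]
            for xi in (0, 1):
                # Prior of i times all incoming messages except the one from j.
                incoming = phi[i][xi]
                for k in nbrs[i]:
                    if k != j:
                        incoming *= msgs[(k, i)][xi]
                for xj in (0, 1):
                    out[xj] += incoming * psi[xi][xj]
            total = out[0] + out[1]
            out = [o / total for o in out]
            # Damping (an assumption here) blends old and new messages,
            # which often helps convergence on graphs with cycles.
            out = [damping * a + (1.0 - damping) * b for a, b in zip(old, out)]
            max_change = max(max_change, abs(out[1] - old[1]))
            new_msgs[(i, j)] = out
        msgs = new_msgs  # synchronous update: all messages replaced at once
        if max_change < tol:
            break

    beliefs = {}
    for n in nbrs:
        b = list(phi[n])
        for k in nbrs[n]:
            b = [b[x] * msgs[(k, n)][x] for x in (0, 1)]
        beliefs[n] = b[1] / (b[0] + b[1])
    return beliefs


# Hypothetical toy graph: url:a and url:b share an IP and a name server
# (forming a cycle); url:c and url:d share a different IP.
edges = [
    ("url:a", "ip:1"), ("url:b", "ip:1"),
    ("url:a", "ns:x"), ("url:b", "ns:x"),
    ("url:c", "ip:2"), ("url:d", "ip:2"),
]
priors = {"url:a": 0.99, "url:c": 0.01}  # known phishing / known benign

beliefs = loopy_bp(edges, priors)
for node in sorted(beliefs):
    print(f"{node}: P(phishing) = {beliefs[node]:.3f}")
```

In this toy setup the unlabeled `url:b` inherits suspicion from `url:a` through the shared IP address and name server, while `url:d` leans benign because it shares infrastructure only with `url:c`.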
Related papers
- Phishing URL Detection using Bi-LSTM [0.0]
This paper proposes a deep learning-based approach to classify URLs into four categories: benign, phishing, defacement, and malware.
Experimental results on a dataset comprising over 650,000 URLs demonstrate the model's effectiveness, achieving 97% accuracy and significant improvements over traditional techniques.
arXiv Detail & Related papers (2025-04-29T00:55:01Z)
- EXPLICATE: Enhancing Phishing Detection through Explainable AI and LLM-Powered Interpretability [44.2907457629342]
EXPLICATE is a framework that enhances phishing detection through a three-component architecture.
It is on par with existing deep learning techniques but has better explainability.
It addresses the critical divide between automated AI and user trust in phishing detection systems.
arXiv Detail & Related papers (2025-03-22T23:37:35Z)
- Lie Detector: Unified Backdoor Detection via Cross-Examination Framework [68.45399098884364]
We propose a unified backdoor detection framework in the semi-honest setting.
Our method achieves superior detection performance, improving accuracy by 5.4%, 1.6%, and 11.9% over SoTA baselines.
Notably, it is the first to effectively detect backdoors in multimodal large language models.
arXiv Detail & Related papers (2025-03-21T06:12:06Z)
- A New Dataset and Methodology for Malicious URL Classification [2.835223467109843]
Malicious URL (Uniform Resource Locator) classification is a pivotal aspect of Cybersecurity, offering defense against web-based threats. Despite deep learning's promise in this area, its advancement is hindered by two main challenges: the scarcity of comprehensive, open-source datasets and the limitations of existing models. We introduce a novel, multi-class dataset for malicious URL classification, distinguishing between benign, phishing and malicious URLs, named DeepURLBench.
arXiv Detail & Related papers (2024-12-31T09:10:38Z)
- Automated Phishing Detection Using URLs and Webpages [35.66275851732625]
This project addresses the constraints of traditional reference-based phishing detection by developing an LLM agent framework.
This agent harnesses Large Language Models to actively fetch and utilize online information.
Our approach achieves an accuracy of 0.945, significantly outperforming the existing solution (DynaPhish) by 0.445.
arXiv Detail & Related papers (2024-08-03T05:08:27Z)
- PhishGuard: A Convolutional Neural Network Based Model for Detecting Phishing URLs with Explainability Analysis [1.102674168371806]
Phishing URL identification is the best way to address the problem.
Various machine learning and deep learning methods have been proposed to automate the detection of phishing URLs.
We propose a 1D Convolutional Neural Network (CNN) and train the model with extensive features and a substantial amount of data.
arXiv Detail & Related papers (2024-04-27T17:13:49Z)
- AntiPhishStack: LSTM-based Stacked Generalization Model for Optimized Phishing URL Detection [0.32141666878560626]
This paper introduces a two-phase stack generalized model named AntiPhishStack, designed to detect phishing sites.
The model leverages the learning of URLs and character-level TF-IDF features symmetrically, enhancing its ability to combat emerging phishing threats.
Experimental validation on two benchmark datasets, comprising benign and phishing or malicious URLs, demonstrates the model's exceptional performance, achieving a notable 96.04% accuracy compared to existing studies.
arXiv Detail & Related papers (2024-01-17T03:44:27Z)
- FLTracer: Accurate Poisoning Attack Provenance in Federated Learning [38.47921452675418]
Federated Learning (FL) is a promising distributed learning approach that enables multiple clients to collaboratively train a shared global model.
Recent studies show that FL is vulnerable to various poisoning attacks, which can degrade the performance of global models or introduce backdoors into them.
We propose FLTracer, the first FL attack framework to accurately detect various attacks and trace the attack time, objective, type, and poisoned location of updates.
arXiv Detail & Related papers (2023-10-20T11:24:38Z)
- Streamlining Attack Tree Generation: A Fragment-Based Approach [39.157069600312774]
We present a novel fragment-based attack graph generation approach that utilizes information from publicly available information security databases.
We also propose a domain-specific language for attack modeling, which we employ in the proposed attack graph generation approach.
arXiv Detail & Related papers (2023-10-01T12:41:38Z)
- Adaptive Attack Detection in Text Classification: Leveraging Space Exploration Features for Text Sentiment Classification [44.99833362998488]
Adversarial example detection plays a vital role in adaptive cyber defense, especially in the face of rapidly evolving attacks.
We propose a novel approach that leverages the power of BERT (Bidirectional Encoder Representations from Transformers) and introduces the concept of Space Exploration Features.
arXiv Detail & Related papers (2023-08-29T23:02:26Z)
- An Adversarial Attack Analysis on Malicious Advertisement URL Detection Framework [22.259444589459513]
Malicious advertisement URLs pose a security risk since they are the source of cyber-attacks.
Existing malicious URL detection techniques are limited in their ability to handle unseen features and to generalize to test data.
In this study, we extract a novel set of lexical and web-scraped features and employ machine learning techniques to build a system for detecting fraudulent advertisement URLs.
arXiv Detail & Related papers (2022-04-27T20:06:22Z)
- Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose the adaptive feature alignment (AFA) to generate features of arbitrary attacking strengths.
Our method is trained to automatically align features of arbitrary attacking strength.
arXiv Detail & Related papers (2021-05-31T17:01:05Z)
- An Automated, End-to-End Framework for Modeling Attacks From Vulnerability Descriptions [46.40410084504383]
In order to derive a relevant attack graph, up-to-date information on known attack techniques should be represented as interaction rules.
We present a novel, end-to-end, automated framework for modeling new attack techniques from textual description of a security vulnerability.
arXiv Detail & Related papers (2020-08-10T19:27:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.