Phishing URL Detection Through Top-level Domain Analysis: A Descriptive
Approach
- URL: http://arxiv.org/abs/2005.06599v1
- Date: Wed, 13 May 2020 21:41:29 GMT
- Title: Phishing URL Detection Through Top-level Domain Analysis: A Descriptive
Approach
- Authors: Orestis Christou and Nikolaos Pitropakis and Pavlos Papadopoulos and
Sean McKeown and William J. Buchanan
- Abstract summary: This study aims to develop a machine-learning model to detect fraudulent URLs which can be used within the Splunk platform.
Inspired by similar approaches in the literature, we trained the SVM and Random Forests algorithms using malicious and benign datasets.
We evaluated the algorithms' performance with precision and recall, reaching up to 85% precision and 87% recall in the case of Random Forests.
- Score: 3.494620587853103
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Phishing is considered to be one of the most prevalent cyber-attacks because
of its immense flexibility and alarmingly high success rate. Even with adequate
training and high situational awareness, it can still be hard for users to
continually be aware of the URL of the website they are visiting. Traditional
detection methods rely on blocklists and content analysis, both of which
require time-consuming human verification. Thus, there have been attempts
focusing on the predictive filtering of such URLs. This study aims to develop a
machine-learning model to detect fraudulent URLs which can be used within the
Splunk platform. Inspired by similar approaches in the literature, we trained
the SVM and Random Forests algorithms using malicious and benign datasets found
in the literature and one dataset that we created. We evaluated the algorithms'
performance with precision and recall, reaching up to 85% precision and 87%
recall in the case of Random Forests while SVM achieved up to 90% precision and
88% recall using only descriptive features.
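The abstract mentions "descriptive features" and evaluation by precision and recall without listing the features themselves. The following is a minimal sketch of what such a pipeline stage might look like, using only the Python standard library. The feature names in `describe_url` and the helper `precision_recall` are hypothetical illustrations, not the authors' feature set or code, and the top-level domain is taken naively as the last host label (multi-part suffixes such as `co.uk` and IP-address hosts are not handled).

```python
from urllib.parse import urlsplit


def describe_url(url: str) -> dict:
    """Return a few descriptive (lexical) features of a URL.

    Assumption: this feature set is illustrative only; the paper's actual
    features are not given in the abstract.
    """
    # urlsplit only finds a hostname when a scheme is present, so add one.
    parts = urlsplit(url if "://" in url else "http://" + url)
    host = parts.hostname or ""
    labels = host.split(".") if host else []
    return {
        "url_length": len(url),
        "host_length": len(host),
        # Labels to the left of "domain.tld" (naive: last two labels).
        "num_subdomain_labels": max(len(labels) - 2, 0),
        # Naive TLD: the last dot-separated label of the host.
        "tld": labels[-1] if len(labels) > 1 else "",
        "num_digits": sum(ch.isdigit() for ch in url),
        "num_hyphens_in_host": host.count("-"),
        "has_at_symbol": "@" in url,
    }


def precision_recall(y_true, y_pred, positive=1):
    """Compute precision and recall for the `positive` (phishing) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


if __name__ == "__main__":
    # Made-up URL and labels, purely for demonstration.
    print(describe_url("http://login.example-bank.xyz/verify?id=123"))
    print(precision_recall([1, 1, 0, 0, 1], [1, 0, 1, 0, 1]))
```

Feature dictionaries like these would then be encoded numerically (for example, one-hot encoding the `tld` field) before being passed to a classifier such as an SVM or Random Forest.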
Related papers
- Unlearn and Burn: Adversarial Machine Unlearning Requests Destroy Model Accuracy [65.80757820884476]
We expose a critical yet underexplored vulnerability in the deployment of unlearning systems.
We present a threat model where an attacker can degrade model accuracy by submitting adversarial unlearning requests for data not present in the training set.
We evaluate various verification mechanisms to detect the legitimacy of unlearning requests and reveal the challenges in verification.
arXiv Detail & Related papers (2024-10-12T16:47:04Z)
- Towards Robust IoT Defense: Comparative Statistics of Attack Detection in Resource-Constrained Scenarios [1.3812010983144802]
Resource constraints pose a significant cybersecurity threat to IoT smart devices.
We conduct an extensive statistical analysis of cyberattack detection algorithms under resource constraints to identify the most efficient one.
arXiv Detail & Related papers (2024-10-10T10:58:03Z)
- Hybrid Machine Learning Approach For Real-Time Malicious Url Detection Using Som-Rmo And Rbfn With Tabu Search Optimization [0.0]
The proliferation of malicious URLs has become a significant threat to internet security.
Traditional detection methods struggle to keep pace with the evolving nature of these threats.
We propose a hybrid machine learning approach combining efficient feature extraction with accurate classification.
arXiv Detail & Related papers (2024-07-05T07:24:49Z)
- PhishSim: Aiding Phishing Website Detection with a Feature-Free Tool [12.468922937529966]
We propose a feature-free method for detecting phishing websites using the Normalized Compression Distance (NCD).
This measure computes the similarity of two websites by compressing them, thus eliminating the need to perform any feature extraction.
We use the Furthest Point First algorithm to perform phishing prototype extractions, in order to select instances that are representative of a cluster of phishing webpages.
arXiv Detail & Related papers (2022-07-13T20:44:03Z)
- An Adversarial Attack Analysis on Malicious Advertisement URL Detection Framework [22.259444589459513]
Malicious advertisement URLs pose a security risk since they are the source of cyber-attacks.
Existing malicious URL detection techniques are limited in their ability to handle unseen features and to generalize to test data.
In this study, we extract a novel set of lexical and web-scraped features and employ machine learning techniques to set up a system for fraudulent advertisement URL detection.
arXiv Detail & Related papers (2022-04-27T20:06:22Z)
- UNBUS: Uncertainty-aware Deep Botnet Detection System in Presence of Perturbed Samples [1.2691047660244335]
Botnet detection requires extremely low false-positive rates (FPR), which are not commonly attainable in contemporary deep learning.
In this paper, two LSTM-based classification algorithms for botnet classification with an accuracy higher than 98% are presented.
arXiv Detail & Related papers (2022-04-18T21:49:14Z)
- Deep convolutional forest: a dynamic deep ensemble approach for spam detection in text [219.15486286590016]
This paper introduces a dynamic deep ensemble model for spam detection that adjusts its complexity and extracts features automatically.
As a result, the model achieved high precision, recall, f1-score and accuracy of 98.38%.
arXiv Detail & Related papers (2021-10-10T17:19:37Z)
- Spotting adversarial samples for speaker verification by neural vocoders [102.1486475058963]
We adopt neural vocoders to spot adversarial samples for automatic speaker verification (ASV).
We find that the difference between the ASV scores for the original and re-synthesized audio is a good indicator for discriminating between genuine and adversarial samples.
Our code will be made open-source so that future work can compare against it.
arXiv Detail & Related papers (2021-07-01T08:58:16Z)
- Bayesian Optimization with Machine Learning Algorithms Towards Anomaly Detection [66.05992706105224]
In this paper, an effective anomaly detection framework is proposed utilizing the Bayesian Optimization technique.
The performance of the considered algorithms is evaluated using the ISCX 2012 dataset.
Experimental results show the effectiveness of the proposed framework in terms of accuracy, precision, low false-alarm rate, and recall.
arXiv Detail & Related papers (2020-08-05T19:29:35Z)
- UC-Net: Uncertainty Inspired RGB-D Saliency Detection via Conditional Variational Autoencoders [81.5490760424213]
We propose the first framework (UCNet) to employ uncertainty for RGB-D saliency detection by learning from the data labeling process.
Inspired by the saliency data labeling process, we propose a probabilistic RGB-D saliency detection network.
arXiv Detail & Related papers (2020-04-13T04:12:59Z)
- Uncertainty Estimation Using a Single Deep Deterministic Neural Network [66.26231423824089]
We propose a method for training a deterministic deep model that can find and reject out of distribution data points at test time with a single forward pass.
We scale training in these with a novel loss function and centroid updating scheme and match the accuracy of softmax models.
arXiv Detail & Related papers (2020-03-04T12:27:36Z)