DNS Typo-squatting Domain Detection: A Data Analytics & Machine Learning
Based Approach
- URL: http://arxiv.org/abs/2012.13604v1
- Date: Fri, 25 Dec 2020 16:51:30 GMT
- Title: DNS Typo-squatting Domain Detection: A Data Analytics & Machine Learning
Based Approach
- Authors: Abdallah Moubayed, MohammadNoor Injadat, Abdallah Shami, Hanan
Lutfiyya
- Abstract summary: Domain Name System (DNS) is a crucial component of current IP-based networks as it is the standard mechanism for name to IP resolution.
Detecting typosquatting attacks is particularly important, as they can threaten corporate secrets and be used to steal information or commit fraud.
In this paper, a machine learning-based approach is proposed to tackle the typosquatting vulnerability.
- Score: 9.006364242523249
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Domain Name System (DNS) is a crucial component of current IP-based networks
as it is the standard mechanism for name to IP resolution. However, due to its
lack of data integrity and origin authentication processes, it is vulnerable to
a variety of attacks. One such attack is Typosquatting. Detecting this attack
is particularly important as it can be a threat to corporate secrets and can be
used to steal information or commit fraud. In this paper, a machine
learning-based approach is proposed to tackle the typosquatting vulnerability.
To that end, exploratory data analytics is first used to better understand the
trends observed in eight domain name-based extracted features. Furthermore, a
majority voting-based ensemble learning classifier built using five
classification algorithms is proposed that can detect suspicious domains with
high accuracy. Moreover, the observed trends are validated by studying the same
features in an unlabeled dataset using the K-means clustering algorithm and by
applying the developed ensemble learning classifier. Results show that
legitimate domains have a smaller domain name length and fewer unique
characters. Moreover, the developed ensemble learning classifier performs
better in terms of accuracy, precision, and F-score. Furthermore, it is shown
that similar trends are observed when clustering is used. However, the number
of domains identified as potentially suspicious is high. Hence, the ensemble
learning classifier is applied with results showing that the number of domains
identified as potentially suspicious is reduced by almost a factor of five
while still maintaining the same trends in terms of features' statistics.
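The pipeline the abstract describes (name-based features fed to a majority-voting ensemble of five classifiers) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract names only two of the eight features (domain name length and number of unique characters) and does not name the five algorithms, so the five threshold rules in `VOTERS`, their cut-off values, and the example domains are all hypothetical placeholders.

```python
from collections import Counter


def extract_features(domain: str) -> dict:
    """Compute two illustrative name-based features.

    The paper uses eight features; only these two are named in the abstract.
    """
    name = domain.lower().rstrip(".")
    return {"length": len(name), "unique_chars": len(set(name))}


def majority_vote(predictions: list) -> int:
    """Hard majority vote over binary labels (1 = suspicious, 0 = legitimate).

    With an odd number of voters and two classes, a tie cannot occur.
    """
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical stand-ins for five trained classifiers. Each is a simple
# threshold rule with an arbitrary cut-off, chosen only for illustration.
VOTERS = [
    lambda f: int(f["length"] > 15),
    lambda f: int(f["length"] > 20),
    lambda f: int(f["unique_chars"] > 10),
    lambda f: int(f["unique_chars"] > 12),
    lambda f: int(f["length"] > 15 and f["unique_chars"] > 10),
]


def classify(domain: str) -> int:
    """Label a domain by majority vote of the five placeholder voters."""
    features = extract_features(domain)
    return majority_vote([vote(features) for vote in VOTERS])


if __name__ == "__main__":
    for d in ["example.com", "secure-login-examp1e-account.com"]:
        print(d, extract_features(d), classify(d))
```

In practice each entry in `VOTERS` would be a fitted model's prediction function; the voting step itself is unchanged. The rules above merely encode the reported trend that legitimate domains tend to be shorter and use fewer unique characters.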
Related papers
- The importance of the clustering model to detect new types of intrusion in data traffic [0.0]
The presented work uses the K-means algorithm, a popular clustering technique.
Data was gathered using the Kali Linux environment, CICFlowMeter traffic, and PuTTY software tools.
The model counted the attacks and assigned numbers to each one of them.
arXiv Detail & Related papers (2024-11-21T19:40:31Z)
- Model Evaluation for Domain Identification of Unknown Classes in Open-World Recognition: A Proposal [0.0]
Open-World Recognition (OWR) is an emerging field that makes a machine learning model competent in rejecting the unknowns.
In this study, we propose an evaluation protocol for estimating a model's capability in separating unknown in-domain (ID) and unknown out-of-domain (OOD)
We experimented with five different domains: garbage, food, dogs, plants, and birds.
arXiv Detail & Related papers (2023-12-09T03:54:25Z)
- Low-confidence Samples Matter for Domain Adaptation [47.552605279925736]
Domain adaptation (DA) aims to transfer knowledge from a label-rich source domain to a related but label-scarce target domain.
We propose a novel contrastive learning method by processing low-confidence samples.
We evaluate the proposed method in both unsupervised and semi-supervised DA settings.
arXiv Detail & Related papers (2022-02-06T15:45:45Z)
- Attentive Prototypes for Source-free Unsupervised Domain Adaptive 3D Object Detection [85.11649974840758]
3D object detection networks tend to be biased towards the data they are trained on.
We propose a single-frame approach for source-free, unsupervised domain adaptation of lidar-based 3D object detectors.
arXiv Detail & Related papers (2021-11-30T18:42:42Z)
- Domain-Class Correlation Decomposition for Generalizable Person Re-Identification [34.813965300584776]
In person re-identification, the domain and class are correlated.
We show that domain adversarial learning will lose certain information about class due to this domain-class correlation.
Our model outperforms the state-of-the-art methods on the large-scale domain generalization Re-ID benchmark.
arXiv Detail & Related papers (2021-06-29T09:45:03Z)
- Contrastive Learning and Self-Training for Unsupervised Domain Adaptation in Semantic Segmentation [71.77083272602525]
UDA attempts to provide efficient knowledge transfer from a labeled source domain to an unlabeled target domain.
We propose a contrastive learning approach that adapts category-wise centroids across domains.
We extend our method with self-training, where we use a memory-efficient temporal ensemble to generate consistent and reliable pseudo-labels.
arXiv Detail & Related papers (2021-05-05T11:55:53Z)
- Instance Level Affinity-Based Transfer for Unsupervised Domain Adaptation [74.71931918541748]
We propose an instance affinity based criterion for source to target transfer during adaptation, called ILA-DA.
We first propose a reliable and efficient method to extract similar and dissimilar samples across source and target, and utilize a multi-sample contrastive loss to drive the domain alignment process.
We verify the effectiveness of ILA-DA by observing consistent improvements in accuracy over popular domain adaptation approaches on a variety of benchmark datasets.
arXiv Detail & Related papers (2021-04-03T01:33:14Z)
- Inferring Latent Domains for Unsupervised Deep Domain Adaptation [54.963823285456925]
Unsupervised Domain Adaptation (UDA) refers to the problem of learning a model in a target domain where labeled data are not available.
This paper introduces a novel deep architecture which addresses the problem of UDA by automatically discovering latent domains in visual datasets.
We evaluate our approach on publicly available benchmarks, showing that it outperforms state-of-the-art domain adaptation methods.
arXiv Detail & Related papers (2021-03-25T14:33:33Z)
- Ensemble-based Feature Selection and Classification Model for DNS Typo-squatting Detection [5.785697934050654]
Typo-squatting refers to the registration of a domain name that is extremely similar to that of an existing popular brand.
This paper proposes an ensemble-based feature selection and bagging classification model to detect DNS typo-squatting attacks.
arXiv Detail & Related papers (2020-06-08T14:07:19Z)
- Improving Domain-Adapted Sentiment Classification by Deep Adversarial Mutual Learning [51.742040588834996]
Domain-adapted sentiment classification refers to training on a labeled source domain in order to accurately infer document-level sentiment on an unlabeled target domain.
We propose a novel deep adversarial mutual learning approach involving two groups of feature extractors, domain discriminators, sentiment classifiers, and label probers.
arXiv Detail & Related papers (2020-02-01T01:22:44Z)