Inline Detection of DGA Domains Using Side Information
- URL: http://arxiv.org/abs/2003.05703v1
- Date: Thu, 12 Mar 2020 11:00:30 GMT
- Title: Inline Detection of DGA Domains Using Side Information
- Authors: Raaghavi Sivaguru, Jonathan Peck, Femi Olumofin, Anderson Nascimento and Martine De Cock
- Abstract summary: Domain Generation Algorithms (DGAs) are popular methods for generating pseudo-random domain names.
In recent years, machine learning based systems have been widely used to detect DGAs.
We train and evaluate state-of-the-art deep learning and random forest (RF) classifiers for DGA detection using side information that is harder for adversaries to manipulate than the domain name itself.
- Score: 5.253305460558346
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Malware applications typically use a command and control (C&C) server to
manage bots to perform malicious activities. Domain Generation Algorithms
(DGAs) are popular methods for generating pseudo-random domain names that can
be used to establish communication between an infected bot and the C&C
server. In recent years, machine learning based systems have been widely used
to detect DGAs. There are several well known state-of-the-art classifiers in
the literature that can detect DGA domain names in real-time applications with
high predictive performance. However, these DGA classifiers are highly
vulnerable to adversarial attacks in which adversaries purposely craft domain
names to evade DGA detection classifiers. In our work, we focus on hardening
DGA classifiers against adversarial attacks. To this end, we train and evaluate
state-of-the-art deep learning and random forest (RF) classifiers for DGA
detection using side information that is harder for adversaries to manipulate
than the domain name itself. Additionally, the side information features are
selected such that they are easily obtainable in practice to perform inline DGA
detection. The performance and robustness of these models are assessed by
exposing them to one day of real-traffic data as well as domains generated by
adversarial attack algorithms. We found that the DGA classifiers that rely on
both the domain name and side information have high performance and are more
robust against adversaries.
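The idea of pairing the domain name with side information can be illustrated with a minimal feature-extraction sketch. It is not the authors' pipeline: the lexical features (length, entropy, digit and vowel ratios) and the side-information fields `ttl` and `num_ips` are hypothetical placeholders chosen for illustration, and the abstract does not list the actual features used. The resulting vector could be fed to any classifier, such as a random forest.

```python
import math
from collections import Counter
from typing import Dict, List


def shannon_entropy(s: str) -> float:
    """Shannon entropy of a string, in bits per character."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def lexical_features(domain: str) -> List[float]:
    """Features computed from the domain name alone.

    Simplifying assumption: the domain has the form name.tld, so the
    leftmost label is the part a DGA would generate.
    """
    label = domain.lower().rstrip(".").split(".")[0]
    n = max(len(label), 1)  # avoid division by zero on an empty label
    digit_ratio = sum(ch.isdigit() for ch in label) / n
    vowel_ratio = sum(ch in "aeiou" for ch in label) / n
    return [float(len(label)), shannon_entropy(label), digit_ratio, vowel_ratio]


def combined_features(domain: str, side_info: Dict[str, float]) -> List[float]:
    """Concatenate lexical features with side-information features.

    The keys "ttl" and "num_ips" are hypothetical examples of values that
    an inline detector might read from DNS traffic; missing keys default to 0.
    """
    side = [float(side_info.get("ttl", 0.0)), float(side_info.get("num_ips", 0.0))]
    return lexical_features(domain) + side


if __name__ == "__main__":
    print(combined_features("example.com", {"ttl": 300, "num_ips": 2}))
    print(combined_features("xj4k9q2zt7.net", {"ttl": 60, "num_ips": 1}))
```

Because an adversary controls the characters of the domain name but has less control over values observed on the network, a classifier trained on the concatenated vector depends less on the part of the input that is easiest to manipulate.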
Related papers
- Fine-tuning Large Language Models for DGA and DNS Exfiltration Detection [1.350128573715538]
Large Language Models (LLMs) have demonstrated their proficiency in real-time detection tasks.
Our work validates the effectiveness of fine-tuned LLMs for detecting DGAs and DNS exfiltration attacks.
arXiv Detail & Related papers (2024-10-29T04:22:28Z)
- Disentangling Masked Autoencoders for Unsupervised Domain Generalization [57.56744870106124]
Unsupervised domain generalization is fast gaining attention but is still far from well-studied.
Disentangled Masked Autoencoder (DisMAE) aims to discover the disentangled representations that faithfully reveal intrinsic features.
DisMAE co-trains the asymmetric dual-branch architecture with semantic and lightweight variation encoders.
arXiv Detail & Related papers (2024-07-10T11:11:36Z)
- Make the U in UDA Matter: Invariant Consistency Learning for Unsupervised Domain Adaptation [86.61336696914447]
We dub our approach "Invariant CONsistency learning" (ICON).
We propose to make the U in Unsupervised DA matter by giving equal status to the two domains.
ICON achieves the state-of-the-art performance on the classic UDA benchmarks: Office-Home and VisDA-2017, and outperforms all the conventional methods on the challenging WILDS 2.0 benchmark.
arXiv Detail & Related papers (2023-09-22T09:43:32Z)
- Open SESAME: Fighting Botnets with Seed Reconstructions of Domain Generation Algorithms [0.0]
Bots can generate pseudorandom domain names using Domain Generation Algorithms (DGAs).
A cyber criminal can register such domains to establish periodically changing rendezvous points with the bots.
We introduce SESAME, a system that combines the two above-mentioned approaches and contains a module for automatic Seed Reconstruction.
arXiv Detail & Related papers (2023-01-12T14:25:31Z)
- Detecting Unknown DGAs without Context Information [3.8424737607413153]
New malware often incorporates Domain Generation Algorithms (DGAs) to avoid blocking the malware's connection to the command and control (C2) server.
Current state-of-the-art classifiers are able to separate benign from malicious domains (binary classification) and attribute them with high probability to the DGAs that generated them (multiclass classification).
While binary classifiers can label domains of yet unknown DGAs as malicious, multiclass classifiers can only assign domains to DGAs that are known at the time of training, limiting the ability to uncover new malware families.
arXiv Detail & Related papers (2022-05-30T09:08:50Z)
- MD-CSDNetwork: Multi-Domain Cross Stitched Network for Deepfake Detection [80.83725644958633]
Current deepfake generation methods leave discriminative artifacts in the frequency spectrum of fake images and videos.
We present a novel approach, termed MD-CSDNetwork, for combining the features in the spatial and frequency domains to mine a shared discriminative representation.
arXiv Detail & Related papers (2021-09-15T14:11:53Z)
- Improving DGA-Based Malicious Domain Classifiers for Malware Defense with Adversarial Machine Learning [0.9023847175654603]
Domain Generation Algorithms (DGAs) are used by adversaries to establish Command and Control (C&C) server communications during cyber attacks.
Blacklists of known/identified C&C domains are often used as one of the defense mechanisms.
We propose a new method using adversarial machine learning to generate never-before-seen malware-related domain families.
arXiv Detail & Related papers (2021-01-02T22:04:22Z)
- CMT in TREC-COVID Round 2: Mitigating the Generalization Gaps from Web to Special Domain Search [89.48123965553098]
This paper presents a search system to alleviate the special domain adaption problem.
The system uses domain-adaptive pretraining and few-shot learning to help neural rankers mitigate the domain discrepancy.
Our system performs the best among the non-manual runs in Round 2 of the TREC-COVID task.
arXiv Detail & Related papers (2020-11-03T09:10:48Z)
- Adversarial Attack on Large Scale Graph [58.741365277995044]
Recent studies have shown that graph neural networks (GNNs) are vulnerable to perturbations due to a lack of robustness.
Currently, most works on attacking GNNs are mainly using gradient information to guide the attack and achieve outstanding performance.
We argue that the main reason is that they have to use the whole graph for attacks, resulting in the increasing time and space complexity as the data scale grows.
We present a practical metric named Degree Assortativity Change (DAC) to measure the impacts of adversarial attacks on graph data.
arXiv Detail & Related papers (2020-09-08T02:17:55Z)
- Enhancing Graph Neural Network-based Fraud Detectors against Camouflaged Fraudsters [78.53851936180348]
We introduce two types of camouflages based on recent empirical studies, i.e., the feature camouflage and the relation camouflage.
Existing GNNs have not addressed these two camouflages, which results in their poor performance in fraud detection problems.
We propose a new model named CAmouflage-REsistant GNN (CARE-GNN) to enhance the GNN aggregation process with three unique modules against camouflages.
arXiv Detail & Related papers (2020-08-19T22:33:12Z)
- Real-Time Detection of Dictionary DGA Network Traffic using Deep Learning [5.915780927888678]
Botnets and malware avoid detection by static rules engines when using domain generation algorithms (DGAs) for callouts to unique, dynamically generated web addresses.
Common DGA detection techniques fail to reliably detect DGA variants that combine random dictionary words to create domain names that closely mirror legitimate domains.
We create a novel hybrid neural network, Bilbo the bagging model, that analyses domains and scores the likelihood they are generated by such algorithms and therefore are potentially malicious.
arXiv Detail & Related papers (2020-03-28T14:57:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.