Open SESAME: Fighting Botnets with Seed Reconstructions of Domain
Generation Algorithms
- URL: http://arxiv.org/abs/2301.05048v1
- Date: Thu, 12 Jan 2023 14:25:31 GMT
- Title: Open SESAME: Fighting Botnets with Seed Reconstructions of Domain
Generation Algorithms
- Authors: Nils Weissgerber, Thorsten Jenke, Elmar Padilla, Lilli Bruckschen
- Abstract summary: Bots can generate pseudorandom domain names using Domain Generation Algorithms (DGAs).
A cyber criminal can register such domains to establish periodically changing rendezvous points with the bots.
We introduce SESAME, a system that combines database-based and Machine Learning (ML) based approaches and contains a module for automatic Seed Reconstruction.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: An important aspect of many botnets is their capability to generate
pseudorandom domain names using Domain Generation Algorithms (DGAs). A cyber
criminal can register such domains to establish periodically changing
rendezvous points with the bots. DGAs make use of seeds to generate sets of
domains. Seeds can easily be changed in order to generate entirely new groups
of domains while using the same underlying algorithm. While this requires very
little manual effort for an adversary, security specialists typically have to
manually reverse engineer new malware strains to reconstruct the seeds. Only
when the seed and DGA are known can past and future domains be generated,
efficiently attributed, blocked, sinkholed, or used for a take-down. Common
counters in the literature consist of databases or Machine Learning (ML) based
detectors to keep track of past and future domains of known DGAs and to
identify DGA-generated domain names, respectively. However, database-based
approaches cannot detect domains generated by new DGAs, and ML approaches
cannot generate future domain names. In this paper, we introduce SESAME, a system
that combines the two above-mentioned approaches and contains a module for
automatic Seed Reconstruction, which is, to our knowledge, the first of its
kind. It is used to automatically classify domain names, rate their novelty,
and determine the seeds of the underlying DGAs. SESAME consists of multiple
DGA-specific Seed Reconstructors and is designed to work purely based on domain
names, as they are easily obtainable from observing the network traffic. We
evaluated our approach on 20.8 gigabytes of DNS lookups. Thereby, we identified
17 DGAs, of which 4 were entirely new to us.
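
The relationship between a seed, a DGA, and the resulting domain set can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `toy_dga` is a hypothetical hash-based generator, not one of the DGAs studied in the paper, and `reconstruct_seed` is a naive search over a list of candidate seeds, not SESAME's Seed Reconstructors, whose internals are not described on this page.

```python
import hashlib
from datetime import date
from typing import Iterable, List, Optional


def toy_dga(seed: str, day: date, count: int = 5, tld: str = ".com") -> List[str]:
    """Hypothetical DGA: derive `count` domains from a seed and a date."""
    domains = []
    for index in range(count):
        material = f"{seed}|{day.isoformat()}|{index}".encode()
        digest = hashlib.sha256(material).hexdigest()
        # Map the first 12 hex digits to the letters 'a'..'p'.
        label = "".join(chr(ord("a") + int(ch, 16)) for ch in digest[:12])
        domains.append(label + tld)
    return domains


def reconstruct_seed(
    observed: Iterable[str],
    candidate_seeds: Iterable[str],
    day: date,
    count: int = 5,
) -> Optional[str]:
    """Naive search: return the first candidate seed whose generated
    domain set contains every observed domain, or None if none matches."""
    observed_set = set(observed)
    for seed in candidate_seeds:
        if observed_set <= set(toy_dga(seed, day, count)):
            return seed
    return None


if __name__ == "__main__":
    day = date(2023, 1, 12)
    seen_in_traffic = toy_dga("alpha", day)[:2]  # pretend these were observed
    print(reconstruct_seed(seen_in_traffic, ["beta", "alpha", "gamma"], day))
```

Changing only the seed string yields an entirely different domain set from the same algorithm, which is why knowing both the algorithm and the seed is needed to enumerate past and future domains.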
Related papers
- Domain Re-Modulation for Few-Shot Generative Domain Adaptation [71.47730150327818]
Generative Domain Adaptation (GDA) involves transferring a pre-trained generator from one domain to a new domain using only a few reference images.
Inspired by the way human brains acquire knowledge in new domains, we present an innovative generator structure called Domain Re-Modulation (DoRM).
DoRM not only meets the criteria of high quality, large synthesis diversity, and cross-domain consistency, but also incorporates memory and domain association.
arXiv Detail & Related papers (2023-02-06T03:55:35Z)
- Domain Expansion of Image Generators [80.8601805917418]
We propose a new task - domain expansion - to address this.
Given a pretrained generator and novel (but related) domains, we expand the generator to jointly model all domains, old and new, harmoniously.
Using our expansion method, one "expanded" model can supersede numerous domain-specific models, without expanding the model size.
arXiv Detail & Related papers (2023-01-12T18:59:47Z)
- Detecting Unknown DGAs without Context Information [3.8424737607413153]
New malware often incorporates Domain Generation Algorithms (DGAs) to avoid blocking the malware's connection to the command and control (C2) server.
Current state-of-the-art classifiers are able to separate benign from malicious domains (binary classification) and attribute them with high probability to the DGAs that generated them (multiclass classification).
While binary classifiers can label domains of yet unknown DGAs as malicious, multiclass classifiers can only assign domains to DGAs that are known at the time of training, limiting the ability to uncover new malware families.
arXiv Detail & Related papers (2022-05-30T09:08:50Z)
- Domain Invariant Masked Autoencoders for Self-supervised Learning from
Multi-domains [73.54897096088149]
We propose a Domain-invariant Masked AutoEncoder (DiMAE) for self-supervised learning from multi-domains.
The core idea is to augment the input image with style noise from different domains and then reconstruct the image from the embedding of the augmented image.
Experiments on PACS and DomainNet illustrate that DiMAE achieves considerable gains compared with recent state-of-the-art methods.
arXiv Detail & Related papers (2022-05-10T09:49:40Z)
- Dynamic Instance Domain Adaptation [109.53575039217094]
Most studies on unsupervised domain adaptation assume that each domain's training samples come with domain labels.
We develop a dynamic neural network with adaptive convolutional kernels to generate instance-adaptive residuals to adapt domain-agnostic deep features to each individual instance.
Our model, dubbed DIDA-Net, achieves state-of-the-art performance on several commonly used single-source and multi-source UDA datasets.
arXiv Detail & Related papers (2022-03-09T20:05:54Z)
- FRIDA -- Generative Feature Replay for Incremental Domain Adaptation [34.00059350161178]
We propose a novel framework called Feature based Incremental Domain Adaptation (FRIDA).
For domain alignment, we propose a simple extension of the popular domain adversarial neural network (DANN) called DANN-IB.
Experimental results on the Office-Home, Office-CalTech, and DomainNet datasets confirm that FRIDA maintains a better stability-plasticity trade-off than prior work.
arXiv Detail & Related papers (2021-12-28T22:24:32Z)
- Improving DGA-Based Malicious Domain Classifiers for Malware Defense
with Adversarial Machine Learning [0.9023847175654603]
Domain Generation Algorithms (DGAs) are used by adversaries to establish Command and Control (C&C) server communications during cyber attacks.
Blacklists of known/identified C&C domains are often used as one of the defense mechanisms.
We propose a new method using adversarial machine learning to generate never-before-seen malware-related domain families.
arXiv Detail & Related papers (2021-01-02T22:04:22Z)
- CMT in TREC-COVID Round 2: Mitigating the Generalization Gaps from Web
to Special Domain Search [89.48123965553098]
This paper presents a search system to alleviate the special domain adaption problem.
The system utilizes the domain-adaptive pretraining and few-shot learning technologies to help neural rankers mitigate the domain discrepancy.
Our system performs the best among the non-manual runs in Round 2 of the TREC-COVID task.
arXiv Detail & Related papers (2020-11-03T09:10:48Z)
- Real-Time Detection of Dictionary DGA Network Traffic using Deep
Learning [5.915780927888678]
Botnets and malware avoid detection by static rules engines when using domain generation algorithms (DGAs) for callouts to unique, dynamically generated web addresses.
Common DGA detection techniques fail to reliably detect DGA variants that combine random dictionary words to create domain names that closely mirror legitimate domains.
We create a novel hybrid neural network, Bilbo the bagging model, that analyses domains and scores the likelihood they are generated by such algorithms and therefore are potentially malicious.
arXiv Detail & Related papers (2020-03-28T14:57:22Z)
- Deep Domain-Adversarial Image Generation for Domain Generalisation [115.21519842245752]
Machine learning models typically suffer from the domain shift problem when trained on a source dataset and evaluated on a target dataset of different distribution.
To overcome this problem, domain generalisation (DG) methods aim to leverage data from multiple source domains so that a trained model can generalise to unseen domains.
We propose a novel DG approach based on Deep Domain-Adversarial Image Generation (DDAIG).
arXiv Detail & Related papers (2020-03-12T23:17:47Z)
- Inline Detection of DGA Domains Using Side Information [5.253305460558346]
Domain Generation Algorithms (DGAs) are popular methods for generating pseudo-random domain names.
In recent years, machine learning based systems have been widely used to detect DGAs.
We train and evaluate state-of-the-art deep learning and random forest (RF) classifiers for DGA detection using side information that is harder for adversaries to manipulate than the domain name itself.
arXiv Detail & Related papers (2020-03-12T11:00:30Z)