Related papers: Improving DGA-Based Malicious Domain Classifiers for Malware Defense with Adversarial Machine Learning

URL: http://arxiv.org/abs/2101.00521v1
Date: Sat, 2 Jan 2021 22:04:22 GMT
Title: Improving DGA-Based Malicious Domain Classifiers for Malware Defense with Adversarial Machine Learning
Authors: Ibrahim Yilmaz, Ambareen Siraj, Denis Ulybyshev
Abstract summary: Domain Generation Algorithms (DGAs) are used by adversaries to establish Command and Control (C&C) server communications during cyber attacks. Blacklists of known/identified C&C domains are often used as one of the defense mechanisms. We propose a new method using adversarial machine learning to generate never-before-seen malware-related domain families.
Score: 0.9023847175654603
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Domain Generation Algorithms (DGAs) are used by adversaries to establish Command and Control (C&C) server communications during cyber attacks. Blacklists of known/identified C&C domains are often used as one of the defense mechanisms. However, since blacklists are static and generated by signature-based approaches, they can neither keep up nor detect never-seen-before malicious domain names. Due to this shortcoming of blacklist domain checking, machine learning algorithms have been used to address the problem to some extent. However, when training is performed with limited datasets, the algorithms are likely to fail in detecting new DGA variants. To mitigate this weakness, we successfully applied a DGA-based malicious domain classifier using the Long Short-Term Memory (LSTM) method with a novel feature engineering technique. Our model's performance shows a higher level of accuracy compared to a previously reported model from prior research. Additionally, we propose a new method using adversarial machine learning to generate never-before-seen malware-related domain families that can be used to illustrate the shortcomings of machine learning algorithms in this regard. Next, we augment the training dataset with new samples such that it makes training of the machine learning models more effective in detecting never-before-seen malicious domain name variants. Finally, to protect blacklists of malicious domain names from disclosure and tampering, we devise secure data containers that store blacklists and guarantee their protection against adversarial access and modifications.
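
The blacklist limitation described in the abstract can be illustrated with a minimal sketch. This is not the authors' method or any real malware family's algorithm: the function `toy_dga`, the date-string seeds, and the hash-to-letters mapping are all hypothetical choices made only to show why a static list built from one seed offers no coverage once the seed changes.

```python
import hashlib


def toy_dga(seed: str, count: int, length: int = 12, tld: str = ".com") -> list[str]:
    """Derive `count` pseudorandom domain names from a shared seed (hypothetical toy DGA)."""
    domains = []
    for i in range(count):
        digest = hashlib.sha256(f"{seed}:{i}".encode()).hexdigest()
        # Map each hex digit (0-15) to a letter a-p so the label looks like a hostname.
        label = "".join(chr(ord("a") + int(c, 16)) for c in digest[:length])
        domains.append(label + tld)
    return domains


# Static blacklist built from domains observed under an earlier (assumed) seed.
blacklist = set(toy_dga("2021-01-01", 50))

# The seed changes; the defender's list does not.
todays_domains = toy_dga("2021-01-02", 50)
blocked = [d for d in todays_domains if d in blacklist]
print(f"blocked {len(blocked)} of {len(todays_domains)} newly generated domains")
```

Because both sides can recompute the same names from the seed, the bots and the C&C operator stay in sync, while a defender who only stores previously seen names has to wait for each new domain to be observed and listed.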

Related papers

MASKDROID: Robust Android Malware Detection with Masked Graph Representations [56.09270390096083]
We propose MASKDROID, a powerful detector with a strong discriminative ability to identify malware. We introduce a masking mechanism into the Graph Neural Network based framework, forcing MASKDROID to recover the whole input graph. This strategy enables the model to understand the malicious semantics and learn more stable representations, enhancing its robustness against adversarial attacks.
arXiv Detail & Related papers (2024-09-29T07:22:47Z)
DomURLs_BERT: Pre-trained BERT-based Model for Malicious Domains and URLs Detection and Classification [4.585051136007553]
We introduce DomURLs_BERT, a pre-trained BERT-based encoder for detecting and classifying suspicious/malicious domains and URLs. The proposed encoder outperforms state-of-the-art character-based deep learning models and cybersecurity-focused BERT models across multiple tasks and datasets.
arXiv Detail & Related papers (2024-09-13T18:59:13Z)
DRSM: De-Randomized Smoothing on Malware Classifier Providing Certified Robustness [58.23214712926585]
We develop a certified defense, DRSM (De-Randomized Smoothed MalConv), by redesigning the de-randomized smoothing technique for the domain of malware detection. Specifically, we propose a window ablation scheme to provably limit the impact of adversarial bytes while maximally preserving local structures of the executables. We are the first to offer certified robustness in the realm of static detection of malware executables.
arXiv Detail & Related papers (2023-03-20T17:25:22Z)
Open SESAME: Fighting Botnets with Seed Reconstructions of Domain Generation Algorithms [0.0]
Bots can generate pseudorandom domain names using Domain Generation Algorithms (DGAs). A cyber criminal can register such domains to establish periodically changing rendezvous points with the bots. We introduce SESAME, a system that combines the two above-mentioned approaches and contains a module for automatic Seed Reconstruction.
arXiv Detail & Related papers (2023-01-12T14:25:31Z)
Unlearnable Clusters: Towards Label-agnostic Unlearnable Examples [128.25509832644025]
There is a growing interest in developing unlearnable examples (UEs) against visual privacy leaks on the Internet. UEs are training samples with added invisible but unlearnable noise, which has been found to prevent unauthorized training of machine learning models. We present a novel technique called Unlearnable Clusters (UCs) to generate label-agnostic unlearnable examples with cluster-wise perturbations.
arXiv Detail & Related papers (2022-12-31T04:26:25Z)
Explaining Machine Learning DGA Detectors from DNS Traffic Data [11.049278217301048]
This work addresses the problem of Explainable ML in the context of botnet and DGA detection. It is the first to concretely break down the decisions of ML classifiers when devised for botnet/DGA detection.
arXiv Detail & Related papers (2022-08-10T11:34:26Z)
RelaxLoss: Defending Membership Inference Attacks without Losing Utility [68.48117818874155]
We propose a novel training framework based on a relaxed loss with a more achievable learning target. RelaxLoss is applicable to any classification model with added benefits of easy implementation and negligible overhead. Our approach consistently outperforms state-of-the-art defense mechanisms in terms of resilience against MIAs.
arXiv Detail & Related papers (2022-07-12T19:34:47Z)
Adversarial EXEmples: A Survey and Experimental Evaluation of Practical Attacks on Machine Learning for Windows Malware Detection [67.53296659361598]
Adversarial EXEmples can bypass machine learning-based detection by perturbing relatively few input bytes. We develop a unifying framework that not only encompasses and generalizes previous attacks against machine-learning models, but also includes three novel attacks. These attacks, named Full DOS, Extend and Shift, inject the adversarial payload by respectively manipulating the DOS header, extending it, and shifting the content of the first section.
arXiv Detail & Related papers (2020-08-17T07:16:57Z)
Adversarial Machine Learning Attacks and Defense Methods in the Cyber Security Domain [58.30296637276011]
This paper summarizes the latest research on adversarial attacks against security solutions based on machine learning techniques. It is the first to discuss the unique challenges of implementing end-to-end adversarial attacks in the cyber security domain.
arXiv Detail & Related papers (2020-07-05T18:22:40Z)
Real-Time Detection of Dictionary DGA Network Traffic using Deep Learning [5.915780927888678]
Botnets and malware avoid detection by static rules engines when using domain generation algorithms (DGAs) for callouts to unique, dynamically generated web addresses. Common DGA detection techniques fail to reliably detect DGA variants that combine random dictionary words to create domain names that closely mirror legitimate domains. We create a novel hybrid neural network, Bilbo the bagging model, that analyses domains and scores the likelihood they are generated by such algorithms and therefore are potentially malicious.
arXiv Detail & Related papers (2020-03-28T14:57:22Z)
Inline Detection of DGA Domains Using Side Information [5.253305460558346]
Domain Generation Algorithms (DGAs) are popular methods for generating pseudo-random domain names. In recent years, machine learning based systems have been widely used to detect DGAs. We train and evaluate state-of-the-art deep learning and random forest (RF) classifiers for DGA detection using side information that is harder for adversaries to manipulate than the domain name itself.
arXiv Detail & Related papers (2020-03-12T11:00:30Z)

This list is automatically generated from the titles and abstracts of the papers on this site.