Word Embedding Techniques for Malware Evolution Detection
- URL: http://arxiv.org/abs/2103.05759v1
- Date: Sun, 7 Mar 2021 14:55:32 GMT
- Title: Word Embedding Techniques for Malware Evolution Detection
- Authors: Sunhera Paul and Mark Stamp
- Abstract summary: We perform a variety of experiments aimed at detecting points in time where a malware family has likely evolved.
Several malware families are analyzed, each of which includes a number of samples collected over an extended period of time.
Our experiments indicate that improved results are obtained using feature engineering based on word embedding techniques.
- Score: 4.111899441919165
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Malware detection is a critical aspect of information security. One
difficulty that arises is that malware often evolves over time. To maintain
effective malware detection, it is necessary to determine when malware
evolution has occurred so that appropriate countermeasures can be taken. We
perform a variety of experiments aimed at detecting points in time where a
malware family has likely evolved, and we consider secondary tests designed to
confirm that evolution has actually occurred. Several malware families are
analyzed, each of which includes a number of samples collected over an extended
period of time. Our experiments indicate that improved results are obtained
using feature engineering based on word embedding techniques. All of our
experiments are based on machine learning models, and hence our evolution
detection strategies require minimal human intervention and can easily be
automated.
Related papers
- A Survey of Malware Detection Using Deep Learning [6.349503549199403]
This paper investigates advances in malware detection on Windows, iOS, Android, and Linux using deep learning (DL)
We discuss the issues and the challenges in malware detection using DL classifiers.
We examine eight popular DL approaches on various datasets.
arXiv Detail & Related papers (2024-07-27T02:49:55Z) - Small Effect Sizes in Malware Detection? Make Harder Train/Test Splits! [51.668411293817464]
Industry practitioners care about small improvements in malware detection accuracy because their models are deployed to hundreds of millions of machines.
Academic research is often restrained to public datasets on the order of ten thousand samples.
We devise an approach to generate a benchmark of difficulty from a pool of available samples.
arXiv Detail & Related papers (2023-12-25T21:25:55Z) - A survey on hardware-based malware detection approaches [45.24207460381396]
Hardware-based malware detection approaches leverage hardware performance counters and machine learning prowess.
We meticulously analyze the approach, unraveling the most common methods, algorithms, tools, and datasets that shape its contours.
The discussion extends to crafting mixed hardware and software approaches for collaborative efficacy, essential enhancements in hardware monitoring units, and a better understanding of the correlation between hardware events and malware applications.
arXiv Detail & Related papers (2023-03-22T13:00:41Z) - DRSM: De-Randomized Smoothing on Malware Classifier Providing Certified
Robustness [58.23214712926585]
We develop a certified defense, DRSM (De-Randomized Smoothed MalConv), by redesigning the de-randomized smoothing technique for the domain of malware detection.
Specifically, we propose a window ablation scheme to provably limit the impact of adversarial bytes while maximally preserving local structures of the executables.
We are the first to offer certified robustness in the realm of static detection of malware executables.
arXiv Detail & Related papers (2023-03-20T17:25:22Z) - Towards a Fair Comparison and Realistic Design and Evaluation Framework
of Android Malware Detectors [63.75363908696257]
We analyze 10 influential research works on Android malware detection using a common evaluation framework.
We identify five factors that, if not taken into account when creating datasets and designing detectors, significantly affect the trained ML models.
We conclude that the studied ML-based detectors have been evaluated optimistically, which justifies the good published results.
arXiv Detail & Related papers (2022-05-25T08:28:08Z) - Mate! Are You Really Aware? An Explainability-Guided Testing Framework
for Robustness of Malware Detectors [49.34155921877441]
We propose an explainability-guided and model-agnostic testing framework for robustness of malware detectors.
We then use this framework to test several state-of-the-art malware detectors' abilities to detect manipulated malware.
Our findings shed light on the limitations of current malware detectors, as well as how they can be improved.
arXiv Detail & Related papers (2021-11-19T08:02:38Z) - ML-based IoT Malware Detection Under Adversarial Settings: A Systematic
Evaluation [9.143713488498513]
This work systematically examines the state-of-the-art malware detection approaches, that utilize various representation and learning techniques.
We show that software mutations with functionality-preserving operations, such as stripping and padding, significantly deteriorate the accuracy of such detectors.
arXiv Detail & Related papers (2021-08-30T16:54:07Z) - Machine Learning for Malware Evolution Detection [4.111899441919165]
We perform experiments on a significant number of malware families to determine when malware evolution is likely to have occurred.
We consider analysis based on hidden Markov models (HMM) and the word embedding techniques HMM2Vec and Word2Vec.
arXiv Detail & Related papers (2021-07-04T13:47:06Z) - Evading Malware Classifiers via Monte Carlo Mutant Feature Discovery [23.294653273180472]
We show how a malicious actor trains a surrogate model to discover binary mutations that cause an instance to be misclassified.
Then, mutated malware is sent to the victim model that takes the place of an antivirus API to test whether it can evade detection.
arXiv Detail & Related papers (2021-06-15T03:31:02Z) - Being Single Has Benefits. Instance Poisoning to Deceive Malware
Classifiers [47.828297621738265]
We show how an attacker can launch a sophisticated and efficient poisoning attack targeting the dataset used to train a malware classifier.
As opposed to other poisoning attacks in the malware detection domain, our attack does not focus on malware families but rather on specific malware instances that contain an implanted trigger.
We propose a comprehensive detection approach that could serve as a future sophisticated defense against this newly discovered severe threat.
arXiv Detail & Related papers (2020-10-30T15:27:44Z) - MDEA: Malware Detection with Evolutionary Adversarial Learning [16.8615211682877]
MDEA, an Adversarial Malware Detection model uses evolutionary optimization to create attack samples to make the network robust against evasion attacks.
By retraining the model with the evolved malware samples, its performance improves a significant margin.
arXiv Detail & Related papers (2020-02-09T09:59:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.