Beyond the Hype: A Real-World Evaluation of the Impact and Cost of
Machine Learning-Based Malware Detection
- URL: http://arxiv.org/abs/2012.09214v2
- Date: Mon, 15 Mar 2021 17:37:15 GMT
- Title: Beyond the Hype: A Real-World Evaluation of the Impact and Cost of
Machine Learning-Based Malware Detection
- Authors: Robert A. Bridges, Sean Oesch, Miki E. Verma, Michael D. Iannacone,
Kelly M.T. Huffer, Brian Jewell, Jeff A. Nichols, Brian Weber, Justin M.
Beaver, Jared M. Smith, Daniel Scofield, Craig Miles, Thomas Plummer, Mark
Daniell, Anne M. Tall
- Abstract summary: There is a lack of scientific testing of commercially available malware detectors.
We present a scientific evaluation of four market-leading malware detection tools.
Our results show that all four tools have near-perfect precision but alarmingly low recall.
- Score: 5.876081415416375
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: There is a lack of scientific testing of commercially available malware
detectors, especially those that boast accurate classification of
never-before-seen (i.e., zero-day) files using machine learning (ML). The
result is that the efficacy and gaps among the available approaches are opaque,
inhibiting end users from making informed network security decisions and
researchers from targeting gaps in current detectors. In this paper, we present
a scientific evaluation of four market-leading malware detection tools to
assist an organization with two primary questions: (Q1) To what extent do
ML-based tools accurately classify never-before-seen files without sacrificing
detection ability on known files? (Q2) Is it worth purchasing a network-level
malware detector to complement host-based detection? We tested each tool
against 3,536 files in total (2,554, or 72%, malicious; 982, or 28%, benign), including
over 400 zero-day malware samples, across a variety of file types and
protocols for delivery. We present statistical results on detection time and
accuracy, consider complementary analysis (using multiple tools together), and
provide two novel applications of a recent cost-benefit evaluation procedure by
Iannacone & Bridges that incorporates all of the above metrics into a single
quantifiable cost. While the ML-based tools are more effective at detecting
zero-day files and executables, the signature-based tool may still be an
overall better option. Both network-based tools provide substantial (simulated)
savings when paired with either host tool, yet both show poor detection rates
on protocols other than HTTP or SMTP. Our results show that all four tools have
near-perfect precision but alarmingly low recall, especially on file types
other than executables and office files -- 37% of malware tested, including all
polyglot files, were undetected.
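
The combination of near-perfect precision and low recall reported above can be made concrete with a short calculation. The following is a minimal sketch, not the authors' evaluation code: the file counts and the 37% miss rate come from the abstract, while the false-positive count (`HYPOTHETICAL_FALSE_POSITIVES`) and the `Confusion` helper are assumptions introduced purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts from a binary malware/benign evaluation."""
    tp: int  # malicious files flagged as malicious
    fp: int  # benign files flagged as malicious
    fn: int  # malicious files that were missed
    tn: int  # benign files correctly left alone

    @property
    def precision(self) -> float:
        # Of everything flagged, how much was really malicious?
        flagged = self.tp + self.fp
        return self.tp / flagged if flagged else 0.0

    @property
    def recall(self) -> float:
        # Of all malicious files, how many were flagged?
        malicious = self.tp + self.fn
        return self.tp / malicious if malicious else 0.0


# Figures stated in the abstract.
MALICIOUS = 2554
BENIGN = 982
MISSED_FRACTION = 0.37  # share of malware reported as undetected

# Assumption for illustration only: the abstract gives no false-positive count.
HYPOTHETICAL_FALSE_POSITIVES = 5

fn = round(MALICIOUS * MISSED_FRACTION)
cm = Confusion(
    tp=MALICIOUS - fn,
    fp=HYPOTHETICAL_FALSE_POSITIVES,
    fn=fn,
    tn=BENIGN - HYPOTHETICAL_FALSE_POSITIVES,
)

print(f"precision = {cm.precision:.3f}")
print(f"recall    = {cm.recall:.3f}")
```

Because precision only looks at what was flagged, a detector that rarely raises false alarms scores well on it even while missing more than a third of the malware; recall is the metric that exposes the gap.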
Related papers
- Small Effect Sizes in Malware Detection? Make Harder Train/Test Splits! [51.668411293817464]
Industry practitioners care about small improvements in malware detection accuracy because their models are deployed to hundreds of millions of machines.
Academic research is often restricted to public datasets on the order of ten thousand samples.
We devise an approach to generate a benchmark of difficulty from a pool of available samples.
(2023-12-25T21:25:55Z)
- AI ATAC 1: An Evaluation of Prominent Commercial Malware Detectors [3.0909095595694724]
This work presents an evaluation of six prominent commercial endpoint malware detectors, a network malware detector, and a file-conviction algorithm from a cyber technology vendor.
The evaluation was administered as the first of the Artificial Intelligence Applications to Autonomous Cybersecurity (AI ATAC) prize challenges, funded by and completed in service of the US Navy.
(2023-08-28T18:46:12Z)
- DRSM: De-Randomized Smoothing on Malware Classifier Providing Certified Robustness [58.23214712926585]
We develop a certified defense, DRSM (De-Randomized Smoothed MalConv), by redesigning the de-randomized smoothing technique for the domain of malware detection.
Specifically, we propose a window ablation scheme to provably limit the impact of adversarial bytes while maximally preserving local structures of the executables.
We are the first to offer certified robustness in the realm of static detection of malware executables.
(2023-03-20T17:25:22Z)
- Towards a Fair Comparison and Realistic Design and Evaluation Framework of Android Malware Detectors [63.75363908696257]
We analyze 10 influential research works on Android malware detection using a common evaluation framework.
We identify five factors that, if not taken into account when creating datasets and designing detectors, significantly affect the trained ML models.
We conclude that the studied ML-based detectors have been evaluated optimistically, which explains the good published results.
(2022-05-25T08:28:08Z)
- MOTIF: A Large Malware Reference Dataset with Ground Truth Family Labels [21.050311121388813]
We have created the Malware Open-source Threat Intelligence Family (MOTIF) dataset.
MOTIF contains 3,095 malware samples from 454 families, making it the largest and most diverse public malware dataset.
We provide aliases of the different names used to describe the same malware family, allowing us to benchmark, for the first time, the accuracy of existing tools.
(2021-11-29T23:59:50Z)
- Mate! Are You Really Aware? An Explainability-Guided Testing Framework for Robustness of Malware Detectors [49.34155921877441]
We propose an explainability-guided and model-agnostic testing framework for robustness of malware detectors.
We then use this framework to test several state-of-the-art malware detectors' abilities to detect manipulated malware.
Our findings shed light on the limitations of current malware detectors, as well as how they can be improved.
(2021-11-19T08:02:38Z)
- Leveraging Uncertainty for Improved Static Malware Detection Under Extreme False Positive Constraints [21.241478970181912]
We show how ensembling and Bayesian treatments of machine learning methods for static malware detection allow for improved identification of model errors.
In particular, we improve the true positive rate (TPR) at an actual realized false positive rate (FPR) of 1e-5 from an expected 0.69 for previous methods to 0.80 on the best-performing model class on the Sophos industry-scale dataset.
(2021-08-09T14:30:23Z)
- Towards an Automated Pipeline for Detecting and Classifying Malware through Machine Learning [0.0]
We propose a malware taxonomic classification pipeline able to classify Windows Portable Executable files (PEs).
Given an input PE sample, it is first classified as either malicious or benign.
If malicious, the pipeline further analyzes it in order to establish its threat type, family, and behavior(s).
(2021-06-10T10:07:50Z)
- Adversarial EXEmples: A Survey and Experimental Evaluation of Practical Attacks on Machine Learning for Windows Malware Detection [67.53296659361598]
Adversarial EXEmples can bypass machine learning-based detection by perturbing relatively few input bytes.
We develop a unifying framework that not only encompasses and generalizes previous attacks against machine-learning models, but also includes three novel attacks.
These attacks, named Full DOS, Extend and Shift, inject the adversarial payload by respectively manipulating the DOS header, extending it, and shifting the content of the first section.
(2020-08-17T07:16:57Z)
- Detecting malicious PDF using CNN [46.86114958340962]
Malicious PDF files represent one of the biggest threats to computer security.
We propose a novel algorithm that uses an ensemble of Convolutional Neural Networks (CNNs) on the byte level of the file.
We show, using a data set of 90,000 files downloadable online, that our approach maintains a high detection rate (94%) of PDF malware.
(2020-07-24T18:27:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.