AI ATAC 1: An Evaluation of Prominent Commercial Malware Detectors
- URL: http://arxiv.org/abs/2308.14835v1
- Date: Mon, 28 Aug 2023 18:46:12 GMT
- Title: AI ATAC 1: An Evaluation of Prominent Commercial Malware Detectors
- Authors: Robert A. Bridges, Brian Weber, Justin M. Beaver, Jared M. Smith, Miki E. Verma, Savannah Norem, Kevin Spakes, Cory Watson, Jeff A. Nichols, Brian Jewell, Michael. D. Iannacone, Chelsey Dunivan Stahl, Kelly M. T. Huffer, T. Sean Oesch,
- Abstract summary: This work presents an evaluation of six prominent commercial endpoint malware detectors, a network malware detector, and a file-conviction algorithm from a cyber technology vendor.
The evaluation was administered as the first of the Artificial Intelligence Applications to Autonomous Cybersecurity (AI ATAC) prize challenges, funded by / completed in service of the US Navy.
- Score: 3.0909095595694724
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work presents an evaluation of six prominent commercial endpoint malware detectors, a network malware detector, and a file-conviction algorithm from a cyber technology vendor. The evaluation was administered as the first of the Artificial Intelligence Applications to Autonomous Cybersecurity (AI ATAC) prize challenges, funded by / completed in service of the US Navy. The experiment employed 100K files (50/50% benign/malicious) with a stratified distribution of file types, including ~1K zero-day program executables (increasing experiment size two orders of magnitude over previous work). We present an evaluation process of delivering a file to a fresh virtual machine donning the detection technology, waiting 90s to allow static detection, then executing the file and waiting another period for dynamic detection; this allows greater fidelity in the observational data than previous experiments, in particular, resource and time-to-detection statistics. To execute all 800K trials (100K files $\times$ 8 tools), a software framework is designed to choreographed the experiment into a completely automated, time-synced, and reproducible workflow with substantial parallelization. A cost-benefit model was configured to integrate the tools' recall, precision, time to detection, and resource requirements into a single comparable quantity by simulating costs of use. This provides a ranking methodology for cyber competitions and a lens through which to reason about the varied statistical viewpoints of the results. These statistical and cost-model results provide insights on state of commercial malware detection.
Related papers
- Unlearn and Burn: Adversarial Machine Unlearning Requests Destroy Model Accuracy [65.80757820884476]
We expose a critical yet underexplored vulnerability in the deployment of unlearning systems.
We present a threat model where an attacker can degrade model accuracy by submitting adversarial unlearning requests for data not present in the training set.
We evaluate various verification mechanisms to detect the legitimacy of unlearning requests and reveal the challenges in verification.
arXiv Detail & Related papers (2024-10-12T16:47:04Z) - Small Effect Sizes in Malware Detection? Make Harder Train/Test Splits! [51.668411293817464]
Industry practitioners care about small improvements in malware detection accuracy because their models are deployed to hundreds of millions of machines.
Academic research is often restrained to public datasets on the order of ten thousand samples.
We devise an approach to generate a benchmark of difficulty from a pool of available samples.
arXiv Detail & Related papers (2023-12-25T21:25:55Z) - DRSM: De-Randomized Smoothing on Malware Classifier Providing Certified
Robustness [58.23214712926585]
We develop a certified defense, DRSM (De-Randomized Smoothed MalConv), by redesigning the de-randomized smoothing technique for the domain of malware detection.
Specifically, we propose a window ablation scheme to provably limit the impact of adversarial bytes while maximally preserving local structures of the executables.
We are the first to offer certified robustness in the realm of static detection of malware executables.
arXiv Detail & Related papers (2023-03-20T17:25:22Z) - Quo Vadis: Hybrid Machine Learning Meta-Model based on Contextual and Behavioral Malware Representations [5.439020425819001]
We propose a hybrid machine learning architecture that simultaneously employs multiple deep learning models.
We report an improved detection rate, above the capabilities of the current state-of-the-art model.
arXiv Detail & Related papers (2022-08-20T05:30:16Z) - SUPERNOVA: Automating Test Selection and Defect Prevention in AAA Video
Games Using Risk Based Testing and Machine Learning [62.997667081978825]
Testing video games is an increasingly difficult task as traditional methods fail to scale with growing software systems.
We present SUPERNOVA, a system responsible for test selection and defect prevention while also functioning as an automation hub.
The direct impact of this has been observed to be a reduction in 55% or more testing hours for an undisclosed sports game title.
arXiv Detail & Related papers (2022-03-10T00:47:46Z) - Comprehensive Efficiency Analysis of Machine Learning Algorithms for
Developing Hardware-Based Cybersecurity Countermeasures [0.0]
Modern computing systems have led cyber adversaries to create more sophisticated malware than was previously available.
Modern detection techniques use the machine learning field and hardware to boost the detection rates of malicious software.
A problem emerges when malware with no comparable HPC values comes into contact with these new techniques.
arXiv Detail & Related papers (2022-01-05T22:08:57Z) - Leveraging Uncertainty for Improved Static Malware Detection Under
Extreme False Positive Constraints [21.241478970181912]
We show how ensembling and Bayesian treatments of machine learning methods for static malware detection allow for improved identification of model errors.
In particular, we improve the true positive rate (TPR) at an actual realized FPR of 1e-5 from an expected 0.69 for previous methods to 0.80 on the best performing model class on the Sophos industry scale dataset.
arXiv Detail & Related papers (2021-08-09T14:30:23Z) - Towards an Automated Pipeline for Detecting and Classifying Malware
through Machine Learning [0.0]
We propose a malware taxonomic classification pipeline able to classify Windows Portable Executable files (PEs)
Given an input PE sample, it is first classified as either malicious or benign.
If malicious, the pipeline further analyzes it in order to establish its threat type, family, and behavior(s)
arXiv Detail & Related papers (2021-06-10T10:07:50Z) - Beyond the Hype: A Real-World Evaluation of the Impact and Cost of
Machine Learning-Based Malware Detection [5.876081415416375]
There is a lack of scientific testing of commercially available malware detectors.
We present a scientific evaluation of four market-leading malware detection tools.
Our results show that all four tools have near-perfect precision but alarmingly low recall.
arXiv Detail & Related papers (2020-12-16T19:10:00Z) - Adversarial EXEmples: A Survey and Experimental Evaluation of Practical
Attacks on Machine Learning for Windows Malware Detection [67.53296659361598]
adversarial EXEmples can bypass machine learning-based detection by perturbing relatively few input bytes.
We develop a unifying framework that does not only encompass and generalize previous attacks against machine-learning models, but also includes three novel attacks.
These attacks, named Full DOS, Extend and Shift, inject the adversarial payload by respectively manipulating the DOS header, extending it, and shifting the content of the first section.
arXiv Detail & Related papers (2020-08-17T07:16:57Z) - AutoOD: Automated Outlier Detection via Curiosity-guided Search and
Self-imitation Learning [72.99415402575886]
Outlier detection is an important data mining task with numerous practical applications.
We propose AutoOD, an automated outlier detection framework, which aims to search for an optimal neural network model.
Experimental results on various real-world benchmark datasets demonstrate that the deep model identified by AutoOD achieves the best performance.
arXiv Detail & Related papers (2020-06-19T18:57:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.