Related papers: A Survey of Machine Learning Methods and Challenges for Windows Malware Classification

URL: http://arxiv.org/abs/2006.09271v2
Date: Sun, 15 Nov 2020 16:35:36 GMT
Title: A Survey of Machine Learning Methods and Challenges for Windows Malware Classification
Authors: Edward Raff, Charles Nicholas
Abstract summary: This survey aims to be useful both to cybersecurity practitioners who wish to learn more about how machine learning can be applied to the malware problem, and to data scientists who need the necessary background on the challenges in this uniquely complicated space.
Score: 43.4550536920809
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Malware classification is a difficult problem, to which machine learning methods have been applied for decades. Yet progress has often been slow, in part due to a number of unique difficulties with the task that occur through all stages of developing a machine learning system: data collection, labeling, feature creation and selection, model selection, and evaluation. In this survey we will review a number of the current methods and challenges related to malware classification, including data collection, feature extraction, model construction, and evaluation. Our discussion will include thoughts on the constraints that must be considered for machine learning based solutions in this domain, and yet-to-be-tackled problems for which machine learning could also provide a solution. This survey aims to be useful both to cybersecurity practitioners who wish to learn more about how machine learning can be applied to the malware problem, and to give data scientists the necessary background into the challenges in this uniquely complicated space.

Related papers

When Machine Learning Meets Vulnerability Discovery: Challenges and Lessons Learned [3.000275719116454]
In this paper, we explore the challenges of applying machine learning to vulnerability discovery. First, researchers often fail to provide concrete statistics about their training datasets. Secondly, the choice of a model and the level of granularity at which models are trained also affect the effectiveness of such vulnerability discovery approaches.
arXiv Detail & Related papers (2025-08-20T20:09:49Z)
Does Machine Unlearning Truly Remove Knowledge? [80.83986295685128]
We introduce a comprehensive auditing framework for unlearning evaluation comprising three benchmark datasets, six unlearning algorithms, and five prompt-based auditing methods. We evaluate the effectiveness and robustness of different unlearning strategies.
arXiv Detail & Related papers (2025-05-29T09:19:07Z)
Machine Unlearning for Traditional Models and Large Language Models: A Short Survey [11.539080008361662]
Machine unlearning aims to delete data and reduce its impact on models according to user requests. This paper categorizes and investigates unlearning on both traditional models and Large Language Models (LLMs).
arXiv Detail & Related papers (2024-04-01T16:08:18Z)
Machine Unlearning: A Survey [56.79152190680552]
A special need has arisen where, due to privacy, usability, and/or the right to be forgotten, information about some specific samples needs to be removed from a model, called machine unlearning. This emerging technology has drawn significant interest from both academics and industry due to its innovation and practicality. No study has analyzed this complex topic or compared the feasibility of existing unlearning solutions in different kinds of scenarios. The survey concludes by highlighting some of the outstanding issues with unlearning techniques, along with some feasible directions for new research opportunities.
arXiv Detail & Related papers (2023-06-06T10:18:36Z)
Learnware: Small Models Do Big [69.88234743773113]
The prevailing big model paradigm, which has achieved impressive results in natural language processing and computer vision applications, has not yet addressed those issues, while becoming a serious source of carbon emissions. This article offers an overview of the learnware paradigm, which attempts to spare users the need to build machine learning models from scratch, with the hope of reusing small models to do things even beyond their original purposes.
arXiv Detail & Related papers (2022-10-07T15:55:52Z)
Deep learning and machine learning for Malaria detection: overview, challenges and future directions [0.0]
This study uses a variety of machine learning and image processing approaches to detect and forecast malaria. In our research, we discovered the potential of deep learning techniques as smart tools with broader applicability for malaria detection.
arXiv Detail & Related papers (2022-09-27T10:33:00Z)
A Survey of Machine Unlearning [56.017968863854186]
Recent regulations now require that, on request, private information about a user must be removed from computer systems. ML models often 'remember' the old data. Recent works on machine unlearning have not been able to completely solve the problem.
arXiv Detail & Related papers (2022-09-06T08:51:53Z)
Software Testing for Machine Learning [13.021014899410684]
Machine learning has been shown to be susceptible to deception, leading to errors and even fatal failures. This circumstance calls into question the widespread use of machine learning, especially in safety-critical applications. This summary talk discusses the current state of the art of software testing for machine learning.
arXiv Detail & Related papers (2022-04-30T08:47:10Z)
Knowledge as Invariance -- History and Perspectives of Knowledge-augmented Machine Learning [69.99522650448213]
Research in machine learning is at a turning point. Research interests are shifting away from increasing the performance of highly parameterized models on exceedingly specific tasks. This white paper provides an introduction and discussion of this emerging field in machine learning research.
arXiv Detail & Related papers (2020-12-21T15:07:19Z)
Challenges in Deploying Machine Learning: a Survey of Case Studies [11.028123436097616]
This survey reviews published reports of deploying machine learning solutions in a variety of use cases, industries and applications. By mapping the challenges found to the steps of the machine learning deployment workflow, we show that practitioners face issues at each stage of the deployment process.
arXiv Detail & Related papers (2020-11-18T16:20:28Z)
Towards CRISP-ML(Q): A Machine Learning Process Model with Quality Assurance Methodology [53.063411515511056]
We propose a process model for the development of machine learning applications. The first phase combines business and data understanding, as data availability oftentimes affects the feasibility of the project. The sixth phase covers state-of-the-art approaches for monitoring and maintenance of machine learning applications.
arXiv Detail & Related papers (2020-03-11T08:25:49Z)

This list is automatically generated from the titles and abstracts of the papers on this site.