A Survey of Machine Learning Methods and Challenges for Windows Malware
Classification
- URL: http://arxiv.org/abs/2006.09271v2
- Date: Sun, 15 Nov 2020 16:35:36 GMT
- Title: A Survey of Machine Learning Methods and Challenges for Windows Malware
Classification
- Authors: Edward Raff, Charles Nicholas
- Abstract summary: This survey aims to be useful both to cybersecurity practitioners who wish to learn more about how machine learning can be applied to the malware problem, and to data scientists who need the necessary background on the challenges in this uniquely complicated space.
- Score: 43.4550536920809
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Malware classification is a difficult problem, to which machine learning
methods have been applied for decades. Yet progress has often been slow, in
part due to a number of unique difficulties with the task that occur throughout
all stages of developing a machine learning system: data collection,
labeling, feature creation and selection, model selection, and evaluation. In
this survey we will review a number of the current methods and challenges
related to malware classification, including data collection, feature
extraction, model construction, and evaluation. Our discussion will include
thoughts on the constraints that must be considered for machine learning based
solutions in this domain, and yet-to-be-tackled problems for which machine
learning could also provide a solution. This survey aims to be useful both to
cybersecurity practitioners who wish to learn more about how machine learning
can be applied to the malware problem, and to give data scientists the
necessary background into the challenges in this uniquely complicated space.
Related papers
- Machine Unlearning for Traditional Models and Large Language Models: A Short Survey [11.539080008361662]
Machine unlearning aims to delete data and reduce its impact on models according to user requests.
This paper categorizes and investigates unlearning on both traditional models and Large Language Models (LLMs).
arXiv Detail & Related papers (2024-04-01T16:08:18Z)
- Machine Unlearning: A Survey [56.79152190680552]
A special need has arisen where, due to privacy, usability, and/or the right to be forgotten, information about some specific samples needs to be removed from a model, called machine unlearning.
This emerging technology has drawn significant interest from both academics and industry due to its innovation and practicality.
No study has analyzed this complex topic or compared the feasibility of existing unlearning solutions in different kinds of scenarios.
The survey concludes by highlighting some of the outstanding issues with unlearning techniques, along with some feasible directions for new research opportunities.
arXiv Detail & Related papers (2023-06-06T10:18:36Z)
- Learnware: Small Models Do Big [69.88234743773113]
The prevailing big model paradigm, which has achieved impressive results in natural language processing and computer vision applications, has not yet addressed those issues, while also becoming a serious source of carbon emissions.
This article offers an overview of the learnware paradigm, which aims to spare users from building machine learning models from scratch, with the hope of reusing small models for purposes even beyond their original ones.
arXiv Detail & Related papers (2022-10-07T15:55:52Z)
- Deep learning and machine learning for Malaria detection: overview, challenges and future directions [0.0]
This study uses a variety of machine learning and image processing approaches to detect and forecast malaria.
In our research, we discovered the potential of deep learning techniques as smart tools with broader applicability for malaria detection.
arXiv Detail & Related papers (2022-09-27T10:33:00Z)
- A Survey of Machine Unlearning [56.017968863854186]
Recent regulations now require that, on request, private information about a user must be removed from computer systems.
ML models often 'remember' the old data.
Recent works on machine unlearning have not been able to completely solve the problem.
arXiv Detail & Related papers (2022-09-06T08:51:53Z)
- Software Testing for Machine Learning [13.021014899410684]
Machine learning has been shown to be susceptible to deception, leading to errors and even fatal failures.
This circumstance calls into question the widespread use of machine learning, especially in safety-critical applications.
This summary talk discusses the current state-of-the-art of software testing for machine learning.
arXiv Detail & Related papers (2022-04-30T08:47:10Z)
- Knowledge as Invariance -- History and Perspectives of Knowledge-augmented Machine Learning [69.99522650448213]
Research in machine learning is at a turning point.
Research interests are shifting away from increasing the performance of highly parameterized models to exceedingly specific tasks.
This white paper provides an introduction and discussion of this emerging field in machine learning research.
arXiv Detail & Related papers (2020-12-21T15:07:19Z)
- Challenges in Deploying Machine Learning: a Survey of Case Studies [11.028123436097616]
This survey reviews published reports of deploying machine learning solutions in a variety of use cases, industries and applications.
By mapping found challenges to the steps of the machine learning deployment workflow we show that practitioners face issues at each stage of the deployment process.
arXiv Detail & Related papers (2020-11-18T16:20:28Z)
- Towards CRISP-ML(Q): A Machine Learning Process Model with Quality Assurance Methodology [53.063411515511056]
We propose a process model for the development of machine learning applications.
The first phase combines business and data understanding as data availability oftentimes affects the feasibility of the project.
The sixth phase covers state-of-the-art approaches for monitoring and maintenance of machine learning applications.
arXiv Detail & Related papers (2020-03-11T08:25:49Z)