Malware Classification with GMM-HMM Models
- URL: http://arxiv.org/abs/2103.02753v1
- Date: Wed, 3 Mar 2021 23:23:48 GMT
- Title: Malware Classification with GMM-HMM Models
- Authors: Jing Zhao and Samanvitha Basole and Mark Stamp
- Abstract summary: In this paper, we use GMM-HMMs for malware classification and we compare our results to those obtained using discrete HMMs.
For our opcode features, GMM-HMMs produce results that are comparable to those obtained using discrete HMMs.
- Score: 8.02151721194722
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Discrete hidden Markov models (HMMs) are often applied to malware detection
and classification problems. However, the continuous analog of discrete HMMs,
that is, Gaussian mixture model-HMMs (GMM-HMMs), are rarely considered in the
field of cybersecurity. In this paper, we use GMM-HMMs for malware
classification and we compare our results to those obtained using discrete
HMMs. As features, we consider opcode sequences and entropy-based sequences.
For our opcode features, GMM-HMMs produce results that are comparable to those
obtained using discrete HMMs, whereas for our entropy-based features, GMM-HMMs
generally improve significantly on the classification results that we have
achieved with discrete HMMs.
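As a concrete illustration of the continuous-emission model the paper studies, the sketch below scores a one-dimensional observation sequence (e.g. a sliding-window entropy series) under a GMM-HMM via the scaled forward algorithm. All parameter values are hypothetical illustrations, not taken from the paper.

```python
import numpy as np

def gauss(x, mean, var):
    # 1-D Gaussian density N(x; mean, var)
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def gmm_emission(x, w, mu, var):
    # Per-state mixture density: sum_k w[k] * N(x; mu[k], var[k])
    return float(np.sum(w * gauss(x, mu, var)))

def forward_loglik(obs, pi, A, w, mu, var):
    """Log-likelihood of `obs` under a GMM-HMM (forward algorithm
    with per-step scaling to avoid underflow)."""
    n_states = len(pi)
    alpha = np.array([pi[i] * gmm_emission(obs[0], w[i], mu[i], var[i])
                      for i in range(n_states)])
    c = alpha.sum()
    alpha /= c
    loglik = np.log(c)
    for x in obs[1:]:
        alpha = np.array([(alpha @ A[:, j]) * gmm_emission(x, w[j], mu[j], var[j])
                          for j in range(n_states)])
        c = alpha.sum()
        alpha /= c
        loglik += np.log(c)
    return loglik

# Two hidden states, each with a 2-component Gaussian mixture (made-up values).
pi = np.array([0.6, 0.4])
A = np.array([[0.9, 0.1],
              [0.2, 0.8]])
w   = np.array([[0.5, 0.5], [0.5, 0.5]])   # mixture weights per state
mu  = np.array([[0.2, 0.3], [0.7, 0.8]])   # component means per state
var = np.array([[0.01, 0.01], [0.01, 0.01]])

print(forward_loglik([0.22, 0.28, 0.75, 0.79], pi, A, w, mu, var))
```

In the discrete-HMM baseline the emission term would be a lookup in a row of a stochastic observation matrix; here it is a Gaussian mixture density, which is what lets the model fit continuous features such as entropy sequences directly.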
Related papers
- Synthetic Multimodal Question Generation [60.33494376081317]
Multimodal Retrieval Augmented Generation (MMRAG) is a powerful approach to question-answering over multimodal documents.
We propose SMMQG, a synthetic data generation framework that generates question and answer pairs directly from multimodal documents.
We use SMMQG to generate an MMRAG dataset of 1024 questions over Wikipedia documents and evaluate state-of-the-art models using it.
arXiv Detail & Related papers (2024-07-02T12:57:42Z)
- Learning Hidden Markov Models Using Conditional Samples [72.20944611510198]
This paper is concerned with the computational complexity of learning the Hidden Markov Model (HMM).
In this paper, we consider an interactive access model, in which the algorithm can query for samples from the conditional distributions of the HMMs.
Specifically, we obtain efficient algorithms for learning HMMs in settings where we have query access to the exact conditional probabilities.
arXiv Detail & Related papers (2023-02-28T16:53:41Z)
- Linear chain conditional random fields, hidden Markov models, and related classifiers [4.984601297028258]
Conditional Random Fields (CRFs) are an alternative to Hidden Markov Models (HMMs).
We show that basic Linear-Chain CRFs (LC-CRFs) are in fact equivalent to HMMs, in the sense that for each LC-CRF there exists an equivalent HMM.
We show that it is possible to reformulate the generative Bayesian classifiers Maximum Posterior Mode (MPM) and Maximum a Posteriori (MAP) used in HMMs, as discriminative ones.
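One direction of this equivalence is a standard identity: the HMM joint factorization is itself a product of local exponential potentials, sketched here in generic notation (not the paper's):

```latex
p(x_{1:T}, y_{1:T})
  = p(y_1)\, p(x_1 \mid y_1) \prod_{t=2}^{T} p(y_t \mid y_{t-1})\, p(x_t \mid y_t)
  = \frac{1}{Z} \exp\!\Big( \sum_{t=1}^{T} \big[ \phi_t(y_{t-1}, y_t) + \psi_t(x_t, y_t) \big] \Big),
```

with $\phi_1(y_0, y_1) := \log p(y_1)$, $\phi_t := \log p(y_t \mid y_{t-1})$, $\psi_t := \log p(x_t \mid y_t)$, and $Z = 1$. Every HMM is thus an LC-CRF; the less obvious converse for basic LC-CRFs is the paper's contribution.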
arXiv Detail & Related papers (2023-01-03T18:52:39Z)
- Fuzzy Cognitive Maps and Hidden Markov Models: Comparative Analysis of Efficiency within the Confines of the Time Series Classification Task [0.0]
We explore the application of Hidden Markov Model (HMM) for time series classification.
Four models, HMM NN (one HMM per series), HMM 1C (one HMM per class), FCM NN, and FCM 1C, are then studied in a series of experiments.
arXiv Detail & Related papers (2022-04-28T12:41:05Z)
- Learning Hidden Markov Models When the Locations of Missing Observations are Unknown [54.40592050737724]
We consider the general problem of learning an HMM from data with unknown missing observation locations.
We provide reconstruction algorithms that do not require any assumptions about the structure of the underlying chain.
We show that under proper specifications one can reconstruct the process dynamics as well as if the positions of the missing observations were known.
arXiv Detail & Related papers (2022-03-12T22:40:43Z)
- Image Modeling with Deep Convolutional Gaussian Mixture Models [79.0660895390689]
We present a new formulation of deep hierarchical Gaussian Mixture Models (GMMs), termed Deep Convolutional GMMs (DCGMMs), that is suitable for describing and generating images.
DCGMMs use a stacked architecture of multiple GMM layers, linked by convolution and pooling operations.
For generating sharp images with DCGMMs, we introduce a new gradient-based technique for sampling through non-invertible operations like convolution and pooling.
Based on the MNIST and FashionMNIST datasets, we validate the DCGMMs model by demonstrating its superiority over flat GMMs for clustering, sampling and outlier detection.
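The flat-GMM outlier-detection baseline mentioned above can be sketched in a few lines of numpy; the diagonal covariances and all parameter values below are illustrative assumptions, not the paper's setup:

```python
import numpy as np

def log_gauss_diag(X, mean, var):
    # Row-wise log-density of a diagonal-covariance Gaussian
    return -0.5 * np.sum((X - mean) ** 2 / var + np.log(2.0 * np.pi * var), axis=1)

def gmm_log_score(X, weights, means, vars_):
    # log sum_k w_k N(x; mu_k, diag(var_k)), computed with log-sum-exp for stability
    logs = np.stack([np.log(w) + log_gauss_diag(X, m, v)
                     for w, m, v in zip(weights, means, vars_)], axis=1)
    mx = logs.max(axis=1, keepdims=True)
    return (mx + np.log(np.exp(logs - mx).sum(axis=1, keepdims=True))).ravel()

# Two well-separated components; a point far from both gets a low score.
weights = np.array([0.5, 0.5])
means   = np.array([[0.0, 0.0], [5.0, 5.0]])
vars_   = np.array([[1.0, 1.0], [1.0, 1.0]])

scores = gmm_log_score(np.array([[0.1, -0.2], [20.0, 20.0]]), weights, means, vars_)
```

Thresholding the log-density `scores` flags outliers; the DCGMM replaces this single flat mixture with stacked convolutional GMM layers.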
arXiv Detail & Related papers (2021-04-19T12:08:53Z)
- Robust Classification using Hidden Markov Models and Mixtures of Normalizing Flows [25.543231171094384]
We use a generative model that combines the state transitions of a hidden Markov model (HMM) with neural-network-based probability distributions for the hidden states of the HMM.
We verify the improved robustness of NMM-HMM classifiers in an application to speech recognition.
arXiv Detail & Related papers (2021-02-15T00:40:30Z)
- Indoor Group Activity Recognition using Multi-Layered HMMs [0.0]
Recognition of Group Activities (GA) from imagery data has significant applications in surveillance systems.
We propose Ontology GAR with a proper inference model that is capable of identifying and classifying a sequence of events in group activities.
A multi-layered Hidden Markov Model (HMM) is proposed to recognize different levels of abstract observations.
arXiv Detail & Related papers (2021-01-23T22:02:12Z)
- DenseHMM: Learning Hidden Markov Models by Learning Dense Representations [0.0]
We propose a modification of Hidden Markov Models (HMMs) that allows learning dense representations of both the hidden states and the observables.
Compared to the standard HMM, transition probabilities are not atomic but composed of these representations via kernelization.
The properties of the DenseHMM like learned co-occurrences and log-likelihoods are studied empirically on synthetic and biomedical datasets.
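The kernelization idea can be sketched as composing the transition matrix from dense state vectors via a softmax over inner products; the sizes and the dot-product kernel below are assumptions for illustration, and the paper's actual kernel may differ:

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
n_states, d = 3, 4                      # hypothetical model sizes
U = rng.normal(size=(n_states, d))      # dense representation of each state
V = rng.normal(size=(n_states, d))      # dense representation of each successor state

# Transitions are not atomic parameters but are composed from the
# representations: A[i, j] = softmax_j(<u_i, v_j>), a valid stochastic matrix.
A = softmax(U @ V.T, axis=1)
```

Because gradients flow into `U` and `V` rather than into the transition probabilities themselves, the representations can be learned with standard gradient-based optimizers while `A` stays row-stochastic by construction.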
arXiv Detail & Related papers (2020-12-17T17:48:27Z)
- Scaling Hidden Markov Language Models [118.55908381553056]
This work revisits the challenge of scaling HMMs to language modeling datasets.
We propose methods for scaling HMMs to massive state spaces while maintaining efficient exact inference, a compact parameterization, and effective regularization.
arXiv Detail & Related papers (2020-11-09T18:51:55Z)
- A Rigorous Link Between Self-Organizing Maps and Gaussian Mixture Models [78.6363825307044]
This work presents a mathematical treatment of the relation between Self-Organizing Maps (SOMs) and Gaussian Mixture Models (GMMs).
We show that energy-based SOM models can be interpreted as performing gradient descent.
This link allows to treat SOMs as generative probabilistic models, giving a formal justification for using SOMs to detect outliers, or for sampling.
arXiv Detail & Related papers (2020-09-24T14:09:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.