Normalizing Flow based Hidden Markov Models for Classification of Speech
Phones with Explainability
- URL: http://arxiv.org/abs/2107.00730v1
- Date: Thu, 1 Jul 2021 20:10:55 GMT
- Title: Normalizing Flow based Hidden Markov Models for Classification of Speech
Phones with Explainability
- Authors: Anubhab Ghosh, Antoine Honoré, Dong Liu, Gustav Eje Henter, Saikat
Chatterjee
- Abstract summary: In pursuit of explainability, we develop generative models for sequential data.
We combine modern neural networks (normalizing flows) and traditional generative models (hidden Markov models - HMMs).
The proposed generative models can compute the likelihood of the data and are hence directly suitable for a maximum-likelihood (ML) classification approach.
- Score: 25.543231171094384
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In pursuit of explainability, we develop generative models for sequential
data. The proposed models provide state-of-the-art classification results and
robust performance for speech phone classification. We combine modern neural
networks (normalizing flows) and traditional generative models (hidden Markov
models - HMMs). Normalizing flow-based mixture models (NMMs) are used to model
the conditional probability distribution given the hidden state in the HMMs.
Model parameters are learned through judicious combinations of time-tested
Bayesian learning methods and contemporary neural network learning methods. We
mainly combine expectation-maximization (EM) and mini-batch gradient descent.
The proposed generative models can compute the likelihood of the data and are
hence directly suitable for a maximum-likelihood (ML) classification approach.
Due to the structural flexibility of HMMs, we can use different normalizing flow models.
This leads to different types of HMMs providing diversity in data modeling
capacity. The diversity provides an opportunity for easy decision fusion from
different models. For a standard speech phone classification setup involving 39
phones (classes) and the TIMIT dataset, we show that the use of standard
features called mel-frequency cepstral coefficients (MFCCs), the proposed
generative models, and the decision fusion together can achieve $86.6\%$
accuracy by generative training only. This result is close to state-of-the-art
results, for example, $86.2\%$ accuracy of the PyTorch-Kaldi toolkit [1], and
$85.1\%$ accuracy using light gated recurrent units [2]. We do not use any
discriminative learning approach or related sophisticated features in this
article.
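To make the modeling pipeline above concrete, here is a minimal sketch (not the authors' implementation): each HMM state emits observations through a simple normalizing flow, the forward algorithm yields the sequence log-likelihood, ML classification picks the phone class whose model scores highest, and decision fusion averages per-class log-likelihoods over several model variants. The names AffineFlow, FlowHMM, classify and classify_fused are illustrative only; the paper's emission densities are mixtures of flows (NMMs) trained with EM combined with mini-batch gradient descent, which this sketch omits.

```python
import numpy as np


def logsumexp(a, axis=None):
    """Numerically stable log-sum-exp."""
    m = np.max(a, axis=axis, keepdims=True)
    out = m + np.log(np.sum(np.exp(a - m), axis=axis, keepdims=True))
    return float(out) if axis is None else np.squeeze(out, axis=axis)


class AffineFlow:
    """Toy elementwise affine flow: x = mu + exp(log_sigma) * z, z ~ N(0, I).
    Stands in for the richer flow mixtures (NMMs) used in the paper."""

    def __init__(self, dim, rng):
        self.mu = rng.normal(size=dim)
        self.log_sigma = rng.normal(scale=0.1, size=dim)

    def log_prob(self, x):
        # Change-of-variables formula for an invertible affine map.
        z = (x - self.mu) * np.exp(-self.log_sigma)
        log_base = -0.5 * np.sum(z ** 2 + np.log(2.0 * np.pi), axis=-1)
        return log_base - np.sum(self.log_sigma)


class FlowHMM:
    """HMM whose per-state emission density is a normalizing flow."""

    def __init__(self, n_states, dim, seed=0):
        rng = np.random.default_rng(seed)
        self.log_pi = np.full(n_states, -np.log(n_states))  # uniform start
        A = rng.random((n_states, n_states))
        self.log_A = np.log(A / A.sum(axis=1, keepdims=True))
        self.flows = [AffineFlow(dim, rng) for _ in range(n_states)]

    def log_likelihood(self, X):
        """Forward algorithm in log space; X has shape (T, dim)."""
        log_B = np.stack([f.log_prob(X) for f in self.flows], axis=1)  # (T, S)
        alpha = self.log_pi + log_B[0]
        for t in range(1, len(X)):
            alpha = log_B[t] + logsumexp(alpha[:, None] + self.log_A, axis=0)
        return logsumexp(alpha)


def classify(models_per_class, X):
    """ML classification: pick the class whose generative model scores highest."""
    scores = {c: m.log_likelihood(X) for c, m in models_per_class.items()}
    return max(scores, key=scores.get)


def classify_fused(model_banks, X):
    """Decision fusion: average per-class log-likelihoods over diverse HMM variants."""
    classes = list(model_banks[0].keys())
    fused = {c: np.mean([bank[c].log_likelihood(X) for bank in model_banks])
             for c in classes}
    return max(fused, key=fused.get)


if __name__ == "__main__":
    X = np.random.default_rng(1).normal(size=(50, 13))  # e.g. 13-dim MFCC frames
    bank_a = {c: FlowHMM(n_states=3, dim=13, seed=c) for c in range(5)}
    bank_b = {c: FlowHMM(n_states=3, dim=13, seed=100 + c) for c in range(5)}
    print(classify(bank_a, X), classify_fused([bank_a, bank_b], X))
```

In the paper's setup there is one generative model per phone class (39 classes on TIMIT), and it is the diversity across different flow-based HMM variants that makes this simple likelihood-averaging fusion step effective.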
Related papers
- EMR-Merging: Tuning-Free High-Performance Model Merging [55.03509900949149]
We show that Elect, Mask & Rescale-Merging (EMR-Merging) achieves outstanding performance compared to existing merging methods.
EMR-Merging is tuning-free, thus requiring no data availability or any additional training while showing impressive performance.
arXiv Detail & Related papers (2024-05-23T05:25:45Z) - A Hybrid of Generative and Discriminative Models Based on the
Gaussian-coupled Softmax Layer [5.33024001730262]
We propose a method to train a hybrid of discriminative and generative models in a single neural network.
We demonstrate that the proposed hybrid model can be applied to semi-supervised learning and confidence calibration.
arXiv Detail & Related papers (2023-05-10T05:48:22Z) - Learning from aggregated data with a maximum entropy model [73.63512438583375]
We show how a new model, similar to a logistic regression, may be learned from aggregated data only by approximating the unobserved feature distribution with a maximum entropy hypothesis.
We present empirical evidence on several public datasets that the model learned this way can achieve performances comparable to those of a logistic model trained with the full unaggregated data.
arXiv Detail & Related papers (2022-10-05T09:17:27Z) - Training Structured Mechanical Models by Minimizing Discrete
Euler-Lagrange Residual [36.52097893036073]
Structured Mechanical Models (SMMs) are a data-efficient black-box parameterization of mechanical systems.
We propose a methodology for fitting SMMs to data by minimizing the discrete Euler-Lagrange residual.
Experiments show that our methodology learns models that are more accurate than those obtained with conventional schemes for fitting SMMs.
arXiv Detail & Related papers (2021-05-05T00:44:01Z) - Continual Learning with Fully Probabilistic Models [70.3497683558609]
We present an approach for continual learning based on fully probabilistic (or generative) models of machine learning.
We propose a pseudo-rehearsal approach using a Gaussian Mixture Model (GMM) instance for both generator and classifier functionalities.
We show that GMR achieves state-of-the-art performance on common class-incremental learning problems at very competitive time and memory complexity.
arXiv Detail & Related papers (2021-04-19T12:26:26Z) - Robust Classification using Hidden Markov Models and Mixtures of
Normalizing Flows [25.543231171094384]
We use a generative model that combines the state transitions of a hidden Markov model (HMM) and the neural network based probability distributions for the hidden states of the HMM.
We verify the improved robustness of NMM-HMM classifiers in an application to speech recognition.
arXiv Detail & Related papers (2021-02-15T00:40:30Z) - Scaling Hidden Markov Language Models [118.55908381553056]
This work revisits the challenge of scaling HMMs to language modeling datasets.
We propose methods for scaling HMMs to massive state spaces while maintaining efficient exact inference, a compact parameterization, and effective regularization.
arXiv Detail & Related papers (2020-11-09T18:51:55Z) - Generative Temporal Difference Learning for Infinite-Horizon Prediction [101.59882753763888]
We introduce the $\gamma$-model, a predictive model of environment dynamics with an infinite probabilistic horizon.
We discuss how its training reflects an inescapable tradeoff between training-time and testing-time compounding errors.
arXiv Detail & Related papers (2020-10-27T17:54:12Z) - Variational Mixture of Normalizing Flows [0.0]
Deep generative models, such as generative adversarial networks (GANs), variational autoencoders (VAEs), and their variants, have seen wide adoption for the task of modelling complex data distributions.
Normalizing flows overcome the limitation of intractable exact likelihoods by leveraging the change-of-variables formula for probability density functions.
The present work overcomes this by using normalizing flows as components in a mixture model and devising an end-to-end training procedure for such a model.
arXiv Detail & Related papers (2020-09-01T17:20:08Z) - Semi-nonparametric Latent Class Choice Model with a Flexible Class
Membership Component: A Mixture Model Approach [6.509758931804479]
The proposed model formulates the latent classes using mixture models as an alternative approach to the traditional random utility specification.
Results show that mixture models improve the overall performance of latent class choice models.
arXiv Detail & Related papers (2020-07-06T13:19:26Z) - Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)