A Hybrid of Generative and Discriminative Models Based on the
Gaussian-coupled Softmax Layer
- URL: http://arxiv.org/abs/2305.05912v1
- Date: Wed, 10 May 2023 05:48:22 GMT
- Title: A Hybrid of Generative and Discriminative Models Based on the
Gaussian-coupled Softmax Layer
- Authors: Hideaki Hayashi
- Abstract summary: We propose a method to train a hybrid of discriminative and generative models in a single neural network.
We demonstrate that the proposed hybrid model can be applied to semi-supervised learning and confidence calibration.
- Score: 5.33024001730262
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Generative models have advantageous characteristics for classification tasks
such as the availability of unsupervised data and calibrated confidence,
whereas discriminative models have advantages in terms of the simplicity of
their model structures and learning algorithms and their ability to outperform
their generative counterparts. In this paper, we propose a method to train a
hybrid of discriminative and generative models in a single neural network (NN),
which exhibits the characteristics of both models. The key idea is the
Gaussian-coupled softmax layer, which is a fully connected layer with a softmax
activation function coupled with Gaussian distributions. This layer can be
embedded into an NN-based classifier and allows the classifier to estimate both
the class posterior distribution and the class-conditional data distribution.
We demonstrate that the proposed hybrid model can be applied to semi-supervised
learning and confidence calibration.
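As a rough sketch of how such a layer could be wired, the snippet below couples per-class Gaussians over the penultimate features with a softmax posterior obtained via Bayes' rule. The diagonal shared covariance, the parameter names, and the semi-supervised hook are illustrative assumptions, not the paper's exact formulation.

```python
import math

import torch
import torch.nn as nn

class GaussianCoupledSoftmax(nn.Module):
    """Hypothetical sketch of a Gaussian-coupled softmax layer: class-conditional
    Gaussians over the penultimate features induce softmax class posteriors."""

    def __init__(self, feat_dim: int, num_classes: int):
        super().__init__()
        self.mu = nn.Parameter(torch.randn(num_classes, feat_dim))   # class means
        self.log_var = nn.Parameter(torch.zeros(feat_dim))           # shared diagonal covariance (assumed)
        self.prior_logits = nn.Parameter(torch.zeros(num_classes))   # class priors p(y)

    def log_joint(self, z):
        # log p(z, y=k) = log N(z | mu_k, diag(var)) + log p(y=k), for every class k
        var = self.log_var.exp()
        diff = z.unsqueeze(1) - self.mu.unsqueeze(0)                 # (B, K, D)
        log_gauss = -0.5 * (((diff ** 2) / var).sum(-1)
                            + self.log_var.sum()
                            + z.size(-1) * math.log(2 * math.pi))
        return log_gauss + torch.log_softmax(self.prior_logits, 0)

    def forward(self, z):
        # Bayes' rule: p(y|z) = softmax_k(log p(z, y=k)); with a shared covariance
        # these logits are affine in z, just like a standard fully connected layer.
        return self.log_joint(z)                                     # use with cross-entropy

    def marginal_log_likelihood(self, z):
        # log p(z) = logsumexp_k log p(z, y=k): an unsupervised objective for
        # unlabeled features, the natural hook for semi-supervised training.
        return torch.logsumexp(self.log_joint(z), dim=1)
```

Under these assumptions, a hybrid objective could combine cross-entropy on labeled batches with the negative marginal log-likelihood on unlabeled ones, mirroring the semi-supervised use case the abstract mentions.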
Related papers
- Supervised Score-Based Modeling by Gradient Boosting [49.556736252628745]
We propose a Supervised Score-based Model (SSM), which can be viewed as a gradient boosting algorithm combined with score matching.
We provide a theoretical analysis of learning and sampling for SSM to balance inference time and prediction accuracy.
Our model outperforms existing models in both accuracy and inference time.
arXiv Detail & Related papers (2024-11-02T07:06:53Z)
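The entry above leans on score matching; as a minimal, generic illustration of that ingredient (denoising score matching in the style of Vincent, 2011, not the authors' SSM), with the network interface assumed:

```python
import torch

def denoising_score_matching_loss(score_net, x, sigma=0.1):
    # Perturb x with Gaussian noise and regress the score of the perturbation
    # kernel: for N(x_tilde | x, sigma^2 I) the target is -(x_tilde - x) / sigma^2.
    noise = torch.randn_like(x)
    x_tilde = x + sigma * noise
    target = -noise / sigma
    return 0.5 * ((score_net(x_tilde) - target) ** 2).sum(dim=-1).mean()
```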
- Exploring Beyond Logits: Hierarchical Dynamic Labeling Based on Embeddings for Semi-Supervised Classification [49.09505771145326]
We propose a Hierarchical Dynamic Labeling (HDL) algorithm that does not depend on model predictions and instead uses image embeddings to generate sample labels.
Our approach has the potential to change the paradigm of pseudo-label generation in semi-supervised learning.
arXiv Detail & Related papers (2024-04-26T06:00:27Z)
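As a toy stand-in for labeling from embeddings rather than model predictions (the actual HDL algorithm is hierarchical and dynamic; the nearest-centroid rule below is only an assumed simplification):

```python
import torch

def centroid_pseudo_labels(unlabeled_emb, labeled_emb, labeled_y, num_classes):
    # Pseudo-label each unlabeled sample with the class whose embedding
    # centroid is nearest, never consulting the classifier's logits.
    centroids = torch.stack([labeled_emb[labeled_y == k].mean(0)
                             for k in range(num_classes)])      # (K, D)
    dists = torch.cdist(unlabeled_emb, centroids)               # (N, K)
    return dists.argmin(dim=1)                                  # (N,) pseudo-labels
```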
- ClusterDDPM: An EM clustering framework with Denoising Diffusion Probabilistic Models [9.91610928326645]
Denoising diffusion probabilistic models (DDPMs) represent a new and promising class of generative models.
In this study, we introduce an innovative expectation-maximization (EM) framework for clustering using DDPMs.
In the M-step, we learn clustering-friendly latent representations of the data by employing the conditional DDPM and matching the distribution of latent representations to the mixture of Gaussian priors.
arXiv Detail & Related papers (2023-12-13T10:04:06Z)
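As a sketch of the EM view in this entry, the E-step below computes responsibilities of latent codes under a diagonal Gaussian mixture prior; the M-step, which also trains the conditional DDPM, is omitted and the shapes are assumptions:

```python
import math

import torch

def gmm_e_step(z, means, log_vars, log_weights):
    # Responsibilities p(cluster k | z) under a diagonal Gaussian mixture.
    diff = z.unsqueeze(1) - means.unsqueeze(0)                  # (N, K, D)
    log_comp = -0.5 * (((diff ** 2) / log_vars.exp()).sum(-1)
                       + log_vars.sum(-1)
                       + z.size(-1) * math.log(2 * math.pi))    # log N(z | mu_k, var_k)
    return torch.softmax(log_weights + log_comp, dim=1)         # (N, K)
```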
- Generative Marginalization Models [21.971818180264943]
Marginalization models (MAMs) are a new family of generative models for high-dimensional discrete data.
They offer scalable and flexible generative modeling by explicitly modeling all induced marginal distributions.
For energy-based training tasks, MAMs enable any-order generative modeling of high-dimensional problems beyond the scale of previous methods.
arXiv Detail & Related papers (2023-10-19T17:14:29Z)
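A tiny numerical illustration of the any-order marginal consistency such models enforce, with an exact tabular joint standing in for the neural networks:

```python
import numpy as np

rng = np.random.default_rng(0)
p = rng.random((2, 2, 2))
p /= p.sum()                        # a valid joint over binary (x1, x2, x3)
m_a = p.sum(axis=2).sum(axis=1)     # sum out x3, then x2 -> p(x1)
m_b = p.sum(axis=1).sum(axis=1)     # sum out x2, then x3 -> p(x1)
assert np.allclose(m_a, m_b)        # any marginalization order must agree
```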
- Variational Classification [51.2541371924591]
Treating inputs to the softmax layer as samples of a latent variable, our abstracted perspective reveals a potential inconsistency in the standard softmax classifier.
We derive a variational objective to train the model, analogous to the evidence lower bound (ELBO) used to train variational auto-encoders.
We induce a chosen latent distribution, instead of the implicit assumption found in a standard softmax layer.
arXiv Detail & Related papers (2023-05-17T17:47:19Z)
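A minimal sketch of that latent-variable reading: cross-entropy plus a penalty pulling the softmax input toward a chosen class-conditional prior. The unit-variance Gaussian prior and the weight beta are assumptions for illustration, not the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def variational_classification_loss(z, y, class_means, beta=0.1):
    # z: softmax-layer inputs treated as latent samples, shape (N, D)
    logits = z @ class_means.t()                    # linear head on the latent
    ce = F.cross_entropy(logits, y)                 # discriminative term
    # -log N(z | mu_y, I) up to a constant: induces the chosen latent prior
    prior_nll = 0.5 * ((z - class_means[y]) ** 2).sum(-1).mean()
    return ce + beta * prior_nll
```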
- A new perspective on probabilistic image modeling [92.89846887298852]
We present a new probabilistic approach for image modeling capable of density estimation, sampling, and tractable inference.
Deep Convolutional Gaussian Mixture Models (DCGMMs) can be trained end-to-end by SGD from random initial conditions, much like CNNs.
We show that DCGMMs compare favorably to several recent probabilistic circuit (PC) and sum-product network (SPN) models in terms of inference, classification, and sampling.
arXiv Detail & Related papers (2022-03-21T14:53:57Z)
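As a toy version of an SGD-trained mixture in this spirit, the snippet below fits a diagonal Gaussian mixture by plain gradient descent on a logsumexp likelihood, with no E and M steps; the shapes and hyperparameters are assumptions:

```python
import math

import torch

def mixture_nll(x, means, log_vars, logit_w):
    # Negative log-likelihood under a diagonal Gaussian mixture, written with
    # logsumexp so every parameter is reachable by ordinary backpropagation.
    diff = x.unsqueeze(1) - means.unsqueeze(0)                  # (N, K, D)
    log_comp = -0.5 * (((diff ** 2) / log_vars.exp()).sum(-1)
                       + log_vars.sum(-1)
                       + x.size(-1) * math.log(2 * math.pi))
    log_w = torch.log_softmax(logit_w, dim=0)                   # mixture weights
    return -torch.logsumexp(log_w + log_comp, dim=1).mean()

x = torch.randn(256, 8)                                         # stand-in features
means = torch.randn(5, 8, requires_grad=True)
log_vars = torch.zeros(5, 8, requires_grad=True)
logit_w = torch.zeros(5, requires_grad=True)
opt = torch.optim.SGD([means, log_vars, logit_w], lr=0.05)
for _ in range(100):                                            # end-to-end SGD, as in the entry
    opt.zero_grad()
    mixture_nll(x, means, log_vars, logit_w).backward()
    opt.step()
```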
- Normalizing Flow based Hidden Markov Models for Classification of Speech Phones with Explainability [25.543231171094384]
In pursuit of explainability, we develop generative models for sequential data.
We combine modern neural networks (normalizing flows) and traditional generative models (hidden Markov models, HMMs).
The proposed generative models can compute the likelihood of the data and are hence directly suited to the maximum-likelihood (ML) classification approach.
arXiv Detail & Related papers (2021-07-01T20:10:55Z)
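The ML classification rule mentioned above reduces to comparing per-class likelihoods; a sketch with an assumed .log_prob interface for the per-class generative models:

```python
import torch

def ml_classify(x, class_models):
    # Pick the class whose generative model (e.g. a flow-based HMM) assigns
    # the data the highest log-likelihood; `log_prob` is an assumed interface.
    log_liks = torch.stack([m.log_prob(x) for m in class_models], dim=1)  # (N, K)
    return log_liks.argmax(dim=1)
```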
- Provable Model-based Nonlinear Bandit and Reinforcement Learning: Shelve Optimism, Embrace Virtual Curvature [61.22680308681648]
We show that global convergence is statistically intractable even for a one-layer neural network bandit with a deterministic reward.
For both nonlinear bandits and RL, the paper presents a model-based algorithm, Virtual Ascent with Online Model Learner (ViOL).
arXiv Detail & Related papers (2021-02-08T12:41:56Z)
- Generative Max-Mahalanobis Classifiers for Image Classification, Generation and More [6.89001867562902]
We show that our Generative Max-Mahalanobis Classifier (GMMC), built on the Max-Mahalanobis classifier (MMC), can be trained discriminatively, generatively, or jointly for image classification and generation.
arXiv Detail & Related papers (2021-01-01T00:42:04Z)
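A minimal sketch of Max-Mahalanobis-style logits with an identity covariance assumed: the same class means define the discriminative scores and, read generatively, per-class Gaussians:

```python
import torch

def mmc_logits(features, class_means):
    # Logit for class k = negative half squared distance to the k-th mean;
    # softmax over these equals the posterior under unit-variance Gaussians.
    return -0.5 * torch.cdist(features, class_means) ** 2      # (N, K)
```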
- Identification of Probability weighted ARX models with arbitrary domains [75.91002178647165]
PieceWise Affine models guarantee universal approximation, local linearity, and equivalence to other classes of hybrid systems.
In this work, we focus on the identification of PieceWise AutoRegressive models with eXogenous inputs and arbitrary regions (NPWARX).
The architecture is conceived following the Mixture of Experts concept developed within the machine learning field.
arXiv Detail & Related papers (2020-09-29T12:50:33Z)
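A toy probability-weighted ARX prediction in the Mixture of Experts style described above; the softmax gate, the model orders, and all names are illustrative assumptions:

```python
import numpy as np

def moe_arx_predict(phi, coeffs, gate_w):
    # phi: ARX regressor of past outputs and inputs; coeffs: (K, len(phi))
    # linear experts; gate_w: (K, len(phi)) gating scores -> softmax weights.
    experts = coeffs @ phi                       # (K,) per-expert predictions
    g = gate_w @ phi
    w = np.exp(g - g.max())
    w /= w.sum()                                 # probability weights over experts
    return float(w @ experts)

phi = np.array([0.4, -0.1, 1.0])                 # e.g. [y(t-1), y(t-2), u(t-1)]
coeffs = np.array([[0.9, -0.2, 0.5],
                   [0.2, 0.1, 1.1]])             # two linear ARX experts
gate_w = np.array([[1.0, 0.0, 0.0],
                   [-1.0, 0.0, 0.0]])
print(moe_arx_predict(phi, coeffs, gate_w))
```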
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.