DS-UI: Dual-Supervised Mixture of Gaussian Mixture Models for
Uncertainty Inference
- URL: http://arxiv.org/abs/2011.08595v1
- Date: Tue, 17 Nov 2020 12:35:02 GMT
- Title: DS-UI: Dual-Supervised Mixture of Gaussian Mixture Models for
Uncertainty Inference
- Authors: Jiyang Xie and Zhanyu Ma and Jing-Hao Xue and Guoqiang Zhang and Jun
Guo
- Abstract summary: We propose a dual-supervised uncertainty inference (DS-UI) framework for improving Bayesian estimation-based uncertainty inference (UI) in deep neural network (DNN)-based image recognition.
In the DS-UI, we combine the last fully-connected (FC) layer with a mixture of Gaussian mixture models (MoGMM) to obtain an MoGMM-FC layer.
Experimental results show that the DS-UI outperforms the state-of-the-art UI methods in misclassification detection.
- Score: 52.899219617256655
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper proposes a dual-supervised uncertainty inference (DS-UI) framework
for improving Bayesian estimation-based uncertainty inference (UI) in deep
neural network (DNN)-based image recognition. In the DS-UI, we combine the
classifier of a DNN, i.e., the last fully-connected (FC) layer, with a mixture
of Gaussian mixture models (MoGMM) to obtain an MoGMM-FC layer. Unlike existing
UI methods for DNNs, which calculate only the means or modes of the
distributions of the DNN outputs, the proposed MoGMM-FC layer acts as a
probabilistic interpreter for the features fed into the classifier and
directly calculates their probability density for the DS-UI. In addition, we
propose a dual-supervised stochastic gradient-based variational Bayes (DS-SGVB)
algorithm for optimizing the MoGMM-FC layer. Unlike conventional SGVB and the
optimization algorithms used in other UI methods, the DS-SGVB not only models
the samples of the specific class for each Gaussian mixture model (GMM) in the
MoGMM, but also considers negative samples from the other classes,
simultaneously reducing intra-class distances and enlarging inter-class
margins to enhance the learning ability of the MoGMM-FC layer in the DS-UI.
Experimental results show that the DS-UI outperforms the state-of-the-art UI
methods in misclassification detection. We further evaluate the DS-UI in
open-set out-of-domain/-distribution detection and find statistically
significant improvements. Visualizations of the feature spaces demonstrate the
superiority of the DS-UI.
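To make the two mechanisms concrete, here is a minimal sketch (not the authors' code): a layer holding one small diagonal-covariance GMM per class over the classifier's input features, in the spirit of the MoGMM-FC layer, plus a simplified dual-supervised objective that rewards density under the true class's GMM while suppressing it under the other classes' GMMs. All names are ours, and the point-estimate loss stands in for the full variational DS-SGVB derivation in the paper.

    import math
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MoGMMFC(nn.Module):
        """One diagonal-covariance GMM per class over the classifier's input
        features; forward() returns per-class log-densities log p(x | c)."""
        def __init__(self, feat_dim, num_classes, num_components=4):
            super().__init__()
            self.means = nn.Parameter(0.05 * torch.randn(num_classes, num_components, feat_dim))
            self.log_vars = nn.Parameter(torch.zeros(num_classes, num_components, feat_dim))
            self.logit_weights = nn.Parameter(torch.zeros(num_classes, num_components))

        def forward(self, x):                                  # x: (B, D)
            diff = x[:, None, None, :] - self.means[None]      # (B, C, K, D)
            log_var = self.log_vars[None]                      # (1, C, K, D)
            comp_ll = -0.5 * ((diff ** 2) / log_var.exp()
                              + log_var + math.log(2 * math.pi)).sum(-1)  # (B, C, K)
            log_w = F.log_softmax(self.logit_weights, dim=-1)[None]       # (1, C, K)
            return torch.logsumexp(comp_ll + log_w, dim=-1)               # (B, C)

    def dual_supervised_loss(log_dens, labels, neg_weight=0.1):
        """Positive supervision: raise density under the true class's GMM.
        Negative supervision: suppress density under the other classes' GMMs,
        shrinking intra-class distances and widening inter-class margins."""
        pos = -log_dens.gather(1, labels[:, None]).mean()
        mask = F.one_hot(labels, log_dens.size(1)).bool()
        neg = log_dens.masked_fill(mask, float('-inf')).logsumexp(dim=1).mean()
        return pos + neg_weight * neg

At test time, uncertainty can be scored directly from the densities, e.g. flagging a prediction as unreliable when max_c log p(x | c) is low; the paper's misclassification-detection and out-of-distribution experiments use the MoGMM densities in this spirit.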
Related papers
- Novel Approach to Intrusion Detection: Introducing GAN-MSCNN-BILSTM with LIME Predictions [0.0]
This paper introduces an innovative intrusion detection system that harnesses Generative Adversarial Networks (GANs), Multi-Scale Convolutional Neural Networks (MSCNNs), and Bidirectional Long Short-Term Memory (BiLSTM) networks.
The system generates realistic network traffic data, encompassing both normal and attack patterns.
Evaluation on the Hogzilla dataset, a standard benchmark, showcases an impressive accuracy of 99.16% for multi-class classification and 99.10% for binary classification.
arXiv Detail & Related papers (2024-06-08T11:26:44Z)
- Unleashing Network Potentials for Semantic Scene Completion [50.95486458217653]
This paper proposes a novel SSC framework, the Adversarial Modality Modulation Network (AMMNet).
AMMNet introduces two core modules: a cross-modal modulation enabling the interdependence of gradient flows between modalities, and a customized adversarial training scheme leveraging dynamic gradient competition.
Extensive experimental results demonstrate that AMMNet outperforms state-of-the-art SSC methods by a large margin.
arXiv Detail & Related papers (2024-03-12T11:48:49Z)
- Consistency Trajectory Models: Learning Probability Flow ODE Trajectory of Diffusion [56.38386580040991]
The Consistency Trajectory Model (CTM) is a generalization of Consistency Models (CM).
CTM enables the efficient combination of adversarial training and denoising score matching loss to enhance performance.
Unlike CM, CTM's access to the score function can streamline the adoption of established controllable/conditional generation methods.
arXiv Detail & Related papers (2023-10-01T05:07:17Z)
- G-Mix: A Generalized Mixup Learning Framework Towards Flat Minima [17.473268736086137]
We propose a new learning framework called Generalized-Mixup, which combines the strengths of Mixup and sharpness-aware minimization (SAM) for training DNN models.
We introduce two novel algorithms: Binary G-Mix and Decomposed G-Mix, which partition the training data into two subsets based on the sharpness-sensitivity of each example.
Both theoretical explanations and experimental results reveal that the proposed BG-Mix and DG-Mix algorithms further enhance model generalization across multiple datasets and models.
arXiv Detail & Related papers (2023-08-07T01:25:10Z)
- A new perspective on probabilistic image modeling [92.89846887298852]
We present the Deep Convolutional Gaussian Mixture Model (DCGMM), a new probabilistic approach to image modeling capable of density estimation, sampling and tractable inference.
DCGMMs can be trained end-to-end by SGD from random initial conditions, much like CNNs.
We show that DCGMMs compare favorably to several recent probabilistic circuit (PC) and sum-product network (SPN) models in terms of inference, classification and sampling.
arXiv Detail & Related papers (2022-03-21T14:53:57Z)
- Image Modeling with Deep Convolutional Gaussian Mixture Models [79.0660895390689]
We present a new formulation of deep hierarchical Gaussian Mixture Models (GMMs) that is suitable for describing and generating images.
DCGMMs avoid the intractably large number of mixture components that flat GMMs need for images by stacking multiple GMM layers, linked by convolution and pooling operations.
For generating sharp images with DCGMMs, we introduce a new gradient-based technique for sampling through non-invertible operations like convolution and pooling.
Based on the MNIST and FashionMNIST datasets, we validate the DCGMMs model by demonstrating its superiority over flat GMMs for clustering, sampling and outlier detection.
arXiv Detail & Related papers (2021-04-19T12:08:53Z)
- Margin-Based Regularization and Selective Sampling in Deep Neural Networks [7.219077740523683]
We derive a new margin-based regularization formulation, termed multi-margin regularization (MMR), for deep neural networks (DNNs).
We show improved empirical results on CIFAR10, CIFAR100 and ImageNet using state-of-the-art convolutional neural networks (CNNs) and BERT-BASE architecture for the MNLI, QQP, QNLI, MRPC, SST-2 and RTE benchmarks.
arXiv Detail & Related papers (2020-09-13T15:06:42Z)
- Exploring Gaussian mixture model framework for speaker adaptation of deep neural network acoustic models [3.867363075280544]
We investigate the GMM-derived (GMMD) features for adaptation of deep neural network (DNN) acoustic models.
We explore fusion of the adapted GMMD features with conventional features, such as bottleneck and MFCC features, in two different neural network architectures.
arXiv Detail & Related papers (2020-03-15T18:56:19Z)
- Semi-Supervised Learning with Normalizing Flows [54.376602201489995]
FlowGMM is an end-to-end approach to generative semi-supervised learning with normalizing flows; a minimal sketch of its objective follows this entry.
We show promising results on a wide range of applications, including AG-News and Yahoo Answers text data.
arXiv Detail & Related papers (2019-12-30T17:36:33Z)
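As a companion to the FlowGMM entry above: the published method places a class-conditional Gaussian mixture in the latent space of an invertible flow, training labeled points against their class's Gaussian and unlabeled points against the marginal mixture. Below is a minimal sketch of that objective, assuming the flow itself (any invertible network exposing log|det J|) is supplied elsewhere; the names and the unit-covariance, uniform-prior simplifications are ours.

    import math
    import torch

    def flowgmm_loss(z, log_det_j, means, labels=None):
        """z: (B, D) latents from an invertible flow; log_det_j: (B,) log|det df/dx|;
        means: (C, D) per-class Gaussian means (learnable in practice)."""
        sq = ((z[:, None, :] - means[None]) ** 2).sum(-1)           # (B, C)
        log_pz_y = -0.5 * (sq + z.size(1) * math.log(2 * math.pi))  # log N(z; mu_c, I)
        if labels is not None:   # labeled: maximize log p(x, y)
            ll = log_pz_y.gather(1, labels[:, None]).squeeze(1)
        else:                    # unlabeled: maximize the marginal log p(x)
            ll = torch.logsumexp(log_pz_y - math.log(means.size(0)), dim=1)
        return -(ll + log_det_j).mean()

Classification then falls out for free: predict argmax_c of log_pz_y, i.e., the class whose latent Gaussian best explains z.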