Feature Space Particle Inference for Neural Network Ensembles
- URL: http://arxiv.org/abs/2206.00944v1
- Date: Thu, 2 Jun 2022 09:16:26 GMT
- Title: Feature Space Particle Inference for Neural Network Ensembles
- Authors: Shingo Yashima, Teppei Suzuki, Kohta Ishikawa, Ikuro Sato, Rei
Kawakami
- Abstract summary: Particle-based inference methods offer a promising approach from a Bayesian perspective.
We propose optimizing particles in the feature space where the activation of a specific intermediate layer lies.
Our method encourages each member to capture distinct features, which is expected to improve ensemble prediction robustness.
- Score: 13.392254060510666
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ensembles of deep neural networks demonstrate improved performance over
single models. To enhance the diversity of ensemble members while preserving their
individual performance, particle-based inference methods offer a promising approach
from a Bayesian perspective. However, the best way to apply these methods to
neural networks is still unclear: seeking samples from the weight-space posterior
is inefficient because of over-parameterization, while seeking samples directly
from the function-space posterior often results in serious underfitting. In this
study, we address these difficulties by optimizing particles in the feature
space, where the activations of a specific intermediate layer lie. Our method
encourages each member to capture distinct features, which is expected to improve
the robustness of ensemble predictions. Extensive evaluation on real-world
datasets shows that our model significantly outperforms the gold-standard Deep
Ensembles on various metrics, including accuracy, calibration, and robustness.
Code is available at
https://github.com/DensoITLab/featurePI .
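To make the idea concrete, below is a minimal, hypothetical sketch of Stein variational gradient descent (SVGD) applied to feature-space particles: each ensemble member's intermediate activations on a shared batch act as one particle, a kernel-smoothed likelihood gradient drives the fit, and the kernel-repulsion term pushes members toward distinct features. The toy MLP trunks and heads, the RBF median heuristic, and the surrogate-loss trick are illustrative assumptions, not the authors' released implementation (see the repository above for that).

```python
# Hypothetical sketch: SVGD over feature-space particles for an ensemble.
import torch
import torch.nn as nn
import torch.nn.functional as F

def svgd_direction(Z, score):
    """SVGD update direction phi_i = (1/n) sum_j [k(z_j, z_i) * score_j
    + grad_{z_j} k(z_j, z_i)], with an RBF kernel and the median heuristic.
    Z, score: (n_members, d) -- one flattened feature particle per member."""
    n = Z.shape[0]
    sq = torch.cdist(Z, Z) ** 2                                # (n, n)
    h = sq.median() / torch.log(torch.tensor(n + 1.0)) + 1e-8  # bandwidth
    K = torch.exp(-sq / h)                                     # kernel matrix
    # sum_j grad_{z_j} k(z_j, z_i) = -(2/h) * ((K @ Z)_i - z_i * sum_j K_ij)
    grad_K = (-2.0 / h) * (K @ Z - K.sum(1, keepdim=True) * Z)
    return (K @ score + grad_K) / n                            # (n, d)

n_members, batch, d_in, d_feat, n_cls = 5, 128, 32, 64, 10
members = [nn.Sequential(nn.Linear(d_in, d_feat), nn.ReLU())
           for _ in range(n_members)]                          # feature trunks
heads = [nn.Linear(d_feat, n_cls) for _ in range(n_members)]   # classifiers
params = [p for m in members + heads for p in m.parameters()]
opt = torch.optim.SGD(params, lr=1e-2)

x = torch.randn(batch, d_in)                 # stand-in for a real batch
y = torch.randint(0, n_cls, (batch,))

# Each member's activations on the shared batch form one feature particle.
feats = [m(x) for m in members]                                # (B, d_feat) each
Z = torch.stack([f.flatten() for f in feats])                  # (n, B*d_feat)

# Score = gradient of the log-likelihood w.r.t. the features (a real
# feature-space posterior would also include a prior/regularization term).
nll = torch.stack([F.cross_entropy(h(f), y) for f, h in zip(feats, heads)])
grads = torch.autograd.grad(nll.sum(), feats, retain_graph=True)
score = -torch.stack([g.flatten() for g in grads])             # (n, B*d_feat)

phi = svgd_direction(Z.detach(), score)      # fully detached update direction
# Surrogate loss: gradient descent on -<phi, Z> moves each member's features
# along its SVGD direction; the heads get a plain likelihood term on detached
# features so the trunk update comes only from phi.
surrogate = -(phi * Z).sum()
head_nll = torch.stack([F.cross_entropy(h(f.detach()), y)
                        for f, h in zip(feats, heads)]).mean()
opt.zero_grad()
(surrogate + head_nll).backward()
opt.step()
```

In a full training loop this step repeats per batch, and at test time the members' predictions are averaged. The repulsive grad_K term is what distinguishes this from independently trained Deep Ensembles: it explicitly penalizes members whose intermediate features collapse onto one another.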
Related papers
- Function Space Bayesian Pseudocoreset for Bayesian Neural Networks [16.952160718249292]
A Bayesian pseudocoreset is a compact synthetic dataset summarizing essential information of a large-scale dataset.
In this paper, we propose a novel Bayesian pseudocoreset construction method that operates on a function space.
By working directly on the function space, our method could bypass several challenges that may arise when working in weight space.
arXiv Detail & Related papers (2023-10-27T02:04:31Z) - Implicit Variational Inference for High-Dimensional Posteriors [7.924706533725115]
In variational inference, the benefits of Bayesian models rely on accurately capturing the true posterior distribution.
We propose using neural samplers that specify implicit distributions, which are well-suited for approximating complex multimodal and correlated posteriors.
Our approach introduces novel bounds for approximate inference using implicit distributions by locally linearising the neural sampler.
arXiv Detail & Related papers (2023-10-10T14:06:56Z) - Actively learning a Bayesian matrix fusion model with deep side
information [1.421397337947365]
High-dimensional deep neural network representations of images and concepts can be aligned to predict human annotations of diverse stimuli.
We propose an active learning approach for adaptively sampling experimental stimuli.
We observe a significant efficiency gain over a passive baseline.
arXiv Detail & Related papers (2023-06-08T16:31:47Z) - Black-box Coreset Variational Inference [13.892427580424444]
We present a black-box variational inference framework for coresets to enable principled application of variational coresets to intractable models.
We apply our techniques to supervised learning problems, and compare them with existing approaches in the literature for data summarization and inference.
arXiv Detail & Related papers (2022-11-04T11:12:09Z) - Layer Ensembles [95.42181254494287]
We introduce a method for uncertainty estimation that considers a set of independent categorical distributions for each layer of the network.
We show that the method can be further improved by ranking samples, resulting in models that require less memory and time to run.
arXiv Detail & Related papers (2022-10-10T17:52:47Z) - Optimization-Based Separations for Neural Networks [57.875347246373956]
We show that gradient descent can efficiently learn ball indicator functions using a depth-2 neural network with two layers of sigmoidal activations.
This is the first optimization-based separation result where the approximation benefits of the stronger architecture provably manifest in practice.
arXiv Detail & Related papers (2021-12-04T18:07:47Z) - Sampling-free Variational Inference for Neural Networks with
Multiplicative Activation Noise [51.080620762639434]
We propose a more efficient parameterization of the posterior approximation for sampling-free variational inference.
Our approach yields competitive results for standard regression problems and scales well to large-scale image classification tasks.
arXiv Detail & Related papers (2021-03-15T16:16:18Z) - Deep Magnification-Flexible Upsampling over 3D Point Clouds [103.09504572409449]
We propose a novel end-to-end learning-based framework to generate dense point clouds.
We first formulate the problem explicitly, which boils down to determining the weights and high-order approximation errors.
Then, we design a lightweight neural network to adaptively learn unified and sorted weights as well as the high-order refinements.
arXiv Detail & Related papers (2020-11-25T14:00:18Z) - Ensembles of Spiking Neural Networks [0.3007949058551534]
This paper demonstrates how to construct ensembles of spiking neural networks producing state-of-the-art results.
We achieve classification accuracies of 98.71%, 100.0%, and 99.09% on the MNIST, NMNIST, and DVS Gesture datasets, respectively.
We formalize spiking neural networks as GLM predictors, identifying a suitable representation for their target domain.
arXiv Detail & Related papers (2020-10-15T17:45:18Z) - Multi-scale Interactive Network for Salient Object Detection [91.43066633305662]
We propose the aggregate interaction modules to integrate the features from adjacent levels.
To obtain more efficient multi-scale features, the self-interaction modules are embedded in each decoder unit.
Experimental results on five benchmark datasets demonstrate that the proposed method without any post-processing performs favorably against 23 state-of-the-art approaches.
arXiv Detail & Related papers (2020-07-17T15:41:37Z) - Beyond Dropout: Feature Map Distortion to Regularize Deep Neural
Networks [107.77595511218429]
In this paper, we investigate the empirical Rademacher complexity related to intermediate layers of deep neural networks.
We propose a feature distortion method (Disout) for addressing the aforementioned problem.
The superiority of the proposed feature map distortion for producing deep neural networks with higher test performance is analyzed and demonstrated.
arXiv Detail & Related papers (2020-02-23T13:59:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.