Regularization via f-Divergence: An Application to Multi-Oxide Spectroscopic Analysis
- URL: http://arxiv.org/abs/2502.03755v1
- Date: Thu, 06 Feb 2025 03:37:35 GMT
- Title: Regularization via f-Divergence: An Application to Multi-Oxide Spectroscopic Analysis
- Authors: Weizhi Li, Natalie Klein, Brendan Gifford, Elizabeth Sklute, Carey Legett, Samuel Clegg,
- Abstract summary: We propose a novel regularization method based on f-divergence to constrain the distributional discrepancy between predictions and noisy targets.
We conduct experiments using spectra collected in a Mars-like environment by the remote-sensing instruments aboard the Curiosity and Perseverance rovers.
- Abstract: In this paper, we address the task of characterizing the chemical composition of planetary surfaces using convolutional neural networks (CNNs). Specifically, we seek to predict the multi-oxide weights of rock samples from spectroscopic data collected under Martian conditions. We frame this problem as a multi-target regression task and propose a novel regularization method based on the $f$-divergence, designed to constrain the distributional discrepancy between predictions and noisy targets. The regularizer serves a dual purpose: it mitigates overfitting by bounding the distributional difference between predictions and noisy targets, and it acts as an auxiliary loss that penalizes the network when the divergence between the predicted and target distributions grows too large. To enable backpropagation during training, we develop a differentiable $f$-divergence and incorporate it into the regularizer, making network training feasible. We conduct experiments using spectra collected in a Mars-like environment by the remote-sensing instruments aboard the Curiosity and Perseverance rovers. Experimental results on multi-oxide weight prediction demonstrate that the proposed $f$-divergence regularization performs better than, or comparably to, standard regularization methods including $L_1$, $L_2$, and dropout. Notably, combining the $f$-divergence regularization with these standard regularizers further enhances performance, outperforming each method used independently.
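The core idea of the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the network outputs and targets are normalized into distributions, an $f$-divergence (here KL, i.e. $f(t) = t \log t$) between them is computed, and it is added to a base regression loss with a weight `lam`. All function names, the choice of $f$, and the normalization scheme are assumptions for illustration.

```python
import math

def normalize(values, eps=1e-12):
    """Turn nonnegative multi-oxide weights into a probability distribution."""
    total = sum(values) + eps * len(values)
    return [(v + eps) / total for v in values]

def f_divergence(p, q, f=lambda t: t * math.log(t)):
    """Generic f-divergence D_f(P||Q) = sum_i q_i * f(p_i / q_i).

    The default f(t) = t*log(t) yields the KL divergence; any convex f
    with f(1) = 0 defines a valid f-divergence.
    """
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))

def regularized_loss(pred, target, lam=0.1):
    """MSE plus an f-divergence penalty on the normalized distributions."""
    mse = sum((a - b) ** 2 for a, b in zip(pred, target)) / len(pred)
    reg = f_divergence(normalize(pred), normalize(target))
    return mse + lam * reg
```

Since each term is a smooth function of the predictions, the penalty is differentiable and can be backpropagated through, which is the property the paper needs for training.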
Related papers
- Structured Radial Basis Function Network: Modelling Diversity for Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important when forecasting nonstationary processes or processes with a complex mixture of distributions.
A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems.
It is proved that this structured model can efficiently interpolate this tessellation and approximate the multiple hypotheses target distribution.
arXiv Detail & Related papers (2023-09-02T01:27:53Z)
- Deep Anti-Regularized Ensembles provide reliable out-of-distribution uncertainty quantification [4.750521042508541]
Deep ensembles often return overconfident estimates outside the training domain.
We show that an ensemble of networks with large weights fitting the training data are likely to meet these two objectives.
We derive a theoretical framework for this approach and show that the proposed optimization can be seen as a "water-filling" problem.
arXiv Detail & Related papers (2023-04-08T15:25:12Z) - Reliable amortized variational inference with physics-based latent
distribution correction [0.4588028371034407]
A neural network is trained to approximate the posterior distribution over existing pairs of model and data.
The accuracy of this approach relies on the availability of high-fidelity training data.
We show that our correction step improves the robustness of amortized variational inference with respect to changes in number of source experiments, noise variance, and shifts in the prior distribution.
arXiv Detail & Related papers (2022-07-24T02:38:54Z) - A heteroencoder architecture for prediction of failure locations in
porous metals using variational inference [1.2722697496405462]
We employ an encoder-decoder convolutional neural network to predict the failure locations of porous metal tension specimens.
The objective of predicting failure locations presents an extreme case of class imbalance, since most of the material in the specimens does not fail.
We demonstrate that the resulting predicted variances are effective in ranking the locations that are most likely to fail in any given specimen.
arXiv Detail & Related papers (2022-01-31T20:26:53Z) - Regularization by Misclassification in ReLU Neural Networks [3.288086999241324]
We study the implicit bias of ReLU neural networks trained by a variant of SGD where at each step, the label is changed with probability $p$ to a random label.
We show that label noise propels the network to a sparse solution in the following sense: for a typical input, a small fraction of neurons are active, and the firing pattern of the hidden layers is sparser.
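The SGD variant described in this summary can be sketched as a label-corruption step applied before each update. This is a hedged illustration of the mechanism only (the function name and interface are assumptions, not the paper's code):

```python
import random

def noisy_label(label, num_classes, p, rng):
    """With probability p, replace the true label with a uniformly random
    class before the SGD step; otherwise keep it unchanged."""
    if rng.random() < p:
        return rng.randrange(num_classes)
    return label
```

In training, each minibatch's labels would be passed through `noisy_label` before computing the loss, so the network occasionally fits a wrong target, which the paper argues biases it toward sparser activations.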
arXiv Detail & Related papers (2021-11-03T11:42:38Z) - Decentralized Local Stochastic Extra-Gradient for Variational
Inequalities [125.62877849447729]
We consider distributed variational inequalities (VIs) on domains with the problem data that is heterogeneous (non-IID) and distributed across many devices.
We make a very general assumption on the computational network that covers the settings of fully decentralized calculations.
We theoretically analyze its convergence rate in the strongly-monotone, monotone, and non-monotone settings.
arXiv Detail & Related papers (2021-06-15T17:45:51Z) - Sampling-free Variational Inference for Neural Networks with
Multiplicative Activation Noise [51.080620762639434]
We propose a more efficient parameterization of the posterior approximation for sampling-free variational inference.
Our approach yields competitive results for standard regression problems and scales well to large-scale image classification tasks.
arXiv Detail & Related papers (2021-03-15T16:16:18Z) - Shaping Deep Feature Space towards Gaussian Mixture for Visual
Classification [74.48695037007306]
We propose a Gaussian mixture (GM) loss function for deep neural networks for visual classification.
With a classification margin and a likelihood regularization, the GM loss facilitates both high classification performance and accurate modeling of the feature distribution.
The proposed model can be implemented easily and efficiently without using extra trainable parameters.
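A minimal sketch of a GM-style loss in the spirit of this summary (identity covariances and all names are simplifying assumptions; the paper's exact formulation may differ): class logits are negative squared distances to per-class means, giving a cross-entropy classification term, plus a likelihood term pulling each feature toward its class mean.

```python
import math

def gm_loss(feature, means, label, lam=0.1):
    """Gaussian-mixture-style loss sketch.

    Logits are negative half squared distances to per-class means
    (identity covariance assumed); a likelihood term regularizes the
    feature toward its own class mean.
    """
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    d = [sqdist(feature, m) for m in means]
    logits = [-0.5 * di for di in d]
    # log-sum-exp for a numerically stable cross-entropy term
    mx = max(logits)
    log_z = mx + math.log(sum(math.exp(l - mx) for l in logits))
    ce = -(logits[label] - log_z)       # classification term
    likelihood = 0.5 * d[label]         # -log N(feature; mean_label, I) up to a constant
    return ce + lam * likelihood
```

The likelihood term is what makes the learned features cluster around their class means, which is the "accurate modeling of the feature distribution" the summary mentions.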
arXiv Detail & Related papers (2020-11-18T03:32:27Z) - Learning Calibrated Uncertainties for Domain Shift: A Distributionally
Robust Learning Approach [150.8920602230832]
We propose a framework for learning calibrated uncertainties under domain shifts.
In particular, the density ratio estimation reflects the closeness of a target (test) sample to the source (training) distribution.
We show that our proposed method generates calibrated uncertainties that benefit downstream tasks.
arXiv Detail & Related papers (2020-10-08T02:10:54Z) - Investigating maximum likelihood based training of infinite mixtures for
uncertainty quantification [16.30200782698554]
We investigate the effect of training an infinite mixture distribution with the maximum likelihood method instead of variational inference.
We find that the proposed objective leads to adversarial networks with an increased predictive variance.
arXiv Detail & Related papers (2020-08-07T14:55:53Z) - Regularizing Class-wise Predictions via Self-knowledge Distillation [80.76254453115766]
We propose a new regularization method that penalizes the predictive distribution between similar samples.
This results in regularizing the dark knowledge (i.e., the knowledge on wrong predictions) of a single network.
Our experimental results on various image classification tasks demonstrate that the simple yet powerful method can significantly improve the generalization ability.
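The penalty this summary describes can be sketched as a KL divergence between the softened predictive distributions of two similar samples. This is an illustrative sketch only; the temperature value and function names are assumptions:

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-softened softmax (max subtracted for stability)."""
    mx = max(logits)
    exps = [math.exp((l - mx) / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classwise_distillation_penalty(logits_a, logits_b, temperature=4.0):
    """KL(p_a || p_b) between softened predictions of two similar samples.

    Minimizing this matches their 'dark knowledge' -- the relative
    probabilities assigned to the wrong classes.
    """
    pa = softmax(logits_a, temperature)
    pb = softmax(logits_b, temperature)
    return sum(a * math.log(a / b) for a, b in zip(pa, pb))
```

Added to the usual cross-entropy loss with a small weight, the penalty regularizes a single network against its own predictions on similar inputs, with no teacher network required.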
arXiv Detail & Related papers (2020-03-31T06:03:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.