Uncertainty Regularized Evidential Regression
- URL: http://arxiv.org/abs/2401.01484v1
- Date: Wed, 3 Jan 2024 01:18:18 GMT
- Title: Uncertainty Regularized Evidential Regression
- Authors: Kai Ye, Tiejin Chen, Hua Wei, Liang Zhan
- Abstract summary: The Evidential Regression Network (ERN) represents a novel approach that integrates deep learning with Dempster-Shafer's theory.
Specific activation functions must be employed to enforce non-negative values, which is a constraint that compromises model performance.
This paper provides a theoretical analysis of this limitation and introduces an improvement to overcome it.
- Score: 5.874234972285304
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Evidential Regression Network (ERN) represents a novel approach that
integrates deep learning with Dempster-Shafer's theory to predict a target and
quantify the associated uncertainty. Guided by the underlying theory, specific
activation functions must be employed to enforce non-negative values, which is
a constraint that compromises model performance by limiting its ability to
learn from all samples. This paper provides a theoretical analysis of this
limitation and introduces an improvement to overcome it. Initially, we define
the region where the models can't effectively learn from the samples. Following
this, we thoroughly analyze the ERN and investigate this constraint. Leveraging
the insights from our analysis, we address the limitation by introducing a
novel regularization term that empowers the ERN to learn from the whole
training set. Our extensive experiments substantiate our theoretical findings
and demonstrate the effectiveness of the proposed solution.
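The constraint described in the abstract can be illustrated with a small numerical sketch. It is not taken from the paper: it assumes the Normal-Inverse-Gamma negative log-likelihood commonly used for evidential regression, and the helper names (`nig_nll`, `loss_from_raw`, `finite_diff_grad`) and all numeric values are hypothetical. It does not implement the paper's regularization term; it only shows why a non-negativity activation such as ReLU can leave a sample with no gradient signal, whereas softplus leaves a small but non-zero one.

```python
import math

def nig_nll(y, gamma, nu, alpha, beta):
    """Negative log-likelihood of y under a Normal-Inverse-Gamma evidential
    model with parameters gamma, nu > 0, alpha >= 1, beta > 0 (assumed loss)."""
    omega = 2.0 * beta * (1.0 + nu)
    return (0.5 * math.log(math.pi / nu)
            - alpha * math.log(omega)
            + (alpha + 0.5) * math.log(nu * (y - gamma) ** 2 + omega)
            + math.lgamma(alpha) - math.lgamma(alpha + 0.5))

def relu(x):
    return max(x, 0.0)

def softplus(x):
    return math.log1p(math.exp(x))

def loss_from_raw(raw_alpha, activation, y=2.0, gamma=0.0, nu=1.0, beta=1.0):
    """Map a raw network output to alpha = activation(raw) + 1, then score y.
    The other parameters are held fixed at illustrative values."""
    alpha = activation(raw_alpha) + 1.0
    return nig_nll(y, gamma, nu, alpha, beta)

def finite_diff_grad(f, x, h=1e-5):
    """Central finite-difference estimate of df/dx."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

# A sample whose raw alpha output is negative, i.e. it carries no evidence yet.
raw = -3.0
grad_relu = finite_diff_grad(lambda o: loss_from_raw(o, relu), raw)
grad_softplus = finite_diff_grad(lambda o: loss_from_raw(o, softplus), raw)

print("gradient w.r.t. raw alpha, ReLU:    ", grad_relu)
print("gradient w.r.t. raw alpha, softplus:", grad_softplus)
```

With ReLU the loss is flat around any negative raw output, so the estimated gradient is exactly zero and the sample cannot update that output. With softplus the gradient is scaled by a sigmoid factor that shrinks as the raw output becomes more negative, which is the kind of region the abstract refers to as one where the model cannot learn effectively.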
Related papers
- Rethinking State Disentanglement in Causal Reinforcement Learning [78.12976579620165]
Causality provides rigorous theoretical support for ensuring that the underlying states can be uniquely recovered through identifiability.
We revisit this research line and find that incorporating RL-specific context can reduce unnecessary assumptions in previous identifiability analyses for latent states.
We propose a novel approach for general partially observable Markov Decision Processes (POMDPs) by replacing the complicated structural constraints in previous methods with two simple constraints for transition and reward preservation.
arXiv Detail & Related papers (2024-08-24T06:49:13Z)
- Empirical Tests of Optimization Assumptions in Deep Learning [41.05664717242051]
This paper develops new empirical metrics to track the key quantities that must be controlled in theoretical analysis.
All of our tested assumptions fail to reliably capture optimization performance.
This highlights a need for new empirical verification of analytical assumptions used in theoretical analysis.
arXiv Detail & Related papers (2024-07-01T21:56:54Z)
- Generalization Bounds for Causal Regression: Insights, Guarantees and Sensitivity Analysis [0.66567375919026]
We propose a theory based on generalization bounds that provides such guarantees.
By introducing a novel change-of-measure inequality, we are able to tightly bound the model loss.
We demonstrate our bounds on semi-synthetic and real data, showcasing their remarkable tightness and practical utility.
arXiv Detail & Related papers (2024-05-15T17:17:27Z)
- Source-Free Unsupervised Domain Adaptation with Hypothesis Consolidation of Prediction Rationale [53.152460508207184]
Source-Free Unsupervised Domain Adaptation (SFUDA) is a challenging task where a model needs to be adapted to a new domain without access to target domain labels or source domain data.
This paper proposes a novel approach that considers multiple prediction hypotheses for each sample and investigates the rationale behind each hypothesis.
To achieve the optimal performance, we propose a three-step adaptation process: model pre-adaptation, hypothesis consolidation, and semi-supervised learning.
arXiv Detail & Related papers (2024-02-02T05:53:22Z)
- Understanding, Predicting and Better Resolving Q-Value Divergence in Offline-RL [86.0987896274354]
We first identify a fundamental pattern, self-excitation, as the primary cause of Q-value estimation divergence in offline RL.
We then propose a novel Self-Excite Eigenvalue Measure (SEEM) metric to measure the evolving property of Q-network at training.
For the first time, our theory can reliably decide whether the training will diverge at an early stage.
arXiv Detail & Related papers (2023-10-06T17:57:44Z)
- Learn to Accumulate Evidence from All Training Samples: Theory and Practice [7.257751371276488]
Evidential deep learning offers a principled and computationally efficient way to turn a deterministic neural network uncertainty-aware.
Existing evidential activation functions create zero-evidence regions, which prevent the model from learning from training samples that fall into such regions.
A deeper analysis of evidential activation functions based on our theoretical underpinning inspires the design of a novel regularizer.
arXiv Detail & Related papers (2023-06-19T18:27:12Z)
- Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks.
The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data.
Empirical results on multiple datasets offer compelling support for our theoretical assertions.
arXiv Detail & Related papers (2023-06-09T08:30:51Z)
- Task-Free Continual Learning via Online Discrepancy Distance Learning [11.540150938141034]
This paper develops a new theoretical analysis framework which provides generalization bounds based on the discrepancy distance between the visited samples and the entire information made available for training the model.
Inspired by this theoretical model, we propose a new approach enabled by the dynamic component expansion mechanism for a mixture model, namely the Online Discrepancy Distance Learning (ODDL).
arXiv Detail & Related papers (2022-10-12T20:44:09Z)
- Toward Certified Robustness Against Real-World Distribution Shifts [65.66374339500025]
We train a generative model to learn perturbations from data and define specifications with respect to the output of the learned model.
A unique challenge arising from this setting is that existing verifiers cannot tightly approximate sigmoid activations.
We propose a general meta-algorithm for handling sigmoid activations which leverages classical notions of counter-example-guided abstraction refinement.
arXiv Detail & Related papers (2022-06-08T04:09:13Z)
- Excess risk analysis for epistemic uncertainty with application to variational inference [110.4676591819618]
We present a novel EU analysis in the frequentist setting, where data is generated from an unknown distribution.
We show a relation between the generalization ability and the widely used EU measurements, such as the variance and entropy of the predictive distribution.
We propose new variational inference that directly controls the prediction and EU evaluation performances based on the PAC-Bayesian theory.
arXiv Detail & Related papers (2022-06-02T12:12:24Z)
- Can convolutional ResNets approximately preserve input distances? A frequency analysis perspective [31.897568775099558]
We show that the theoretical link between the regularisation scheme used and bi-Lipschitzness is only valid under conditions which do not hold in practice.
We present a simple constructive algorithm to search for counter examples to the distance preservation condition.
arXiv Detail & Related papers (2021-06-04T13:12:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.