Can convolutional ResNets approximately preserve input distances? A
frequency analysis perspective
- URL: http://arxiv.org/abs/2106.02469v1
- Date: Fri, 4 Jun 2021 13:12:42 GMT
- Title: Can convolutional ResNets approximately preserve input distances? A
frequency analysis perspective
- Authors: Lewis Smith, Joost van Amersfoort, Haiwen Huang, Stephen Roberts,
Yarin Gal
- Abstract summary: We show that the theoretical link between the regularisation scheme used and bi-Lipschitzness is only valid under conditions which do not hold in practice.
We present a simple constructive algorithm to search for counter examples to the distance preservation condition.
- Score: 31.897568775099558
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: ResNets constrained to be bi-Lipschitz, that is, approximately distance
preserving, have been a crucial component of recently proposed techniques for
deterministic uncertainty quantification in neural models. We show that
theoretical justifications for recent regularisation schemes trying to enforce
such a constraint suffer from a crucial flaw -- the theoretical link between
the regularisation scheme used and bi-Lipschitzness is only valid under
conditions which do not hold in practice, rendering existing theory of limited
use, despite the strong empirical performance of these models. We provide a
theoretical explanation for the effectiveness of these regularisation schemes
using a frequency analysis perspective, showing that under mild conditions
these schemes will enforce a lower Lipschitz bound on the low-frequency
projection of images. We then provide empirical evidence supporting our
theoretical claims, and perform further experiments which demonstrate that our
broader conclusions appear to hold when some of the mathematical assumptions of
our proof are relaxed, corresponding to the setup used in prior work. In
addition, we present a simple constructive algorithm to search for counter
examples to the distance preservation condition, and discuss possible
implications of our theory for future model design.
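For reference, the bi-Lipschitz (approximate distance preservation) property discussed above, together with a sketch of the weaker low-frequency guarantee the abstract describes, can be written as follows. The projection operator P_low is notation introduced here for illustration only and is not taken from the paper.
```latex
% Bi-Lipschitz condition on a feature extractor f, with lower and upper
% Lipschitz constants 0 < L_1 <= L_2:
L_1 \,\lVert x_1 - x_2 \rVert \;\le\; \lVert f(x_1) - f(x_2) \rVert \;\le\; L_2 \,\lVert x_1 - x_2 \rVert .

% Sketch of the weaker property the abstract argues is actually enforced:
% the lower bound holds only for a low-frequency projection P_low of the inputs.
L_1 \,\lVert P_{\mathrm{low}}(x_1) - P_{\mathrm{low}}(x_2) \rVert \;\le\; \lVert f(x_1) - f(x_2) \rVert .
```
The constructive search for counter examples mentioned in the abstract could in principle be set up as a gradient-based optimisation that keeps the input distance large while collapsing the feature distance. The sketch below is a generic illustration of that idea in PyTorch and is not the authors' algorithm; `feature_net` in the usage note is a hypothetical pretrained feature extractor, and all hyperparameters are placeholders.
```python
import torch


def search_counterexample(f, x, steps=200, lr=1e-2, init_scale=1.0):
    """Gradient-based search for a perturbation that keeps the input
    distance large while collapsing the feature distance, i.e. an
    approximate counterexample to the lower Lipschitz bound.

    Generic sketch only: ``f`` is any differentiable feature extractor
    (e.g. a convolutional ResNet without its classification head) and
    ``x`` is a batch-shaped input tensor.
    """
    delta = (init_scale * torch.randn_like(x)).requires_grad_(True)
    optimiser = torch.optim.Adam([delta], lr=lr)
    with torch.no_grad():
        fx = f(x)  # reference features for the unperturbed input
    for _ in range(steps):
        optimiser.zero_grad()
        feat_dist = (f(x + delta) - fx).norm()
        input_dist = delta.norm()
        # Drive the ratio ||f(x+d) - f(x)|| / ||d|| towards zero: a small
        # ratio with a non-trivial ||d|| violates the lower bound
        # L_1 * ||d|| <= ||f(x+d) - f(x)||.
        loss = feat_dist / (input_dist + 1e-8)
        loss.backward()
        optimiser.step()
    with torch.no_grad():
        ratio = (f(x + delta) - fx).norm() / delta.norm()
    return delta.detach(), ratio.item()
```
Usage would look like `delta, ratio = search_counterexample(feature_net, images)`; a ratio far below the lower Lipschitz constant targeted by the regularisation scheme, for a non-trivial `delta`, would constitute a counter example to the distance preservation condition.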
Related papers
- Rethinking State Disentanglement in Causal Reinforcement Learning [78.12976579620165]
Causality provides rigorous theoretical support for ensuring that the underlying states can be uniquely recovered through identifiability.
We revisit this research line and find that incorporating RL-specific context can reduce unnecessary assumptions in previous identifiability analyses for latent states.
We propose a novel approach for general partially observable Markov Decision Processes (POMDPs) by replacing the complicated structural constraints in previous methods with two simple constraints for transition and reward preservation.
arXiv Detail & Related papers (2024-08-24T06:49:13Z)
- Empirical Tests of Optimization Assumptions in Deep Learning [41.05664717242051]
This paper develops new empirical metrics to track the key quantities that must be controlled in theoretical analysis.
All of our tested assumptions fail to reliably capture optimization performance.
This highlights a need for new empirical verification of analytical assumptions used in theoretical analysis.
arXiv Detail & Related papers (2024-07-01T21:56:54Z)
- General bounds on the quality of Bayesian coresets [13.497835690074151]
This work presents general upper and lower bounds on the Kullback-Leibler (KL) divergence of coreset approximations.
Lower bounds are applied to obtain fundamental limitations on the quality of coreset approximations.
The upper bounds are used to analyze the performance of recent subsample-optimize methods.
arXiv Detail & Related papers (2024-05-20T04:46:14Z)
- Generalization Bounds for Causal Regression: Insights, Guarantees and Sensitivity Analysis [0.66567375919026]
We propose a theory based on generalization bounds that provides such guarantees.
By introducing a novel change-of-measure inequality, we are able to tightly bound the model loss.
We demonstrate our bounds on semi-synthetic and real data, showcasing their remarkable tightness and practical utility.
arXiv Detail & Related papers (2024-05-15T17:17:27Z)
- Uncertainty Regularized Evidential Regression [5.874234972285304]
The Evidential Regression Network (ERN) represents a novel approach that integrates deep learning with Dempster-Shafer's theory.
Specific activation functions must be employed to enforce non-negative values, which is a constraint that compromises model performance.
This paper provides a theoretical analysis of this limitation and introduces an improvement to overcome it.
arXiv Detail & Related papers (2024-01-03T01:18:18Z)
- Towards Characterizing Domain Counterfactuals For Invertible Latent Causal Models [15.817239008727789]
In this work, we analyze a specific type of causal query called domain counterfactuals, which hypothesizes what a sample would have looked like if it had been generated in a different domain.
We show that recovering the latent Structural Causal Model (SCM) is unnecessary for estimating domain counterfactuals.
We also develop a theoretically grounded practical algorithm that simplifies the modeling process to generative model estimation.
arXiv Detail & Related papers (2023-06-20T04:19:06Z)
- Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks.
The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data.
Empirical results conducted on multiple datasets offer compelling support for our theoretical assertions.
arXiv Detail & Related papers (2023-06-09T08:30:51Z)
- Deep Grey-Box Modeling With Adaptive Data-Driven Models Toward Trustworthy Estimation of Theory-Driven Models [88.63781315038824]
We present a framework that enables us to analyze a regularizer's behavior empirically with a slight change in the neural net's architecture and the training objective.
arXiv Detail & Related papers (2022-10-24T10:42:26Z)
- Log-linear Guardedness and its Implications [116.87322784046926]
Methods that assume linearity when erasing human-interpretable concepts from neural representations have been found to be tractable and useful.
This work formally defines the notion of log-linear guardedness as the inability of an adversary to predict the concept directly from the representation.
We show that, in the binary case, under certain assumptions, a downstream log-linear model cannot recover the erased concept.
arXiv Detail & Related papers (2022-10-18T17:30:02Z)
- On the Minimal Adversarial Perturbation for Deep Neural Networks with Provable Estimation Error [65.51757376525798]
The existence of adversarial perturbations has opened an interesting research line on provable robustness.
No provable results have been presented to estimate and bound the error committed.
This paper proposes two lightweight strategies to find the minimal adversarial perturbation.
The obtained results show that the proposed strategies approximate the theoretical distance and robustness for samples close to the classification boundary, leading to provable guarantees against any adversarial attack.
arXiv Detail & Related papers (2022-01-04T16:40:03Z)
- Being Bayesian, Even Just a Bit, Fixes Overconfidence in ReLU Networks [65.24701908364383]
We show that a sufficient condition for calibrated uncertainty on a ReLU network is "to be a bit Bayesian".
We further validate these findings empirically via various standard experiments using common deep ReLU networks and Laplace approximations.
arXiv Detail & Related papers (2020-02-24T08:52:06Z)