Elliptic Loss Regularization
- URL: http://arxiv.org/abs/2503.02138v1
- Date: Tue, 04 Mar 2025 00:08:08 GMT
- Title: Elliptic Loss Regularization
- Authors: Ali Hasan, Haoming Yang, Yuting Ng, Vahid Tarokh,
- Abstract summary: We propose a technique for enforcing a level of smoothness in the mapping between the data input space and the loss value.<n>We specify the level of regularity by requiring that the loss of the network satisfies an elliptic operator over the data domain.
- Score: 24.24785205800212
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Regularizing neural networks is important for anticipating model behavior in regions of the data space that are not well represented. In this work, we propose a regularization technique for enforcing a level of smoothness in the mapping between the data input space and the loss value. We specify the level of regularity by requiring that the loss of the network satisfies an elliptic operator over the data domain. To do this, we modify the usual empirical risk minimization objective such that we instead minimize a new objective that satisfies an elliptic operator over points within the domain. This allows us to use existing theory on elliptic operators to anticipate the behavior of the error for points outside the training set. We propose a tractable computational method that approximates the behavior of the elliptic operator while being computationally efficient. Finally, we analyze the properties of the proposed regularization to understand the performance on common problems of distribution shift and group imbalance. Numerical experiments confirm the utility of the proposed regularization technique.
Related papers
- Dimension reduction for derivative-informed operator learning: An analysis of approximation errors [3.7051887945349518]
We study the derivative-informed learning of nonlinear operators between infinite-dimensional separable Hilbert spaces by neural networks.
We analyze the approximation errors of neural operators in Sobolev norms over infinite-dimensional Gaussian input measures.
arXiv Detail & Related papers (2025-04-11T17:56:52Z) - Partial Transportability for Domain Generalization [56.37032680901525]
Building on the theory of partial identification and transportability, this paper introduces new results for bounding the value of a functional of the target distribution.
Our contribution is to provide the first general estimation technique for transportability problems.
We propose a gradient-based optimization scheme for making scalable inferences in practice.
arXiv Detail & Related papers (2025-03-30T22:06:37Z) - DeltaPhi: Learning Physical Trajectory Residual for PDE Solving [54.13671100638092]
We propose and formulate the Physical Trajectory Residual Learning (DeltaPhi)
We learn the surrogate model for the residual operator mapping based on existing neural operator networks.
We conclude that, compared to direct learning, physical residual learning is preferred for PDE solving.
arXiv Detail & Related papers (2024-06-14T07:45:07Z) - Function-Space Regularization in Neural Networks: A Probabilistic
Perspective [51.133793272222874]
We show that we can derive a well-motivated regularization technique that allows explicitly encoding information about desired predictive functions into neural network training.
We evaluate the utility of this regularization technique empirically and demonstrate that the proposed method leads to near-perfect semantic shift detection and highly-calibrated predictive uncertainty estimates.
arXiv Detail & Related papers (2023-12-28T17:50:56Z) - Stable Nonconvex-Nonconcave Training via Linear Interpolation [51.668052890249726]
This paper presents a theoretical analysis of linearahead as a principled method for stabilizing (large-scale) neural network training.
We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape and show how linear can help by leveraging the theory of nonexpansive operators.
arXiv Detail & Related papers (2023-10-20T12:45:12Z) - Regularization, early-stopping and dreaming: a Hopfield-like setup to
address generalization and overfitting [0.0]
We look for optimal network parameters by applying a gradient descent over a regularized loss function.
Within this framework, the optimal neuron-interaction matrices correspond to Hebbian kernels revised by a reiterated unlearning protocol.
arXiv Detail & Related papers (2023-08-01T15:04:30Z) - Safety Performance of Neural Networks in the Presence of Covariate Shift [0.0]
We propose to reshape the initial test set, as used for the safety performance evaluation prior to deployment, based on an approximation of the operational data.
This approximation is obtained by observing and learning the distribution of activation patterns of neurons in the network during operation.
arXiv Detail & Related papers (2023-07-24T11:55:32Z) - On the Identification and Optimization of Nonsmooth Superposition
Operators in Semilinear Elliptic PDEs [3.045851438458641]
We study an infinite-dimensional optimization problem that aims to identify the Nemytskii operator in the nonlinear part of a prototypical semilinear elliptic partial differential equation (PDE)
In contrast to previous works, we consider this identification problem in a low-regularity regime in which the function inducing the Nemytskii operator is a-priori only known to be an element of $H leakyloc(mathbbR)$.
arXiv Detail & Related papers (2023-06-08T13:33:20Z) - Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z) - Activation Relaxation: A Local Dynamical Approximation to
Backpropagation in the Brain [62.997667081978825]
Activation Relaxation (AR) is motivated by constructing the backpropagation gradient as the equilibrium point of a dynamical system.
Our algorithm converges rapidly and robustly to the correct backpropagation gradients, requires only a single type of computational unit, and can operate on arbitrary computation graphs.
arXiv Detail & Related papers (2020-09-11T11:56:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.