Outlier-robust neural network training: variation regularization meets trimmed loss to prevent functional breakdown
- URL: http://arxiv.org/abs/2308.02293v4
- Date: Tue, 13 May 2025 00:35:11 GMT
- Title: Outlier-robust neural network training: variation regularization meets trimmed loss to prevent functional breakdown
- Authors: Akifumi Okuno, Shotaro Yagishita
- Abstract summary: We tackle the challenge of outlier-robust predictive modeling using highly expressive neural networks. Our approach integrates two key components: (1) a transformed trimmed loss (TTL), and (2) higher-order variation regularization (HOVR), which imposes smoothness constraints on the prediction function.
- Score: 2.5628953713168685
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this study, we tackle the challenge of outlier-robust predictive modeling using highly expressive neural networks. Our approach integrates two key components: (1) a transformed trimmed loss (TTL), a computationally efficient variant of the classical trimmed loss, and (2) higher-order variation regularization (HOVR), which imposes smoothness constraints on the prediction function. While traditional robust statistics typically assume low-complexity models such as linear and kernel models, applying TTL alone to modern neural networks may fail to ensure robustness, as their high expressive power allows them to fit both inliers and outliers, even when a robust loss is used. To address this, we revisit the traditional notion of breakdown point and adapt it to the nonlinear function setting, introducing a regularization scheme via HOVR that controls the model's capacity and suppresses overfitting to outliers. We theoretically establish that our training procedure retains a high functional breakdown point, thereby ensuring robustness to outlier contamination. We develop a stochastic optimization algorithm tailored to this framework and provide a theoretical guarantee of its convergence.
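To make the interplay of the two components concrete, here is a minimal PyTorch sketch in the spirit of TTL + HOVR: a trimmed loss that averages only the h smallest per-sample residuals, plus a second-order finite-difference proxy standing in for HOVR. The proxy, the hyperparameters, and all function names are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

def trimmed_loss(per_sample, h):
    # Keep only the h smallest per-sample losses; the largest residuals
    # (suspected outliers) are dropped from the objective entirely.
    kept, _ = torch.topk(per_sample, h, largest=False)
    return kept.mean()

def variation_penalty(model, x, eps=1e-2):
    # Second-order finite-difference proxy for higher-order variation:
    # penalize the curvature of the prediction along a random direction.
    d = torch.randn_like(x)
    d = eps * d / d.norm(dim=-1, keepdim=True).clamp_min(1e-12)
    second_diff = (model(x + d) - 2.0 * model(x) + model(x - d)) / eps**2
    return second_diff.pow(2).mean()

model = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.linspace(-1, 1, 128).unsqueeze(1)
y = torch.sin(3 * x) + 0.05 * torch.randn_like(x)
y[::16] += 5.0                            # inject gross outliers

h, lam = int(0.9 * len(x)), 1e-3          # keep 90% of samples; penalty weight
for step in range(2000):
    per_sample = (model(x) - y).pow(2).squeeze(1)
    loss = trimmed_loss(per_sample, h) + lam * variation_penalty(model, x)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

With the trimmed loss alone, an expressive network could still interpolate the injected outliers; the variation penalty is what limits its capacity to do so.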
Related papers
- Deep Equilibrium models for Poisson Imaging Inverse problems via Mirror Descent [7.248102801711294]
Deep Equilibrium Models (DEQs) are implicit neural networks with fixed points. We introduce a novel DEQ formulation based on Mirror Descent defined in terms of a tailored non-Euclidean geometry. We propose computational strategies that enable both efficient training and fully parameter-free inference.
arXiv Detail & Related papers (2025-07-15T16:33:01Z) - Robust and Computation-Aware Gaussian Processes [18.264598332579748]
We introduce the Robust Computation-aware Gaussian Process (RCaGP), a novel GP model that combines a principled treatment of approximation-induced uncertainty with robust generalized Bayesian updating. Our model ensures more conservative and reliable uncertainty estimates, a property we rigorously demonstrate. Empirical results confirm that solving these challenges jointly leads to superior performance across both clean and outlier-contaminated settings.
arXiv Detail & Related papers (2025-05-27T12:49:14Z) - Imitation Learning of MPC with Neural Networks: Error Guarantees and Sparsification [5.260346080244568]
We present a framework for bounding the approximation error in imitation model predictive controllers utilizing neural networks.
We discuss how this method can be used to design a stable neural network controller with performance guarantees.
arXiv Detail & Related papers (2025-01-07T10:18:37Z) - Outlier-Robust Training of Machine Learning Models [21.352210662488112]
We propose an Adaptive Alternation Algorithm for training machine learning models with outliers.
The algorithm iteratively trains the model using a weighted version of the non-robust loss, updating the weights at each iteration.
Considering arbitrary outliers (i.e., with no distributional assumption on the outliers), we show that the use of robust loss kernels $\sigma$ increases the region of convergence.
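As a hedged illustration of such an alternation, the sketch below uses a Geman-McClure-style kernel as the robust loss kernel $\sigma$; the kernel choice, its scale, and the update schedule are assumptions, not the paper's algorithm.

```python
import torch
import torch.nn as nn

def gm_weights(residuals, c=1.0):
    # Geman-McClure-style weights: near 1 for small residuals,
    # decaying toward 0 for outliers, which mutes their gradients.
    return 1.0 / (1.0 + (residuals / c).pow(2)).pow(2)

model = nn.Linear(3, 1)
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
x, y = torch.randn(256, 3), torch.randn(256, 1)

for step in range(500):
    resid = (model(x) - y).squeeze(1)
    with torch.no_grad():                 # step 1: update weights, model fixed
        w = gm_weights(resid)
    loss = (w * resid.pow(2)).mean()      # step 2: weighted non-robust loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```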
arXiv Detail & Related papers (2024-12-31T04:19:53Z) - Regularization for Adversarial Robust Learning [18.46110328123008]
We develop a novel approach to adversarial training that integrates $\phi$-divergence regularization into the distributionally robust risk function.
This regularization brings a notable improvement in computation compared with the original formulation.
We validate our proposed method in supervised learning, reinforcement learning, and contextual learning and showcase its state-of-the-art performance against various adversarial attacks.
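For intuition, when the $\phi$-divergence is the KL divergence, the regularized worst-case risk has a well-known closed form: a log-sum-exp soft-max of the per-sample losses. A minimal sketch of this special case (not the paper's general formulation):

```python
import math
import torch

def kl_regularized_risk(per_sample_losses, lam=1.0):
    # KL-regularized distributionally robust risk:
    #   sup_Q  E_Q[loss] - lam * KL(Q || P)  =  lam * log E_P[exp(loss / lam)]
    # computed stably with logsumexp over the n observed samples.
    n = per_sample_losses.numel()
    return lam * (torch.logsumexp(per_sample_losses / lam, dim=0) - math.log(n))
```

The temperature lam interpolates between the average loss (large lam) and the worst-case loss (small lam), which is one way to see why the regularization eases computation.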
arXiv Detail & Related papers (2024-08-19T03:15:41Z) - Enhancing Robustness of Vision-Language Models through Orthogonality Learning and Self-Regularization [77.62516752323207]
We introduce an orthogonal fine-tuning method for efficiently fine-tuning pretrained weights and enabling enhanced robustness and generalization.
A self-regularization strategy is further exploited to maintain stability in terms of the zero-shot generalization of VLMs; the resulting method is dubbed OrthSR.
For the first time, we revisit CLIP and CoOp with our method to effectively improve the model in the few-shot image classification scenario.
arXiv Detail & Related papers (2024-07-11T10:35:53Z) - HERTA: A High-Efficiency and Rigorous Training Algorithm for Unfolded Graph Neural Networks [14.139047596566485]
HERTA is a high-efficiency and rigorous training algorithm for Unfolded GNNs.
HERTA converges to the optimum of the original model, thus preserving the interpretability of Unfolded GNNs.
As a byproduct of HERTA, we propose a new spectral sparsification method applicable to normalized and regularized graph Laplacians.
arXiv Detail & Related papers (2024-03-26T23:03:06Z) - Towards Continual Learning Desiderata via HSIC-Bottleneck Orthogonalization and Equiangular Embedding [55.107555305760954]
We propose a conceptually simple yet effective method that attributes forgetting to layer-wise parameter overwriting and the resulting decision boundary distortion.
Our method achieves competitive accuracy while using no exemplar buffer and only 1.02x the size of the base model.
arXiv Detail & Related papers (2024-01-17T09:01:29Z) - Achieving Constraints in Neural Networks: A Stochastic Augmented Lagrangian Approach [49.1574468325115]
Regularizing Deep Neural Networks (DNNs) is essential for improving generalizability and preventing overfitting.
We propose a novel approach to DNN regularization by framing the training process as a constrained optimization problem.
We employ the Stochastic Augmented Lagrangian (SAL) method to achieve a more flexible and efficient regularization mechanism.
arXiv Detail & Related papers (2023-10-25T13:55:35Z) - Efficient and Flexible Neural Network Training through Layer-wise Feedback Propagation [49.44309457870649]
We present Layer-wise Feedback Propagation (LFP), a novel training principle for neural network-like predictors.
LFP decomposes a reward to individual neurons based on their respective contributions to solving a given task.
Our method then implements a greedy approach reinforcing helpful parts of the network and weakening harmful ones.
arXiv Detail & Related papers (2023-08-23T10:48:28Z) - TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization [89.54947228958494]
This paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks.
We propose a novel statistics-based approach, the Two-WIng NormliSation (TWINS) fine-tuning framework.
TWINS is shown to be effective on a wide range of image classification datasets in terms of both generalization and robustness.
arXiv Detail & Related papers (2023-03-20T14:12:55Z) - Can we achieve robustness from data alone? [0.7366405857677227]
Adversarial training and its variants have come to be the prevailing methods to achieve adversarially robust classification using neural networks.
We devise a meta-learning method for robust classification that optimizes the dataset prior to its deployment in a principled way.
Experiments on MNIST and CIFAR-10 demonstrate that the datasets we produce enjoy very high robustness against PGD attacks.
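For reference, the PGD attack used in such evaluations is standard: repeated signed-gradient ascent steps projected back into an $\ell_\infty$ ball. A common sketch (radius, step size, and iteration count are typical defaults, not the paper's settings):

```python
import torch

def pgd_linf(model, x, y, eps=8/255, alpha=2/255, steps=10):
    # Projected gradient descent in the l-infinity ball of radius eps:
    # ascend the loss with signed-gradient steps, then clip back to the ball.
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = torch.nn.functional.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x + (x_adv - x).clamp(-eps, eps)   # project onto the eps-ball
        x_adv = x_adv.clamp(0.0, 1.0)              # keep pixels valid
    return x_adv
```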
arXiv Detail & Related papers (2022-07-24T12:14:48Z) - Comparative Analysis of Interval Reachability for Robust Implicit and Feedforward Neural Networks [64.23331120621118]
We use interval reachability analysis to obtain robustness guarantees for implicit neural networks (INNs).
INNs are a class of implicit learning models that use implicit equations as layers.
We show that our approach performs at least as well as, and generally better than, applying state-of-the-art interval bound propagation methods to INNs.
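As background, the interval bound propagation baselines mentioned here push elementwise bounds through each layer by splitting the weights by sign. A minimal sketch for a single affine-plus-ReLU feedforward layer (the fixed-point analysis needed for INNs is more involved):

```python
import torch

def ibp_affine_relu(lo, hi, W, b):
    # Interval bound propagation through x -> relu(W @ x + b): positive
    # weights map lower bounds to lower bounds; negative weights swap them.
    W_pos, W_neg = W.clamp(min=0), W.clamp(max=0)
    new_lo = lo @ W_pos.T + hi @ W_neg.T + b
    new_hi = hi @ W_pos.T + lo @ W_neg.T + b
    return new_lo.relu(), new_hi.relu()  # ReLU is monotone, so bounds are preserved
```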
arXiv Detail & Related papers (2022-04-01T03:31:27Z) - Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z) - Nonasymptotic theory for two-layer neural networks: Beyond the bias-variance trade-off [10.182922771556742]
We present a nonasymptotic generalization theory for two-layer neural networks with ReLU activation function.
We show that overparametrized random feature models suffer from the curse of dimensionality and thus are suboptimal.
arXiv Detail & Related papers (2021-06-09T03:52:18Z) - Bridging the Gap Between Adversarial Robustness and Optimization Bias [28.56135898767349]
Adversarial robustness is an open challenge in deep learning, most often tackled using adversarial training.
We show that it is possible to achieve both perfect standard accuracy and a certain degree of robustness without a trade-off.
In particular, we characterize the robustness of linear convolutional models, showing that they resist attacks subject to a constraint on the Fourier-$\ell_\infty$ norm.
arXiv Detail & Related papers (2021-02-17T16:58:04Z) - LQF: Linear Quadratic Fine-Tuning [114.3840147070712]
We present the first method for linearizing a pre-trained model that achieves comparable performance to non-linear fine-tuning.
LQF consists of simple modifications to the architecture, loss function and optimization typically used for classification.
arXiv Detail & Related papers (2020-12-21T06:40:20Z) - Trust but Verify: Assigning Prediction Credibility by Counterfactual Constrained Learning [123.3472310767721]
Prediction credibility measures are fundamental in statistics and machine learning.
These measures should account for the wide variety of models used in practice.
The framework developed in this work expresses the credibility as a risk-fit trade-off.
arXiv Detail & Related papers (2020-11-24T19:52:38Z) - A Dynamical View on Optimization Algorithms of Overparameterized Neural Networks [23.038631072178735]
We consider a broad class of optimization algorithms that are commonly used in practice.
As a consequence, we can leverage the convergence behavior of neural networks.
We believe our approach can also be extended to other optimization algorithms and network architectures.
arXiv Detail & Related papers (2020-10-25T17:10:22Z) - Improve Generalization and Robustness of Neural Networks via Weight Scale Shifting Invariant Regularizations [52.493315075385325]
We show that a family of regularizers, including weight decay, is ineffective at penalizing the intrinsic norms of weights for networks with homogeneous activation functions.
We propose an improved regularizer that is invariant to weight scale shifting and thus effectively constrains the intrinsic norm of a neural network.
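The failure mode is easy to check numerically: for positively homogeneous activations such as ReLU, rescaling consecutive layers by c and 1/c leaves the network's function unchanged, yet weight decay changes; a product-of-norms penalty, shown below as one simple scale-shift-invariant alternative (not necessarily the paper's exact regularizer), does not.

```python
import torch

w1, w2 = torch.randn(64, 3), torch.randn(1, 64)
c = 10.0  # (c * w1, w2 / c) computes the same ReLU network as (w1, w2)

decay = lambda a, b: a.pow(2).sum() + b.pow(2).sum()  # standard weight decay
prod  = lambda a, b: a.norm() * b.norm()              # product of layer norms

print(decay(w1, w2), decay(c * w1, w2 / c))  # differs: not invariant
print(prod(w1, w2),  prod(c * w1, w2 / c))   # equal: scale-shift invariant
```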
arXiv Detail & Related papers (2020-08-07T02:55:28Z) - Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks [50.42141893913188]
We study distributed stochastic AUC maximization at scale with a deep neural network as the predictive model.
In theory, our method requires far fewer communication rounds.
Experiments on several datasets demonstrate its effectiveness and corroborate the theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)