Achieving Constraints in Neural Networks: A Stochastic Augmented
Lagrangian Approach
- URL: http://arxiv.org/abs/2310.16647v1
- Date: Wed, 25 Oct 2023 13:55:35 GMT
- Title: Achieving Constraints in Neural Networks: A Stochastic Augmented
Lagrangian Approach
- Authors: Diogo Lavado, Cláudia Soares and Alessandra Micheletti
- Abstract summary: Regularizing Deep Neural Networks (DNNs) is essential for improving generalizability and preventing overfitting.
We propose a novel approach to DNN regularization by framing the training process as a constrained optimization problem.
We employ the Stochastic Augmented Lagrangian (SAL) method to achieve a more flexible and efficient regularization mechanism.
- Score: 49.1574468325115
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Regularizing Deep Neural Networks (DNNs) is essential for improving
generalizability and preventing overfitting. Fixed penalty methods, though
common, lack adaptability and suffer from hyperparameter sensitivity. In this
paper, we propose a novel approach to DNN regularization by framing the
training process as a constrained optimization problem, where the data fidelity
term is the minimization objective and the regularization terms serve as
constraints. We then employ the Stochastic Augmented Lagrangian (SAL) method
to achieve a more flexible and efficient regularization mechanism. Our approach
extends beyond black-box regularization, demonstrating significant improvements
in white-box models, where weights are often subject to hard constraints to
ensure interpretability. Experimental results on image-based classification on
MNIST, CIFAR10, and CIFAR100 datasets validate the effectiveness of our
approach. SAL consistently achieves higher accuracy while also attaining better
constraint satisfaction, showcasing its potential for optimizing DNNs
under constrained settings.
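To make the framing concrete, the following is a minimal PyTorch-style sketch of a stochastic augmented Lagrangian training loop, assuming a single inequality constraint on the L1 norm of the weights as a stand-in for the regularization term; the paper's exact constraints, multiplier schedule, and hyperparameters may differ.

```python
# Minimal sketch of a stochastic augmented Lagrangian (AL) training loop.
# Assumptions (not taken from the paper): one inequality constraint
# g(theta) = ||theta||_1 - budget <= 0 standing in for the regularization
# term, and illustrative values for rho and the multiplier update period.
import torch
import torch.nn as nn

def l1_constraint(model: nn.Module, budget: float) -> torch.Tensor:
    """g(theta) = ||theta||_1 - budget; the constraint is g(theta) <= 0."""
    l1 = sum(p.abs().sum() for p in model.parameters())
    return l1 - budget

def train(model, loader, epochs=10, lr=1e-3, rho=1.0, budget=100.0):
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    lam = torch.tensor(0.0)  # Lagrange multiplier, kept non-negative

    for epoch in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            g = l1_constraint(model, budget)
            # Augmented Lagrangian term for an inequality constraint:
            # (rho/2) * max(0, g + lam/rho)^2 - lam^2 / (2*rho)
            aug = (rho / 2) * torch.clamp(g + lam / rho, min=0.0) ** 2 \
                  - lam ** 2 / (2 * rho)
            loss = loss_fn(model(x), y) + aug
            loss.backward()
            opt.step()

        # Dual ascent on the multiplier once per epoch (projected to >= 0).
        with torch.no_grad():
            g = l1_constraint(model, budget)
            lam = torch.clamp(lam + rho * g, min=0.0)
    return model
```

The multiplier is updated by projected dual ascent at the end of each epoch, so the effective penalty adapts to how strongly the constraint is violated rather than acting as a fixed regularization weight.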
Related papers
- A constrained optimization approach to improve robustness of neural networks [1.2338729811609355]
We present a novel nonlinear programming-based approach to fine-tune pre-trained neural networks to improve robustness against adversarial attacks while maintaining accuracy on clean data.
arXiv Detail & Related papers (2024-09-18T18:37:14Z)
- HERTA: A High-Efficiency and Rigorous Training Algorithm for Unfolded Graph Neural Networks [14.139047596566485]
HERTA is a high-efficiency and rigorous training algorithm for Unfolded GNNs.
HERTA converges to the optimum of the original model, thus preserving the interpretability of Unfolded GNNs.
As a byproduct of HERTA, we propose a new spectral sparsification method applicable to normalized and regularized graph Laplacians.
arXiv Detail & Related papers (2024-03-26T23:03:06Z)
- Enhancing Reliability of Neural Networks at the Edge: Inverted Normalization with Stochastic Affine Transformations [0.22499166814992438]
We propose a method to inherently enhance the robustness and inference accuracy of Bayesian neural networks (BayNNs) deployed in in-memory computing architectures.
Empirical results show a graceful degradation in inference accuracy, along with an improvement of up to 58.11%.
arXiv Detail & Related papers (2024-01-23T00:27:31Z)
- Towards Continual Learning Desiderata via HSIC-Bottleneck Orthogonalization and Equiangular Embedding [55.107555305760954]
We propose a conceptually simple yet effective method that attributes forgetting to layer-wise parameter overwriting and the resulting decision boundary distortion.
Our method achieves competitive accuracy even with zero exemplar buffer and only 1.02x the size of the base model.
arXiv Detail & Related papers (2024-01-17T09:01:29Z)
- Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z)
- Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have effectively been demonstrated in solving forward and inverse differential equation problems.
However, PINNs are prone to training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ the implicit stochastic gradient descent (ISGD) method to train PINNs, improving the stability of the training process.
arXiv Detail & Related papers (2023-03-03T08:17:47Z)
- Comparative Analysis of Interval Reachability for Robust Implicit and Feedforward Neural Networks [64.23331120621118]
We use interval reachability analysis to obtain robustness guarantees for implicit neural networks (INNs).
INNs are a class of implicit learning models that use implicit equations as layers.
We show that our approach performs at least as well as, and generally better than, applying state-of-the-art interval bound propagation methods to INNs.
arXiv Detail & Related papers (2022-04-01T03:31:27Z)
- Better Training using Weight-Constrained Stochastic Dynamics [0.0]
We employ constraints to control the parameter space of deep neural networks throughout training.
The use of customized, appropriately designed constraints can reduce the vanishing/exploding gradient problem.
We provide a general approach to efficiently incorporate constraints into a gradient Langevin framework (a generic sketch of such a constrained Langevin update appears after this list).
arXiv Detail & Related papers (2021-06-20T14:41:06Z)
- FISAR: Forward Invariant Safe Reinforcement Learning with a Deep Neural Network-Based Optimizer [44.65622657676026]
We take constraints as Lyapunov functions and impose new linear constraints on the policy parameters' updating dynamics.
Because the new guaranteed-feasible constraints are imposed on the updating dynamics instead of the original policy parameters, classic optimization algorithms are no longer applicable.
arXiv Detail & Related papers (2020-06-19T21:58:42Z)
- Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
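For the weight-constrained stochastic dynamics entry above, here is a generic sketch of one constrained stochastic gradient Langevin (SGLD) update that enforces a weight-norm constraint by Euclidean projection. This illustrates the general idea only, not the cited paper's algorithm; the step size, temperature, and radius are placeholder values.

```python
# Generic sketch of one constrained SGLD step: a noisy gradient step
# followed by projection onto a weight-norm ball. Illustrative only;
# the cited paper's constraint handling and discretization may differ.
import math
import torch

def constrained_sgld_step(params, grads, lr=1e-3, temperature=1e-4, radius=10.0):
    """theta <- Proj_C( theta - lr * grad + sqrt(2 * lr * T) * noise )."""
    noise_scale = math.sqrt(2.0 * lr * temperature)
    with torch.no_grad():
        for p, g in zip(params, grads):
            p.add_(-lr * g + noise_scale * torch.randn_like(p))
            # Project onto the constraint set C = { w : ||w||_2 <= radius }.
            norm = p.norm()
            if norm > radius:
                p.mul_(radius / norm)
```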
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including all content) and is not responsible for any consequences of its use.