L-Lipschitz Gershgorin ResNet Network
- URL: http://arxiv.org/abs/2502.21279v1
- Date: Fri, 28 Feb 2025 17:57:57 GMT
- Title: L-Lipschitz Gershgorin ResNet Network
- Authors: Marius F. R. Juston, William R. Norris, Dustin Nottage, Ahmet Soylemezoglu,
- Abstract summary: This paper uses a rigorous approach to design $\mathcal{L}$-Lipschitz deep residual networks. The ResNet architecture was reformulated as a pseudo-tri-diagonal LMI with off-diagonal elements. To address the lack of explicit eigenvalue computations for such matrix structures, the Gershgorin circle theorem was employed.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep residual networks (ResNets) have demonstrated outstanding success in computer vision tasks, attributed to their ability to maintain gradient flow through deep architectures. Simultaneously, controlling the Lipschitz bound in neural networks has emerged as an essential area of research for enhancing adversarial robustness and network certifiability. This paper uses a rigorous approach to design $\mathcal{L}$-Lipschitz deep residual networks using a Linear Matrix Inequality (LMI) framework. The ResNet architecture was reformulated as a pseudo-tri-diagonal LMI with off-diagonal elements, and closed-form constraints on network parameters were derived to ensure $\mathcal{L}$-Lipschitz continuity. To address the lack of explicit eigenvalue computations for such matrix structures, the Gershgorin circle theorem was employed to approximate eigenvalue locations, guaranteeing the LMI's negative semi-definiteness. Our contributions include a provable parameterization methodology for constructing Lipschitz-constrained networks and a compositional framework for managing recursive systems within hierarchical architectures. These findings enable robust network designs applicable to adversarial robustness, certified training, and control systems. However, a limitation was identified in the Gershgorin-based approximations, which over-constrain the system, suppressing non-linear dynamics and diminishing the network's expressive capacity.
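As a rough, illustrative sketch of the Gershgorin step described in the abstract (not the paper's actual parameterization), the Python snippet below certifies negative semi-definiteness of a symmetric LMI-style matrix by checking that every Gershgorin disc lies in the closed left half-plane. The block matrix `M`, the weight `W`, and the Lipschitz target `L` are placeholder assumptions introduced only for this example.

```python
import numpy as np

def gershgorin_nsd(M, tol=1e-9):
    """Sufficient (not necessary) Gershgorin test for M being negative semi-definite.

    Every eigenvalue of M lies in a disc centered at M[i, i] with radius
    sum_{j != i} |M[i, j]|; if each disc sits in the closed left half-plane,
    all eigenvalues are non-positive, so the LMI M <= 0 holds.
    """
    M = np.asarray(M, dtype=float)
    centers = np.diag(M)
    radii = np.abs(M).sum(axis=1) - np.abs(centers)
    return bool(np.all(centers + radii <= tol))

# Toy LMI-style block matrix built from a small residual-layer weight W and a
# hypothetical Lipschitz target L; both are illustrative placeholders, not the
# closed-form parameterization derived in the paper.
rng = np.random.default_rng(0)
L = 1.0
W = 0.05 * rng.standard_normal((4, 4))
M = np.block([[W + W.T - np.eye(4), W.T],
              [W, -L * np.eye(4)]])

print("Gershgorin certificate:", gershgorin_nsd(M))
print("Exact eigenvalue check:", bool(np.all(np.linalg.eigvalsh(M) <= 1e-9)))
```

Because the Gershgorin bound is only a sufficient condition, such a certificate can reject matrices that are in fact negative semi-definite, which mirrors the over-constraining limitation noted at the end of the abstract.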
Related papers
- LDLT $\mathcal{L}$-Lipschitz Network: Generalized Deep End-To-End Lipschitz Network Construction [3.744861320984297]
Controlling the Lipschitz constant in neural networks has emerged as an essential area of research to enhance adversarial robustness and network certifiability.
This paper presents a rigorous approach to the general design of $\mathcal{L}$-Lipschitz deep residual networks using a Linear Matrix Inequality (LMI) framework.
arXiv Detail & Related papers (2025-12-05T17:51:08Z) - A Scalable Approach for Safe and Robust Learning via Lipschitz-Constrained Networks [2.8960888722909566]
Global training constraints that enforce Lipschitz bounds on neural networks (NNs) are proposed.
We show that the proposed formulation of Lipschitz-constrained NNs can be significantly improved.
arXiv Detail & Related papers (2025-06-30T15:42:23Z) - Generalization of Scaled Deep ResNets in the Mean-Field Regime [55.77054255101667]
We investigate scaled ResNet in the limit of infinitely deep and wide neural networks.
Our results offer new insights into the generalization ability of deep ResNet beyond the lazy training regime.
arXiv Detail & Related papers (2024-03-14T21:48:00Z) - Optimization Guarantees of Unfolded ISTA and ADMM Networks With Smooth
Soft-Thresholding [57.71603937699949]
We study optimization guarantees, i.e., achieving near-zero training loss with the increase in the number of learning epochs.
We show that the threshold on the number of training samples increases with the increase in the network width.
arXiv Detail & Related papers (2023-09-12T13:03:47Z) - Deep Linear Networks for Matrix Completion -- An Infinite Depth Limit [10.64241024049424]
The deep linear network (DLN) is a model for implicit regularization in gradient based optimization of overparametrized learning architectures.
We investigate the link between the geometry and the training dynamics for matrix completion with rigorous analysis and numerics.
We propose that implicit regularization is a result of bias towards high state space volume.
arXiv Detail & Related papers (2022-10-22T17:03:10Z) - Robust Training and Verification of Implicit Neural Networks: A
Non-Euclidean Contractive Approach [64.23331120621118]
This paper proposes a theoretical and computational framework for training and robustness verification of implicit neural networks.
We introduce a related embedded network and show that the embedded network can be used to provide an $\ell_\infty$-norm box over-approximation of the reachable sets of the original network.
We apply our algorithms to train implicit neural networks on the MNIST dataset and compare the robustness of our models with the models trained via existing approaches in the literature.
arXiv Detail & Related papers (2022-08-08T03:13:24Z) - Singular Value Perturbation and Deep Network Optimization [29.204852309828006]
We develop new theoretical results on matrix perturbation to shed light on the impact of architecture on the performance of a deep network.
In particular, we explain what deep learning practitioners have long observed empirically: the parameters of some deep architectures are easier to optimize than others.
A direct application of our perturbation results explains analytically why a ResNet is easier to optimize than a ConvNet.
arXiv Detail & Related papers (2022-03-07T02:09:39Z) - Training Certifiably Robust Neural Networks with Efficient Local
Lipschitz Bounds [99.23098204458336]
Certified robustness is a desirable property for deep neural networks in safety-critical applications.
We show that our method consistently outperforms state-of-the-art methods on the MNIST and TinyImageNet datasets.
arXiv Detail & Related papers (2021-11-02T06:44:10Z) - Certifying Incremental Quadratic Constraints for Neural Networks via
Convex Optimization [2.388501293246858]
We propose a convex program to certify incremental quadratic constraints on the map of neural networks over a region of interest.
These certificates can capture several useful properties such as (local) Lipschitz continuity, one-sided Lipschitz continuity, invertibility, and contraction.
arXiv Detail & Related papers (2020-12-10T21:15:00Z) - Kernel-Based Smoothness Analysis of Residual Networks [85.20737467304994]
Residual networks (ResNets) stand out among these powerful modern architectures.
In this paper, we show another distinction between the two models, namely, a tendency of ResNets to promote smoother interpolations than fully connected networks.
arXiv Detail & Related papers (2020-09-21T16:32:04Z) - A Mean-field Analysis of Deep ResNet and Beyond: Towards Provable
Optimization Via Overparameterization From Depth [19.866928507243617]
Training deep neural networks with stochastic gradient descent (SGD) can often achieve zero training loss on real-world tasks, although the optimization landscape is highly non-convex.
We propose a new continuum limit of deep residual networks, which enjoys a good landscape in the sense that every local minimizer is global.
arXiv Detail & Related papers (2020-03-11T20:14:47Z) - Automatic Perturbation Analysis for Scalable Certified Robustness and
Beyond [171.07853346630057]
Linear relaxation based perturbation analysis (LiRPA) for neural networks has become a core component in robustness verification and certified defense.
We develop an automatic framework to enable perturbation analysis on any neural network structures.
We demonstrate LiRPA based certified defense on Tiny ImageNet and Downscaled ImageNet.
arXiv Detail & Related papers (2020-02-28T18:47:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.