Theoretical Analysis of Robust Overfitting for Wide DNNs: An NTK
Approach
- URL: http://arxiv.org/abs/2310.06112v2
- Date: Sun, 4 Feb 2024 16:31:50 GMT
- Title: Theoretical Analysis of Robust Overfitting for Wide DNNs: An NTK
Approach
- Authors: Shaopeng Fu, Di Wang
- Abstract summary: Adversarial training (AT) is a canonical method for enhancing the robustness of deep neural networks (DNNs).
We non-trivially extend the neural tangent kernel (NTK) theory to AT and prove that an adversarially trained wide DNN can be well approximated by a linearized DNN.
For squared loss, closed-form AT dynamics for the linearized DNN can be derived, which reveals a new AT degeneration phenomenon.
- Score: 8.994430921243767
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial training (AT) is a canonical method for enhancing the robustness
of deep neural networks (DNNs). However, recent studies empirically
demonstrated that it suffers from robust overfitting, i.e., prolonged AT can
be detrimental to the robustness of DNNs. This paper presents a theoretical
explanation of robust overfitting for DNNs. Specifically, we non-trivially
extend the neural tangent kernel (NTK) theory to AT and prove that an
adversarially trained wide DNN can be well approximated by a linearized DNN.
Moreover, for squared loss, closed-form AT dynamics for the linearized DNN can
be derived, which reveals a new AT degeneration phenomenon: long-term AT will
cause a wide DNN to degenerate to the one obtained without AT and thus lead to
robust overfitting. Based on our theoretical results, we further design a
method named Adv-NTK, the first AT algorithm for infinite-width DNNs.
Experiments on real-world datasets show that Adv-NTK can help infinite-width
DNNs attain robustness comparable to that of their finite-width counterparts,
which in turn supports our theoretical findings. The code is available at
https://github.com/fshp971/adv-ntk.
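For readers unfamiliar with the linearization the abstract refers to, the equations below recall the standard NTK setup: the min-max adversarial training objective, the first-order expansion of the network around its initialization, and the well-known closed-form solution for clean (non-adversarial) squared-loss training. The adversarial-training dynamics derived in the paper extend this last formula; the version shown here is only the clean baseline that long-term AT is claimed to degenerate back to.

```latex
% Adversarial training (AT) objective, in the standard min-max form:
\min_{\theta}\ \mathbb{E}_{(x,y)}\Big[\max_{\|\delta\|\le\epsilon}\ \mathcal{L}\big(f(x+\delta;\theta),\,y\big)\Big]

% NTK linearization of a wide network f around its initialization \theta_0:
f^{\mathrm{lin}}(x;\theta) = f(x;\theta_0) + \nabla_{\theta} f(x;\theta_0)^{\top}(\theta-\theta_0)

% Closed-form dynamics of the linearized model under clean gradient-flow
% training with squared loss (Lee et al., 2019); \Theta is the empirical NTK,
% (X, Y) the training set, \eta the learning rate:
f^{\mathrm{lin}}_{t}(x) = f(x;\theta_0)
  + \Theta(x,X)\,\Theta(X,X)^{-1}\big(I - e^{-\eta\,\Theta(X,X)\,t}\big)\big(Y - f(X;\theta_0)\big)
```

As $t \to \infty$ the last expression converges to the standard NTK kernel-regression solution obtained without AT; the AT degeneration phenomenon described above says, roughly, that long-term adversarial training drives the linearized wide DNN back toward this same non-robust solution.

For concreteness, here is what a generic adversarial-training step looks like in practice. This is a minimal PGD-based sketch of the standard AT recipe, not the authors' Adv-NTK algorithm (see the repository linked above for that); the loss, epsilon, and step sizes are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Generic L-infinity PGD attack; hyperparameters are illustrative."""
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
        delta = delta.detach().requires_grad_(True)
    return (x + delta).detach()

def adversarial_training_step(model, optimizer, x, y):
    """One AT step: fit the model on adversarial examples rather than clean ones."""
    model.eval()                       # keep BN/dropout fixed while crafting the attack
    x_adv = pgd_attack(model, x, y)
    model.train()
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```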
Related papers
- Harnessing Neuron Stability to Improve DNN Verification [42.65507402735545]
We present VeriStable, a novel extension of the recently proposed DPLL-based constraint-solving approach to DNN verification.
We evaluate the effectiveness of VeriStable across a range of challenging benchmarks including fully-connected feedforward networks (FNNs), convolutional neural networks (CNNs), and residual networks (ResNets).
Preliminary results show that VeriStable is competitive and outperforms state-of-the-art verification tools, including $\alpha$-$\beta$-CROWN and MN-BaB, the first- and second-place performers in VNN-COMP, respectively.
arXiv Detail & Related papers (2024-01-19T23:48:04Z)
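As an aside on the neuron-stability idea in the entry above: a ReLU neuron is commonly called stable on an input region when its pre-activation interval does not cross zero, so the ReLU behaves linearly there and the verifier need not branch on it. The NumPy sketch below illustrates this bookkeeping for a single layer with interval bound propagation; it is a simplified illustration, not the VeriStable procedure, and the layer sizes and perturbation radius are made up.

```python
import numpy as np

def interval_affine(W, b, lo, hi):
    """Propagate an input box [lo, hi] through an affine layer W x + b."""
    W_pos, W_neg = np.clip(W, 0, None), np.clip(W, None, 0)
    out_lo = W_pos @ lo + W_neg @ hi + b
    out_hi = W_pos @ hi + W_neg @ lo + b
    return out_lo, out_hi

def relu_stability(pre_lo, pre_hi):
    """Classify each ReLU neuron as stably active, stably inactive, or unstable."""
    stably_active = pre_lo >= 0        # ReLU acts as the identity on the whole box
    stably_inactive = pre_hi <= 0      # ReLU is constantly zero on the whole box
    unstable = ~(stably_active | stably_inactive)
    return stably_active, stably_inactive, unstable

# Toy example: one hidden layer, input perturbed in an L-infinity ball.
rng = np.random.default_rng(0)
W, b = rng.normal(size=(8, 4)), rng.normal(size=8)
x, eps = rng.normal(size=4), 0.1
pre_lo, pre_hi = interval_affine(W, b, x - eps, x + eps)
act, inact, unst = relu_stability(pre_lo, pre_hi)
print(f"stable: {int(act.sum() + inact.sum())} / 8, unstable: {int(unst.sum())}")
```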
- OccRob: Efficient SMT-Based Occlusion Robustness Verification of Deep Neural Networks [7.797299214812479]
Occlusion is a prevalent and easily realizable semantic perturbation to deep neural networks (DNNs).
It can fool a DNN into misclassifying an input image by occluding some segments, possibly resulting in severe errors.
Most existing robustness verification approaches for DNNs are focused on non-semantic perturbations.
arXiv Detail & Related papers (2023-01-27T18:54:00Z)
- Quantum-Inspired Tensor Neural Networks for Option Pricing [4.3942901219301564]
Recent advances in deep learning have enabled us to address the curse of dimensionality (COD) by solving problems in higher dimensions.
A subset of such approaches to addressing the COD has enabled the solution of high-dimensional PDEs.
This has resulted in opening doors to solving a variety of real-world problems ranging from mathematical finance to control for industrial applications.
arXiv Detail & Related papers (2022-12-28T19:39:55Z)
- Quantum-Inspired Tensor Neural Networks for Partial Differential Equations [5.963563752404561]
Deep learning methods are constrained by training time and memory. To tackle these shortcomings, we implement Tensor Neural Networks (TNNs).
We demonstrate that TNNs provide significant parameter savings while attaining the same accuracy as classical dense neural networks (DNNs).
arXiv Detail & Related papers (2022-08-03T17:41:11Z)
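To make the parameter-savings claim in the entry above concrete: tensorized or low-rank layers replace a dense weight matrix with a factorized form. The sketch below compares parameter counts for a dense layer versus a simple rank-r factorization; it is a generic illustration under assumed sizes, not the TNN construction used in that paper.

```python
import numpy as np

def dense_params(n_in, n_out):
    """Parameters of a dense layer: W (n_out x n_in) plus bias."""
    return n_out * n_in + n_out

def low_rank_params(n_in, n_out, rank):
    """Parameters when W is factorized as B @ A with A (rank x n_in), B (n_out x rank)."""
    return rank * n_in + n_out * rank + n_out

n_in, n_out, rank = 1024, 1024, 16                      # illustrative sizes
print("dense:   ", dense_params(n_in, n_out))           # ~1.05M parameters
print("low-rank:", low_rank_params(n_in, n_out, rank))  # ~34K parameters

# The factorized layer computes the same kind of map, x -> B @ (A @ x) + b,
# with far fewer parameters whenever rank << min(n_in, n_out).
x = np.random.randn(n_in)
A, B, b = np.random.randn(rank, n_in), np.random.randn(n_out, rank), np.zeros(n_out)
y = B @ (A @ x) + b
```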
- Comparative Analysis of Interval Reachability for Robust Implicit and Feedforward Neural Networks [64.23331120621118]
We use interval reachability analysis to obtain robustness guarantees for implicit neural networks (INNs).
INNs are a class of implicit learning models that use implicit equations as layers.
We show that our approach performs at least as well as, and generally better than, applying state-of-the-art interval bound propagation methods to INNs.
arXiv Detail & Related papers (2022-04-01T03:31:27Z)
- Fast Axiomatic Attribution for Neural Networks [44.527672563424545]
Recent approaches incorporate priors on the feature attributions of a deep neural network (DNN) into the training process to reduce the dependence on unwanted features.
We consider a special class of efficiently axiomatically attributable DNNs for which an axiomatic feature attribution can be computed with only a single forward/backward pass.
Various experiments demonstrate the advantages of $\mathcal{X}$-DNNs, beating state-of-the-art generic attribution methods on regular DNNs for training with attribution priors.
arXiv Detail & Related papers (2021-11-15T10:51:01Z)
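The single forward/backward pass mentioned in the entry above can be illustrated with the standard input-times-gradient attribution. The PyTorch sketch below computes it for an arbitrary classifier; whether this is exactly axiomatic depends on the bias-free architecture class studied in that paper, so treat the snippet as a generic illustration with an assumed toy model.

```python
import torch
import torch.nn as nn

def input_x_gradient(model, x, target_class):
    """Attribution from one forward and one backward pass: x * d(score)/dx."""
    x = x.clone().detach().requires_grad_(True)
    score = model(x)[:, target_class].sum()   # scalar score of the target class
    score.backward()
    return x * x.grad                         # elementwise input-times-gradient

# Toy, bias-free ReLU network (illustrative; sizes are made up).
model = nn.Sequential(
    nn.Linear(16, 32, bias=False), nn.ReLU(),
    nn.Linear(32, 10, bias=False),
)
x = torch.randn(4, 16)
attr = input_x_gradient(model, x, target_class=3)
print(attr.shape)  # torch.Size([4, 16]) -- one attribution value per input feature
```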
- Exploring Architectural Ingredients of Adversarially Robust Deep Neural Networks [98.21130211336964]
Deep neural networks (DNNs) are known to be vulnerable to adversarial attacks.
In this paper, we investigate the impact of network width and depth on the robustness of adversarially trained DNNs.
arXiv Detail & Related papers (2021-10-07T23:13:33Z)
- Optimization of Graph Neural Networks: Implicit Acceleration by Skip Connections and More Depth [57.10183643449905]
Graph Neural Networks (GNNs) have been studied from the lens of expressive power and generalization.
We analyze the training dynamics of GNNs, focusing on the implicit acceleration provided by skip connections and greater depth.
Our results provide the first theoretical support for the success of GNNs.
arXiv Detail & Related papers (2021-05-10T17:59:01Z)
- Online Limited Memory Neural-Linear Bandits with Likelihood Matching [53.18698496031658]
We study neural-linear bandits for solving problems where both exploration and representation learning play an important role.
We propose a likelihood matching algorithm that is resilient to catastrophic forgetting and is completely online.
arXiv Detail & Related papers (2021-02-07T14:19:07Z)
- Optimization and Generalization Analysis of Transduction through Gradient Boosting and Application to Multi-scale Graph Neural Networks [60.22494363676747]
It is known that current graph neural networks (GNNs) are difficult to make deep due to the problem known as over-smoothing.
Multi-scale GNNs are a promising approach for mitigating the over-smoothing problem.
We derive the optimization and generalization guarantees of transductive learning algorithms that include multi-scale GNNs.
arXiv Detail & Related papers (2020-06-15T17:06:17Z)
- Approximation and Non-parametric Estimation of ResNet-type Convolutional Neural Networks [52.972605601174955]
We show a ResNet-type CNN can attain the minimax optimal error rates in important function classes.
We derive approximation and estimation error rates of the aforementioned type of CNNs for the Barron and Hölder classes.
arXiv Detail & Related papers (2019-03-24T19:42:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.