Convex Dual Theory Analysis of Two-Layer Convolutional Neural Networks
with Soft-Thresholding
- URL: http://arxiv.org/abs/2304.06959v1
- Date: Fri, 14 Apr 2023 07:04:07 GMT
- Title: Convex Dual Theory Analysis of Two-Layer Convolutional Neural Networks
with Soft-Thresholding
- Authors: Chunyan Xiong, Mengli Lu, Xiaotong Yu, Jian Cao, Zhong Chen, Di Guo,
and Xiaobo Qu
- Abstract summary: Soft-thresholding has been widely used in neural networks.
A new way to convexify soft-thresholding neural networks is presented.
- Score: 15.514556290714053
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Soft-thresholding has been widely used in neural networks. Its basic
network structure is a two-layer convolutional neural network with
soft-thresholding. Because the network is nonlinear and nonconvex, training
depends heavily on an appropriate initialization of the network parameters,
making it difficult to obtain a globally optimal solution. To address this
issue, a convex dual network is designed here. We theoretically analyze the
convexity of the network and numerically confirm that strong duality holds.
This conclusion is further verified in linear fitting and denoising
experiments. This work provides a new way to convexify soft-thresholding
neural networks.
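For concreteness, the snippet below is a minimal NumPy sketch of the primal
(nonconvex) architecture described in the abstract: a first convolutional layer,
a soft-thresholding nonlinearity, and a second linear layer. The filter shapes,
the threshold level lam, and the function names are illustrative assumptions,
not the authors' implementation.

```python
import numpy as np

def soft_threshold(x, lam):
    """Soft-thresholding operator: S_lam(x) = sign(x) * max(|x| - lam, 0)."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def two_layer_soft_threshold_net(x, filters, weights, lam):
    """Forward pass of a two-layer convolutional network with soft-thresholding.

    x       : input signal, shape (n,)
    filters : first-layer convolution filters, shape (m, k)
    weights : second-layer weights, shape (m, n), one read-out vector per filter
    lam     : soft-threshold level (assumed fixed here)
    """
    y = 0.0
    for j in range(filters.shape[0]):
        z = np.convolve(x, filters[j], mode="same")  # first layer: convolution
        a = soft_threshold(z, lam)                   # nonlinearity: soft-thresholding
        y += a @ weights[j]                          # second layer: linear read-out
    return y

# Toy forward pass with random parameters.
rng = np.random.default_rng(0)
x = rng.standard_normal(32)
filters = rng.standard_normal((4, 5))
weights = rng.standard_normal((4, 32))
print(two_layer_soft_threshold_net(x, filters, weights, lam=0.5))
```

Soft-thresholding plays the role that ReLU plays in standard two-layer networks;
it is this nonlinearity that the convex dual formulation has to accommodate.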
Related papers
- Computable Lipschitz Bounds for Deep Neural Networks [0.0]
We analyse three existing upper bounds written for the $\ell_2$ norm.
We propose two novel bounds for both feed-forward fully-connected neural networks and convolutional neural networks.
(A generic layer-wise $\ell_2$ bound of this kind is sketched after this list.)
arXiv Detail & Related papers (2024-10-28T14:09:46Z)
- Compositional Curvature Bounds for Deep Neural Networks [7.373617024876726]
A key challenge that threatens the widespread use of neural networks in safety-critical applications is their vulnerability to adversarial attacks.
We study the second-order behavior of continuously differentiable deep neural networks, focusing on robustness against adversarial perturbations.
We introduce a novel algorithm to analytically compute provable upper bounds on the second derivative of neural networks.
arXiv Detail & Related papers (2024-06-07T17:50:15Z)
- Convex neural network synthesis for robustness in the 1-norm [0.0]
This paper proposes a method to generate an approximation of a neural network which is certifiably more robust.
An application to robustifying model predictive control is used to demonstrate the results.
arXiv Detail & Related papers (2024-05-29T12:17:09Z)
- Fixing the NTK: From Neural Network Linearizations to Exact Convex Programs [63.768739279562105]
We show that for a particular choice of mask weights that do not depend on the learning targets, this kernel is equivalent to the NTK of the gated ReLU network on the training data.
A consequence of this lack of dependence on the targets is that the NTK cannot perform better than the optimal MKL kernel on the training set.
arXiv Detail & Related papers (2023-09-26T17:42:52Z)
- Robust Training and Verification of Implicit Neural Networks: A Non-Euclidean Contractive Approach [64.23331120621118]
This paper proposes a theoretical and computational framework for training and robustness verification of implicit neural networks.
We introduce a related embedded network and show that the embedded network can be used to provide an $\ell_\infty$-norm box over-approximation of the reachable sets of the original network.
We apply our algorithms to train implicit neural networks on the MNIST dataset and compare the robustness of our models with the models trained via existing approaches in the literature.
arXiv Detail & Related papers (2022-08-08T03:13:24Z)
- Excess Risk of Two-Layer ReLU Neural Networks in Teacher-Student Settings and its Superiority to Kernel Methods [58.44819696433327]
We investigate the excess risk of two-layer ReLU neural networks in a teacher-student regression model.
We find that the student network provably outperforms any solution obtained by kernel methods.
arXiv Detail & Related papers (2022-05-30T02:51:36Z)
- Optimization-Based Separations for Neural Networks [57.875347246373956]
We show that gradient descent can efficiently learn ball indicator functions using a depth 2 neural network with two layers of sigmoidal activations.
This is the first optimization-based separation result where the approximation benefits of the stronger architecture provably manifest in practice.
arXiv Detail & Related papers (2021-12-04T18:07:47Z)
- Fast Adaptation with Linearized Neural Networks [35.43406281230279]
We study the inductive biases of linearizations of neural networks, which we show to be surprisingly good summaries of the full network functions.
Inspired by this finding, we propose a technique for embedding these inductive biases into Gaussian processes through a kernel designed from the Jacobian of the network.
In this setting, domain adaptation takes the form of interpretable posterior inference, with accompanying uncertainty estimation.
(A minimal version of such a Jacobian-based kernel is sketched after this list.)
arXiv Detail & Related papers (2021-03-02T03:23:03Z)
- Learning Neural Network Subspaces [74.44457651546728]
Recent observations have advanced our understanding of the neural network optimization landscape.
With a similar computational cost as training one model, we learn lines, curves, and simplexes of high-accuracy neural networks.
arXiv Detail & Related papers (2021-02-20T23:26:58Z)
- Convex Regularization Behind Neural Reconstruction [21.369208659395042]
This paper advocates a convex duality framework to make neural networks amenable to convex solvers.
Experiments with MNIST and fastMRI datasets confirm the efficacy of the dual network optimization problem.
arXiv Detail & Related papers (2020-12-09T16:57:16Z)
- The Hidden Convex Optimization Landscape of Two-Layer ReLU Neural Networks: an Exact Characterization of the Optimal Solutions [51.60996023961886]
We prove that finding all globally optimal two-layer ReLU neural networks can be performed by solving a convex optimization program with cone constraints.
Our analysis is novel, characterizes all optimal solutions, and does not leverage duality-based analysis which was recently used to lift neural network training into convex spaces.
arXiv Detail & Related papers (2020-06-10T15:38:30Z)
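For the Lipschitz-bounds entry above, the classical baseline upper bound on a
feed-forward network's $\ell_2$ Lipschitz constant is the product of the
layer-wise spectral norms (valid for 1-Lipschitz activations such as ReLU).
The sketch below computes this generic bound; it is not the novel bounds
proposed in that paper.

```python
import numpy as np

def spectral_norm_lipschitz_bound(weight_matrices):
    """Naive upper bound on the l2 Lipschitz constant of a fully-connected
    network with 1-Lipschitz activations: the product of the largest singular
    values (spectral norms) of the layer weight matrices."""
    bound = 1.0
    for W in weight_matrices:
        bound *= np.linalg.norm(W, ord=2)  # largest singular value of W
    return bound

# Example: a 32 -> 64 -> 10 network with random weights.
rng = np.random.default_rng(0)
layers = [rng.standard_normal((64, 32)), rng.standard_normal((10, 64))]
print(spectral_norm_lipschitz_bound(layers))
```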
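For the linearized-networks entry above, a kernel built from the Jacobian of
the network can be sketched as the inner product of the parameter gradients of
the network output at two inputs (the kernel of the network's linearization).
The tiny tanh network, its shapes, and the function names below are
illustrative assumptions, not that paper's implementation.

```python
import numpy as np

def param_gradient(x, W, v):
    """Gradient of f(x) = v . tanh(W x) with respect to all parameters (W, v),
    flattened into one vector (a row of the network's parameter Jacobian)."""
    h = np.tanh(W @ x)                             # hidden activations
    dv = h                                         # d f / d v
    dW = (v * (1.0 - h**2))[:, None] * x[None, :]  # d f / d W by the chain rule
    return np.concatenate([dW.ravel(), dv])

def linearization_kernel(x1, x2, W, v):
    """k(x1, x2) = <grad_theta f(x1), grad_theta f(x2)> for the linearized net."""
    return param_gradient(x1, W, v) @ param_gradient(x2, W, v)

# Toy evaluation with random parameters and inputs.
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 4))
v = rng.standard_normal(8)
x1, x2 = rng.standard_normal(4), rng.standard_normal(4)
print(linearization_kernel(x1, x2, W, v))
```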
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.