Depthwise Separable Convolutions Allow for Fast and Memory-Efficient
Spectral Normalization
- URL: http://arxiv.org/abs/2102.06496v1
- Date: Fri, 12 Feb 2021 12:55:42 GMT
- Title: Depthwise Separable Convolutions Allow for Fast and Memory-Efficient
Spectral Normalization
- Authors: Christina Runkel, Christian Etmann, Michael Möller, Carola-Bibiane Schönlieb
- Abstract summary: We introduce a very simple method for spectral normalization of depthwise separable convolutions.
We demonstrate the effectiveness of our method on image classification tasks using standard architectures like MobileNetV2.
- Score: 1.1470070927586016
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: An increasing number of models require the control of the spectral norm of
convolutional layers of a neural network. While there is an abundance of
methods for estimating and enforcing upper bounds on those during training,
they are typically costly in either memory or time. In this work, we introduce
a very simple method for spectral normalization of depthwise separable
convolutions, which introduces negligible computational and memory overhead. We
demonstrate the effectiveness of our method on image classification tasks using
standard architectures like MobileNetV2.
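
The abstract does not spell out the normalization step itself, so the sketch below is only one plausible way to exploit the channel-wise structure of a depthwise convolution, not necessarily the authors' algorithm: assuming circular padding and stride 1, each channel's convolution is a circulant operator whose singular values are the magnitudes of the 2-D DFT of its zero-padded kernel, and the depthwise operator is block-diagonal over channels, so its spectral norm is the maximum per-channel value. The function names and the PyTorch setup are illustrative assumptions.

```python
import torch


def depthwise_spectral_norm(kernel: torch.Tensor, input_size: int) -> torch.Tensor:
    """Largest singular value of a depthwise (channel-wise) convolution.

    Assumes circular padding and stride 1: each channel's convolution is then a
    circulant operator diagonalised by the 2-D DFT, so its singular values are
    the magnitudes of the DFT of the zero-padded kernel. The depthwise operator
    is block-diagonal over channels, hence the overall norm is the per-channel
    maximum.

    kernel: weight of shape (channels, 1, k, k) from a Conv2d with groups=channels.
    """
    channels, _, k, _ = kernel.shape
    padded = torch.zeros(channels, input_size, input_size,
                         dtype=kernel.dtype, device=kernel.device)
    padded[:, :k, :k] = kernel[:, 0]
    transfer = torch.fft.fft2(padded)   # per-channel transfer functions
    return transfer.abs().max()         # max over channels and frequencies


def normalize_depthwise(conv: torch.nn.Conv2d, input_size: int, target: float = 1.0) -> None:
    """Rescale the weights so the (circular-padding) spectral norm is at most `target`."""
    with torch.no_grad():
        sigma = depthwise_spectral_norm(conv.weight, input_size)
        if sigma > target:
            conv.weight.mul_(target / sigma)


# Illustrative usage on a MobileNetV2-style 3x3 depthwise layer:
dw = torch.nn.Conv2d(32, 32, kernel_size=3, groups=32, padding=1, bias=False)
normalize_depthwise(dw, input_size=56)
```

Because only the small per-channel kernels are transformed and rescaled, the extra memory and compute grow with the number of channels rather than with channel-pair interactions, which is in the spirit of the "negligible overhead" claim above.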
Related papers
- LipKernel: Lipschitz-Bounded Convolutional Neural Networks via Dissipative Layers [0.0468732641979009]
We propose a layer-wise parameterization for convolutional neural networks (CNNs) that includes built-in robustness guarantees.
Our method, LipKernel, directly parameterizes dissipative convolution kernels using a 2-D Roesser-type state space model.
We show that the run-time of our method is orders of magnitude faster than that of state-of-the-art Lipschitz-bounded networks.
arXiv Detail & Related papers (2024-10-29T17:20:14Z) - Tight and Efficient Upper Bound on Spectral Norm of Convolutional Layers [0.0]
Controlling the spectral norm of the Jacobian matrix has been shown to improve generalization, training stability and robustness in CNNs.
Existing methods for computing the norm either tend to overestimate it, or their performance may deteriorate quickly as the input and kernel sizes increase.
In this paper, we demonstrate that the tensor version of the spectral norm of a four-dimensional convolution kernel, up to a constant factor, serves as an upper bound for the spectral norm of the Jacobian matrix associated with the convolution operation.
arXiv Detail & Related papers (2024-09-18T10:28:28Z) - Self-STORM: Deep Unrolled Self-Supervised Learning for Super-Resolution Microscopy [55.2480439325792]
We introduce deep unrolled self-supervised learning, which alleviates the need for ground-truth training data by training a sequence-specific, model-based autoencoder.
Our proposed method exceeds the performance of its supervised counterparts.
arXiv Detail & Related papers (2024-03-25T17:40:32Z) - Globally Optimal Training of Neural Networks with Threshold Activation
Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z) - Loop Unrolled Shallow Equilibrium Regularizer (LUSER) -- A
Memory-Efficient Inverse Problem Solver [26.87738024952936]
In inverse problems we aim to reconstruct some underlying signal of interest from potentially corrupted and often ill-posed measurements.
We propose a loop unrolled (LU) algorithm with shallow equilibrium regularizers (LUSER).
These implicit models are as expressive as deeper convolutional networks, but far more memory efficient during training.
arXiv Detail & Related papers (2022-10-10T19:50:37Z) - Subquadratic Overparameterization for Shallow Neural Networks [60.721751363271146]
We provide an analytical framework that allows us to adopt standard neural training strategies.
We achieve the desiderata via Polyak-Łojasiewicz, smoothness, and standard assumptions.
arXiv Detail & Related papers (2021-11-02T20:24:01Z) - Fast Approximate Spectral Normalization for Robust Deep Neural Networks [3.5027291542274357]
We introduce an approximate algorithm for spectral normalization based on Fourier transform and layer separation.
Our framework is able to significantly improve both time efficiency (up to 60%) and model robustness (61% on average) compared with the state-of-the-art spectral normalization.
arXiv Detail & Related papers (2021-03-22T15:35:45Z) - Applications of Koopman Mode Analysis to Neural Networks [52.77024349608834]
We consider the training process of a neural network as a dynamical system acting on the high-dimensional weight space.
We show how the Koopman spectrum can be used to determine the number of layers required for the architecture.
We also show how using Koopman modes we can selectively prune the network to speed up the training procedure.
arXiv Detail & Related papers (2020-06-21T11:00:04Z) - Optimization Theory for ReLU Neural Networks Trained with Normalization
Layers [82.61117235807606]
The success of deep neural networks is in part due to the use of normalization layers.
Our analysis shows how the introduction of normalization changes the optimization landscape and can enable faster convergence.
arXiv Detail & Related papers (2020-06-11T23:55:54Z) - Evolving Normalization-Activation Layers [100.82879448303805]
We develop efficient rejection protocols to quickly filter out candidate layers that do not work well.
Our method leads to the discovery of EvoNorms, a set of new normalization-activation layers with novel, and sometimes surprising structures.
Our experiments show that EvoNorms work well on image classification models including ResNets, MobileNets and EfficientNets.
arXiv Detail & Related papers (2020-04-06T19:52:48Z) - Learning Memory-guided Normality for Anomaly Detection [33.77435699029528]
We present an unsupervised learning approach to anomaly detection that considers the diversity of normal patterns explicitly.
We also present novel feature compactness and separateness losses to train the memory, boosting the discriminative power of both memory items and deeply learned features from normal data.
arXiv Detail & Related papers (2020-03-30T05:30:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.