Scaling-up Diverse Orthogonal Convolutional Networks with a Paraunitary
Framework
- URL: http://arxiv.org/abs/2106.09121v1
- Date: Wed, 16 Jun 2021 20:50:59 GMT
- Title: Scaling-up Diverse Orthogonal Convolutional Networks with a Paraunitary
Framework
- Authors: Jiahao Su, Wonmin Byeon, Furong Huang
- Abstract summary: We propose a theoretical framework for orthogonal convolutional layers.
Our framework endows various convolutional layers with high expressive power while maintaining their exact orthogonality.
Our layers are memory- and computation-efficient for deep networks compared to previous designs.
- Score: 16.577482515547793
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Enforcing orthogonality in neural networks is an antidote to gradient
vanishing/exploding problems and to sensitivity to adversarial perturbations, and it
helps bound generalization errors. However, many previous approaches are
heuristic, and the orthogonality of convolutional layers is not systematically
studied: some of these designs are not exactly orthogonal, while others only
consider standard convolutional layers and propose specific classes of their
realizations. To address this problem, we propose a theoretical framework for
orthogonal convolutional layers, which establishes the equivalence between
various orthogonal convolutional layers in the spatial domain and the
paraunitary systems in the spectral domain. Since there exists a complete
spectral factorization of paraunitary systems, any orthogonal convolutional layer
can be parameterized as convolutions of spatial filters. Our framework endows
various convolutional layers with high expressive power while maintaining their
exact orthogonality. Furthermore, our layers are memory- and computation-efficient
for deep networks compared to previous designs. Our versatile
framework, for the first time, enables the study of architecture designs for
deep orthogonal networks, such as choices of skip connection, initialization,
stride, and dilation. Consequently, we scale up orthogonal networks to deep
architectures, including ResNet, WideResNet, and ShuffleNet, substantially
improving performance over traditional shallow orthogonal networks.
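The complete spectral factorization the abstract refers to goes back to classical paraunitary filter-bank theory: every finite-impulse-response paraunitary system can be written as a constant orthogonal matrix followed by degree-one factors of the form I - vv^T + z^{-1} vv^T with unit-norm v. Below is a minimal 1-D NumPy sketch of that construction (my own illustration of the classical result, not the authors' code or their multi-dimensional/strided construction); it builds filter taps from such factors and checks that the resulting convolution is exactly orthogonal.

```python
import numpy as np

def paraunitary_filter(c, vs, Q):
    """Multiply out prod_k (I - v_k v_k^T + z^{-1} v_k v_k^T) @ Q into FIR taps.

    c  : number of channels
    vs : list of unit-norm length-c vectors, one per degree-one factor
    Q  : constant orthogonal c x c matrix (degree-zero part)
    Returns taps h of shape (len(vs) + 1, c, c), i.e. H(z) = sum_n h[n] z^{-n}.
    """
    taps = [Q]                        # start from the degree-zero factor
    for v in vs:
        P = np.outer(v, v)            # rank-one projector v v^T
        A, B = np.eye(c) - P, P       # one factor is A + z^{-1} B
        new = [np.zeros((c, c)) for _ in range(len(taps) + 1)]
        for n, t in enumerate(taps):  # polynomial multiplication in z^{-1}
            new[n] += A @ t
            new[n + 1] += B @ t
        taps = new
    return np.stack(taps)

rng = np.random.default_rng(0)
c, N = 4, 3                           # channels, number of degree-one factors
vs = []
for _ in range(N):
    v = rng.standard_normal(c)
    vs.append(v / np.linalg.norm(v))
Q, _ = np.linalg.qr(rng.standard_normal((c, c)))
h = paraunitary_filter(c, vs, Q)

# Orthogonality of the convolution == paraunitarity in the time domain:
# sum_n h[n]^T h[n + s] must equal I for s = 0 and vanish for every s != 0.
for s in range(N + 1):
    G = sum(h[n].T @ h[n + s] for n in range(N + 1 - s))
    target = np.eye(c) if s == 0 else np.zeros((c, c))
    assert np.allclose(G, target, atol=1e-10)
print("orthogonal convolution filter of length", h.shape[0])
```

Stacking more degree-one factors lengthens the filter (raises the degree of the paraunitary system) while keeping exact orthogonality, which is the expressiveness-versus-exactness trade-off the abstract highlights.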
Related papers
- Rotation Equivariant Proximal Operator for Deep Unfolding Methods in
Image Restoration [68.18203605110719]
We propose a high-accuracy rotation equivariant proximal network that embeds rotation symmetry priors into the deep unfolding framework.
arXiv Detail & Related papers (2023-12-25T11:53:06Z)
- A Unified Algebraic Perspective on Lipschitz Neural Networks [88.14073994459586]
This paper introduces a novel perspective unifying various types of 1-Lipschitz neural networks.
We show that many existing techniques can be derived and generalized via finding analytical solutions of a common semidefinite programming (SDP) condition.
Our approach, called SDP-based Lipschitz Layers (SLL), allows us to design non-trivial yet efficient generalizations of convex potential layers.
arXiv Detail & Related papers (2023-03-06T14:31:09Z)
- Multilevel-in-Layer Training for Deep Neural Network Regression [1.6185544531149159]
We present a multilevel regularization strategy that constructs and trains a hierarchy of neural networks.
We experimentally show with PDE regression problems that our multilevel training approach is an effective regularizer.
arXiv Detail & Related papers (2022-11-11T23:53:46Z)
- Optimisation & Generalisation in Networks of Neurons [8.078758339149822]
The goal of this thesis is to develop the optimisation and generalisation theoretic foundations of learning in artificial neural networks.
A new theoretical framework is proposed for deriving architecture-dependent first-order optimisation algorithms.
A new correspondence is proposed between ensembles of networks and individual networks.
arXiv Detail & Related papers (2022-10-18T18:58:40Z)
- Orthogonalizing Convolutional Layers with the Cayley Transform [83.73855414030646]
We propose and evaluate an alternative approach to parameterize convolutional layers that are constrained to be orthogonal.
We show that our method indeed preserves orthogonality to a high degree even for large convolutions; a generic sketch of the Cayley map is given after this list.
arXiv Detail & Related papers (2021-04-14T23:54:55Z)
- Deep Networks from the Principle of Rate Reduction [32.87280757001462]
This work attempts to interpret modern deep (convolutional) networks from the principles of rate reduction and (shift) invariant classification.
We show that the basic iterative gradient ascent scheme for optimizing the rate reduction of learned features naturally leads to a multi-layer deep network, one iteration per layer.
All components of this "white box" network have precise optimization, statistical, and geometric interpretation.
arXiv Detail & Related papers (2020-10-27T06:01:43Z)
- Dual-constrained Deep Semi-Supervised Coupled Factorization Network with
Enriched Prior [80.5637175255349]
We propose a new enriched prior based Dual-constrained Deep Semi-Supervised Coupled Factorization Network, called DS2CF-Net.
To extract hidden deep features, DS2CF-Net is modeled as a deep-structure and geometrical structure-constrained neural network.
Our network can obtain state-of-the-art performance for representation learning and clustering.
arXiv Detail & Related papers (2020-09-08T13:10:21Z)
- Neural Subdivision [58.97214948753937]
This paper introduces Neural Subdivision, a novel framework for data-driven coarse-to-fine geometry modeling.
We optimize for the same set of network weights across all local mesh patches, thus providing an architecture that is not constrained to a specific input mesh, fixed genus, or category.
We demonstrate that even when trained on a single high-resolution mesh our method generates reasonable subdivisions for novel shapes.
arXiv Detail & Related papers (2020-05-04T20:03:21Z)
- Dynamic Hierarchical Mimicking Towards Consistent Optimization
Objectives [73.15276998621582]
We propose a generic feature learning mechanism to advance CNN training with enhanced generalization ability.
Partially inspired by DSN, we fork delicately designed side branches from the intermediate layers of a given neural network.
Experiments on both category and instance recognition tasks demonstrate the substantial improvements of our proposed method.
arXiv Detail & Related papers (2020-03-24T09:56:13Z)
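For the Cayley-transform paper listed above, here is the generic sketch promised earlier: a minimal NumPy illustration of the plain dense-matrix Cayley map that the method builds on (my own example, not the paper's construction for convolutional layers). Any square matrix is first projected to its skew-symmetric part and then mapped to an exactly orthogonal matrix.

```python
import numpy as np

def cayley(W):
    """Map an arbitrary square matrix to an orthogonal one via the Cayley transform.

    A = W - W^T is skew-symmetric, so I + A is always invertible and
    Q = (I + A)^{-1} (I - A) satisfies Q^T Q = I exactly (up to round-off).
    """
    n = W.shape[0]
    A = W - W.T                                   # skew-symmetric part
    return np.linalg.solve(np.eye(n) + A, np.eye(n) - A)

rng = np.random.default_rng(0)
Q = cayley(rng.standard_normal((8, 8)))
print(np.allclose(Q.T @ Q, np.eye(8), atol=1e-12))   # True: orthogonal
```

As I understand it, the paper's convolutional construction applies this map per frequency in the Fourier domain rather than to a single dense matrix; the sketch above only conveys why the parameterization is exactly orthogonal.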
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.