Reframing Neural Networks: Deep Structure in Overcomplete
Representations
- URL: http://arxiv.org/abs/2103.05804v1
- Date: Wed, 10 Mar 2021 01:15:14 GMT
- Title: Reframing Neural Networks: Deep Structure in Overcomplete
Representations
- Authors: Calvin Murdock and Simon Lucey
- Abstract summary: We introduce deep frame approximation, a unifying framework for representation learning with structured overcomplete frames.
We quantify structural differences with the deep frame potential, a data-independent measure of coherence linked to representation uniqueness and stability.
This connection to the established theory of overcomplete representations suggests promising new directions for principled deep network architecture design.
- Score: 41.84502123663809
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In comparison to classical shallow representation learning techniques, deep
neural networks have achieved superior performance in nearly every application
benchmark. But despite their clear empirical advantages, it is still not well
understood what makes them so effective. To approach this question, we
introduce deep frame approximation, a unifying framework for representation
learning with structured overcomplete frames. While exact inference requires
iterative optimization, it may be approximated by the operations of a
feed-forward deep neural network. We then indirectly analyze how model capacity
relates to the frame structure induced by architectural hyperparameters such as
depth, width, and skip connections. We quantify these structural differences
with the deep frame potential, a data-independent measure of coherence linked
to representation uniqueness and stability. As a criterion for model selection,
we show correlation with generalization error on a variety of common deep
network architectures such as ResNets and DenseNets. We also demonstrate how
recurrent networks implementing iterative optimization algorithms achieve
performance comparable to their feed-forward approximations. This connection to
the established theory of overcomplete representations suggests promising new
directions for principled deep network architecture design with less reliance
on ad-hoc engineering.
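The core mechanism in the abstract (exact inference by iterative optimization, approximated by the operations of a feed-forward layer) can be illustrated with a small, self-contained sketch. This is not the authors' implementation; it is a minimal ISTA-style example under assumed choices (dictionary size, sparsity penalty, nonnegative soft threshold), showing that a single feed-forward step of the form ReLU(Wx - b) is the one-iteration truncation of an iterative sparse approximation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative overcomplete dictionary (frame): 64-dimensional inputs,
# 256 unit-norm atoms. All sizes and penalties are arbitrary sketch choices.
d, k = 64, 256
D = rng.standard_normal((d, k))
D /= np.linalg.norm(D, axis=0, keepdims=True)

x = rng.standard_normal(d)          # an input signal
lam = 0.1                           # sparsity penalty (illustrative)
L = np.linalg.norm(D, 2) ** 2       # Lipschitz constant of the data-fit gradient
eta = 1.0 / L                       # ISTA step size

def shrink(v, t):
    """Nonnegative soft threshold, i.e. a shifted ReLU."""
    return np.maximum(v - t, 0.0)

# Exact (iterative) inference: nonnegative ISTA for
#   min_z 0.5*||x - D z||^2 + lam*||z||_1,  z >= 0
z = np.zeros(k)
for _ in range(200):
    z = shrink(z + eta * D.T @ (x - D @ z), eta * lam)

# Feed-forward approximation: one ISTA step from z = 0,
# i.e. ReLU(eta * D^T x - eta * lam) -- a standard linear layer + ReLU.
z_ff = shrink(eta * D.T @ x, eta * lam)

print("iterative objective  :",
      0.5 * np.sum((x - D @ z) ** 2) + lam * np.sum(z))
print("feed-forward objective:",
      0.5 * np.sum((x - D @ z_ff) ** 2) + lam * np.sum(z_ff))
print("code agreement (cosine):",
      z @ z_ff / (np.linalg.norm(z) * np.linalg.norm(z_ff) + 1e-12))
```

Running more iterations, as in the recurrent networks mentioned in the abstract, tightens the approximation; the single truncated step is the feed-forward layer view.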
Related papers
- Principled Architecture-aware Scaling of Hyperparameters [69.98414153320894]
Training a high-quality deep neural network requires choosing suitable hyperparameters, which is a non-trivial and expensive process.
In this work, we precisely characterize the dependence of initializations and maximal learning rates on the network architecture.
We demonstrate that network rankings in benchmarks can be easily changed simply by training the networks better.
arXiv Detail & Related papers (2024-02-27T11:52:49Z) - Rotation Equivariant Proximal Operator for Deep Unfolding Methods in Image Restoration [62.41329042683779]
We propose a high-accuracy rotation equivariant proximal network that embeds rotation symmetry priors into the deep unfolding framework.
arXiv Detail & Related papers (2023-12-25T11:53:06Z) - Operator Learning Meets Numerical Analysis: Improving Neural Networks
through Iterative Methods [2.226971382808806]
We develop a theoretical framework grounded in iterative methods for operator equations.
We demonstrate that popular architectures, such as diffusion models and AlphaFold, inherently employ iterative operator learning.
Our work aims to enhance the understanding of deep learning by merging insights from numerical analysis.
arXiv Detail & Related papers (2023-10-02T20:25:36Z) - Optimisation & Generalisation in Networks of Neurons [8.078758339149822]
The goal of this thesis is to develop the optimisation and generalisation theoretic foundations of learning in artificial neural networks.
A new theoretical framework is proposed for deriving architecture-dependent first-order optimisation algorithms.
A new correspondence is proposed between ensembles of networks and individual networks.
arXiv Detail & Related papers (2022-10-18T18:58:40Z) - Deep Architecture Connectivity Matters for Its Convergence: A
Fine-Grained Analysis [94.64007376939735]
We theoretically characterize the impact of connectivity patterns on the convergence of deep neural networks (DNNs) under gradient descent training.
We show that by a simple filtration on "unpromising" connectivity patterns, we can trim down the number of models to evaluate.
arXiv Detail & Related papers (2022-05-11T17:43:54Z) - Learning Structures for Deep Neural Networks [99.8331363309895]
We propose to adopt the efficient coding principle, rooted in information theory and developed in computational neuroscience.
We show that sparse coding can effectively maximize the entropy of the output signals.
Our experiments on a public image classification dataset demonstrate that using the structure learned from scratch by our proposed algorithm, one can achieve a classification accuracy comparable to the best expert-designed structure.
arXiv Detail & Related papers (2021-05-27T12:27:24Z) - Disentangling Neural Architectures and Weights: A Case Study in
Supervised Classification [8.976788958300766]
This work investigates the problem of disentangling the role of the neural structure and its edge weights.
We show that well-trained architectures may not need any link-specific fine-tuning of the weights.
We use a novel and computationally efficient method that translates the hard architecture-search problem into a feasible optimization problem.
arXiv Detail & Related papers (2020-09-11T11:22:22Z) - Adversarially Robust Neural Architectures [43.74185132684662]
This paper aims to improve the adversarial robustness of neural networks from the architecture perspective using a NAS framework.
We explore the relationship among adversarial robustness, Lipschitz constant, and architecture parameters.
Our algorithm empirically achieves the best performance among all the models under various attacks on different datasets.
arXiv Detail & Related papers (2020-09-02T08:52:15Z) - Dataless Model Selection with the Deep Frame Potential [45.16941644841897]
We quantify networks by their intrinsic capacity for unique and robust representations.
We propose the deep frame potential: a measure of coherence that is approximately related to representation stability but has minimizers that depend only on network structure.
We validate its use as a criterion for model selection and demonstrate correlation with generalization error on a variety of common residual and densely connected network architectures; a minimal single-matrix sketch of the underlying frame potential appears after this list.
arXiv Detail & Related papers (2020-03-30T23:27:25Z) - Dynamic Hierarchical Mimicking Towards Consistent Optimization
Objectives [73.15276998621582]
We propose a generic feature learning mechanism to advance CNN training with enhanced generalization ability.
Partially inspired by DSN, we fork delicately designed side branches from the intermediate layers of a given neural network.
Experiments on both category and instance recognition tasks demonstrate the substantial improvements of our proposed method.
arXiv Detail & Related papers (2020-03-24T09:56:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.