Learning Structures for Deep Neural Networks
- URL: http://arxiv.org/abs/2105.13905v1
- Date: Thu, 27 May 2021 12:27:24 GMT
- Title: Learning Structures for Deep Neural Networks
- Authors: Jinhui Yuan and Fei Pan and Chunting Zhou and Tao Qin and Tie-Yan Liu
- Abstract summary: We propose to adopt the efficient coding principle, rooted in information theory and developed in computational neuroscience, to guide structure learning of deep neural networks without label information.
We show that sparse coding can effectively maximize the entropy of the output signals.
Our experiments on a public image classification dataset demonstrate that using the structure learned from scratch by our proposed algorithm, one can achieve a classification accuracy comparable to the best expert-designed structure.
- Score: 99.8331363309895
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we focus on the unsupervised setting for structure learning of
deep neural networks and propose to adopt the efficient coding principle,
rooted in information theory and developed in computational neuroscience, to
guide the procedure of structure learning without label information. This
principle suggests that a good network structure should maximize the mutual
information between inputs and outputs, or equivalently maximize the entropy of
outputs under mild assumptions. We further establish connections between this
principle and the theory of Bayesian optimal classification, and empirically
verify that larger entropy of the outputs of a deep neural network indeed
corresponds to a better classification accuracy. Then as an implementation of
the principle, we show that sparse coding can effectively maximize the entropy
of the output signals, and accordingly design an algorithm based on global
group sparse coding to automatically learn the inter-layer connection and
determine the depth of a neural network. Our experiments on a public image
classification dataset demonstrate that using the structure learned from
scratch by our proposed algorithm, one can achieve a classification accuracy
comparable to the best expert-designed structure (i.e., convolutional neural
networks (CNN)). In addition, our proposed algorithm successfully discovers the
local connectivity (corresponding to local receptive fields in CNN) and
invariance structure (corresponding to pooling in CNN), as well as achieves a
good tradeoff between marginal performance gain and network depth.
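Since the network is treated as a deterministic map, H(Y|X) = 0 and maximizing I(X;Y) = H(Y) - H(Y|X) indeed reduces to maximizing the output entropy H(Y), as the abstract states. To make the group-sparse selection step concrete, the snippet below is a minimal, hypothetical sketch rather than the authors' released algorithm: given the stacked responses of previously built layers, a group-lasso regression decides which of those layers should connect to a prospective new layer, because groups whose coefficient blocks are driven exactly to zero correspond to pruned inter-layer connections. The shapes, the regularization strength lam, and the toy data are illustrative assumptions.

```python
# Hypothetical sketch of learning inter-layer connections with group sparse coding.
# This is NOT the paper's implementation; names, shapes, and lam are assumptions.
import numpy as np

def group_lasso(X, Y, groups, lam=1.0, n_iter=1000):
    """Minimize 0.5*||Y - X @ W||_F^2 + lam * sum_g ||W[g]||_F by proximal gradient.

    X      : (n_samples, n_features) stacked responses of candidate source layers
    Y      : (n_samples, n_targets)  signal the prospective new layer should encode
    groups : list of row-index arrays, one block of rows of W per candidate layer
    """
    W = np.zeros((X.shape[1], Y.shape[1]))
    step = 1.0 / (np.linalg.norm(X, 2) ** 2)     # 1 / Lipschitz constant of the data term
    for _ in range(n_iter):
        W = W - step * (X.T @ (X @ W - Y))       # gradient step on the reconstruction loss
        for g in groups:                         # block soft-thresholding (the group prox)
            norm = np.linalg.norm(W[g])
            W[g] *= max(0.0, 1.0 - step * lam / (norm + 1e-12))
    return W

# Toy example: three candidate source layers with 8 features each; only layer 0
# actually drives the target, so its group should keep the only nonzero block.
rng = np.random.default_rng(0)
X = rng.standard_normal((256, 24))
W_true = np.zeros((24, 4))
W_true[:8] = rng.standard_normal((8, 4))
Y = X @ W_true + 0.01 * rng.standard_normal((256, 4))
groups = [np.arange(0, 8), np.arange(8, 16), np.arange(16, 24)]

W = group_lasso(X, Y, groups, lam=5.0)
kept = [i for i, g in enumerate(groups) if np.linalg.norm(W[g]) > 1e-6]
print("source layers kept as inter-layer connections:", kept)   # typically [0]
```

Repeating such a selection while adding candidate layers would, presumably, also determine depth: growth stops once a new layer attracts no connections or the marginal gain becomes negligible, which is consistent with the depth/performance tradeoff the abstract reports.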
Related papers
- Statistical Physics of Deep Neural Networks: Initialization toward Optimal Channels [6.144858413112823]
In deep learning, neural networks serve as noisy channels between input data and its representation.
We study a frequently overlooked possibility that neural networks can intrinsically tend toward optimal channels.
arXiv Detail & Related papers (2022-12-04T05:13:01Z)
- Rank Diminishing in Deep Neural Networks [71.03777954670323]
Rank of neural networks measures information flowing across layers.
It is an instance of a key structural condition that applies across broad domains of machine learning.
For neural networks, however, the intrinsic mechanism that yields low-rank structures remains vague and unclear.
arXiv Detail & Related papers (2022-06-13T12:03:32Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- The Principles of Deep Learning Theory [19.33681537640272]
This book develops an effective theory approach to understanding deep neural networks of practical relevance.
We explain how these effectively-deep networks learn nontrivial representations from training.
We show that the depth-to-width ratio governs the effective model complexity of the ensemble of trained networks.
arXiv Detail & Related papers (2021-06-18T15:00:00Z)
- Credit Assignment in Neural Networks through Deep Feedback Control [59.14935871979047]
Deep Feedback Control (DFC) is a new learning method that uses a feedback controller to drive a deep neural network to match a desired output target and whose control signal can be used for credit assignment.
The resulting learning rule is fully local in space and time and approximates Gauss-Newton optimization for a wide range of connectivity patterns.
To further underline its biological plausibility, we relate DFC to a multi-compartment model of cortical pyramidal neurons with a local voltage-dependent synaptic plasticity rule, consistent with recent theories of dendritic processing.
arXiv Detail & Related papers (2021-06-15T05:30:17Z)
- Generalized Approach to Matched Filtering using Neural Networks [4.535489275919893]
We make a key observation on the relationship between emerging deep learning methods and traditional techniques: matched filtering is formally equivalent to a particular neural network.
We show that the proposed neural network architecture can outperform matched filtering.
arXiv Detail & Related papers (2021-04-08T17:59:07Z)
- Analytically Tractable Inference in Deep Neural Networks [0.0]
The Tractable Approximate Gaussian Inference (TAGI) algorithm was shown to be a viable and scalable alternative to backpropagation for shallow fully-connected neural networks.
We demonstrate how TAGI matches or exceeds the performance of backpropagation for training classic deep neural network architectures.
arXiv Detail & Related papers (2021-03-09T14:51:34Z)
- Firefly Neural Architecture Descent: a General Approach for Growing Neural Networks [50.684661759340145]
Firefly neural architecture descent is a general framework for progressively and dynamically growing neural networks.
We show that firefly descent can flexibly grow networks both wider and deeper, and can be applied to learn accurate but resource-efficient neural architectures.
In particular, it learns networks that are smaller in size but have higher average accuracy than those learned by the state-of-the-art methods.
arXiv Detail & Related papers (2021-02-17T04:47:18Z)
- Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective that represents a network as a complete graph for analysis.
By assigning learnable parameters that reflect connection strength to the edges, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and adapts to larger search spaces and different tasks; a minimal sketch of the edge-gating idea follows this entry.
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
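The edge-weighting idea in the last entry can be sketched in a few lines. The snippet below is a hypothetical illustration, not that paper's code: every directed edge between layer nodes gets a learnable logit, and a sigmoid gate scales how much one node's output feeds the next, so the connectivity pattern itself receives gradients. Module names, widths, and the number of nodes are assumptions made only to keep the example small.

```python
# Hypothetical sketch of differentiable connectivity via learnable edge gates.
import torch
import torch.nn as nn

class GatedDAG(nn.Module):
    def __init__(self, n_nodes=4, width=16):
        super().__init__()
        # One transform per node and one learnable gate per incoming edge.
        self.blocks = nn.ModuleList([nn.Linear(width, width) for _ in range(n_nodes)])
        self.edge_logits = nn.Parameter(torch.zeros(n_nodes, n_nodes))

    def forward(self, x):
        outputs = [x]                                             # node 0's input is the data
        for i, block in enumerate(self.blocks):
            gates = torch.sigmoid(self.edge_logits[i, : i + 1])  # strengths of incoming edges
            agg = sum(g * h for g, h in zip(gates, outputs))      # gated sum over earlier nodes
            outputs.append(torch.relu(block(agg)))
        return outputs[-1]

model = GatedDAG()
y = model(torch.randn(8, 16))
y.sum().backward()                      # gradients reach the edge gates as well as the weights
print(model.edge_logits.grad.shape)     # torch.Size([4, 4])
```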