Approximation analysis of CNNs from a feature extraction view
- URL: http://arxiv.org/abs/2210.09041v2
- Date: Tue, 2 Jan 2024 14:12:23 GMT
- Title: Approximation analysis of CNNs from a feature extraction view
- Authors: Jianfei Li, Han Feng, Ding-Xuan Zhou
- Abstract summary: We establish an approximation analysis for linear feature extraction by deep multi-channel convolutional neural networks (CNNs).
We give an exact construction showing how linear feature extraction can be conducted efficiently with multi-channel CNNs.
Rates of function approximation by such deep multi-channel networks followed by fully-connected layers are investigated as well.
- Score: 8.94250977764275
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning based on deep neural networks has been very successful in many practical applications, but it still lacks a satisfactory theoretical understanding of its network architectures and structures. In this paper we establish an approximation analysis for linear feature extraction by deep multi-channel convolutional neural networks (CNNs), which demonstrates the power of deep learning over traditional linear transformations such as Fourier, wavelet, and redundant dictionary coding methods. Moreover, we give an exact construction showing how linear feature extraction can be conducted efficiently with multi-channel CNNs; it can be applied to lower the essential dimension for approximating a high-dimensional function. Rates of function approximation by such deep multi-channel networks followed by fully-connected layers are investigated as well. Harmonic analysis for factorizing linear features into multi-resolution convolutions plays an essential role in our work. In addition, a dedicated vectorization of matrices is constructed, which bridges 1D CNNs and 2D CNNs and allows us to carry out the corresponding 2D analysis.
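The construction can be pictured with a standard fact about convolutions: linear convolutions compose, so a deep CNN with linear activations and short filters realizes the same linear feature map as a single long filter, and conversely a long filter can be factorized into short ones. The NumPy sketch below is illustrative only (filter lengths and depth are arbitrary, and it is not the paper's exact construction); it checks the composition direction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Four short filters (length 3), each playing the role of one layer of a linear CNN.
short_filters = [rng.standard_normal(3) for _ in range(4)]

# Their composition is a single long filter of length 4*(3-1)+1 = 9.
long_filter = short_filters[0]
for f in short_filters[1:]:
    long_filter = np.convolve(long_filter, f)

x = rng.standard_normal(64)                   # 1D input signal

# "Deep linear CNN": apply the short filters layer by layer, with no nonlinearity.
h = x
for f in short_filters:
    h = np.convolve(h, f)

# Shallow linear feature extraction with the long filter in one step.
y = np.convolve(x, long_filter)

print(np.allclose(h, y))                      # True: the deep stack realizes the same linear map
```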
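The "vectorization of matrices" bridging 1D and 2D CNNs can be illustrated with the classical observation that a 2D convolution acts on the row-major vectorization of an image as a 1D convolution with a zero-padded filter, restricted to non-wrapping positions. The sketch below shows this principle only; it is not the paper's specific vectorization.

```python
import numpy as np
from scipy.signal import correlate, correlate2d

rng = np.random.default_rng(1)
H, W, kh, kw = 6, 8, 3, 3
img = rng.standard_normal((H, W))
ker = rng.standard_normal((kh, kw))

# The usual 2D CNN operation: 'valid' cross-correlation of the image with the kernel.
out2d = correlate2d(img, ker, mode="valid")              # shape (H-kh+1, W-kw+1)

# Row-major vectorization: embed the kernel rows into one long 1D filter, separated by zeros.
flat_ker = np.zeros((kh - 1) * W + kw)
for r in range(kh):
    flat_ker[r * W : r * W + kw] = ker[r]
flat_out = correlate(img.ravel(), flat_ker, mode="valid")

# Keep only the outputs whose sliding window does not wrap across an image row.
idx = np.array([[i * W + j for j in range(W - kw + 1)] for i in range(H - kh + 1)])
print(np.allclose(out2d, flat_out[idx]))                 # True: 2D conv realized as a 1D operation
```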
Related papers
- Theoretical characterisation of the Gauss-Newton conditioning in Neural Networks [5.851101657703105]
We take a first step towards theoretically characterizing the conditioning of the Gauss-Newton (GN) matrix in neural networks.
We establish tight bounds on the condition number of the GN matrix in deep linear networks of arbitrary depth and width.
We expand the analysis to further architectural components, such as residual connections and convolutional layers.
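As a rough illustration of the object being bounded (the paper's bounds are analytical and are not reproduced here), the sketch below forms the Gauss-Newton matrix of a two-layer linear network in closed form and reports the ratio of its extreme nonzero eigenvalues; the dimensions and tolerance are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hid, d_out, n = 5, 7, 3, 20

# Two-layer linear network f(x) = W2 @ W1 @ x, parameters theta = (W1.ravel(), W2.ravel()).
W1 = rng.standard_normal((d_hid, d_in))
W2 = rng.standard_normal((d_out, d_hid))
X = rng.standard_normal((n, d_in))

# Jacobian of the stacked outputs w.r.t. all weights (closed form, row-major vectorization):
# df/dW1 = kron(W2, x^T),  df/dW2 = kron(I, (W1 x)^T).
rows = []
for x in X:
    J1 = np.kron(W2, x[None, :])                      # (d_out, d_hid * d_in)
    J2 = np.kron(np.eye(d_out), (W1 @ x)[None, :])    # (d_out, d_out * d_hid)
    rows.append(np.hstack([J1, J2]))
J = np.vstack(rows)                                   # (n * d_out, n_params)

G = J.T @ J                                           # Gauss-Newton matrix for squared loss
s = np.linalg.svd(J, compute_uv=False)
s_nz = s[s > 1e-10 * s[0]]                            # G is singular (parameter symmetries), so
print((s_nz[0] / s_nz[-1]) ** 2)                      # report the ratio of extreme nonzero eigenvalues
```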
arXiv Detail & Related papers (2024-11-04T14:56:48Z)
- Convergence Analysis for Deep Sparse Coding via Convolutional Neural Networks [7.956678963695681]
We introduce a novel class of Deep Sparse Coding (DSC) models.
We derive convergence rates for CNNs in their ability to extract sparse features.
Inspired by the strong connection between sparse coding and CNNs, we explore training strategies to encourage neural networks to learn more sparse features.
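The sparse coding/CNN connection is commonly illustrated by unrolling ISTA, whose iterations are affine maps followed by a pointwise nonlinearity, i.e. network layers with tied weights (LISTA learns these weights). The sketch below is that generic construction, not the paper's DSC model; the dictionary size, sparsity level, and regularization weight are placeholders.

```python
import numpy as np

def soft_threshold(z, lam):
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

rng = np.random.default_rng(0)
m, k = 20, 50                                    # signal dimension, dictionary size
D = rng.standard_normal((m, k))
D /= np.linalg.norm(D, axis=0)                   # unit-norm atoms

z_true = np.zeros(k)                             # synthesize a 4-sparse code
z_true[rng.choice(k, 4, replace=False)] = rng.standard_normal(4)
x = D @ z_true

# ISTA for min_z 0.5*||x - D z||^2 + lam*||z||_1: each iteration is an affine map
# followed by a pointwise nonlinearity, i.e. one "layer" with tied weights.
L = np.linalg.norm(D, 2) ** 2                    # Lipschitz constant of the smooth part's gradient
lam = 0.05
z = np.zeros(k)
for _ in range(200):
    z = soft_threshold(z + (D.T @ (x - D @ z)) / L, lam / L)

print("recovered support:", np.flatnonzero(np.abs(z) > 1e-3))
print("true support:     ", np.flatnonzero(z_true))
```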
arXiv Detail & Related papers (2024-08-10T12:43:55Z)
- TCCT-Net: Two-Stream Network Architecture for Fast and Efficient Engagement Estimation via Behavioral Feature Signals [58.865901821451295]
We present a novel two-stream feature fusion "Tensor-Convolution and Convolution-Transformer Network" (TCCT-Net) architecture.
To better learn the meaningful patterns in the temporal-spatial domain, we design a "CT" stream that integrates a hybrid convolutional-transformer architecture.
In parallel, to efficiently extract rich patterns from the temporal-frequency domain, we introduce a "TC" stream that uses Continuous Wavelet Transform (CWT) to represent information in a 2D tensor form.
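As an illustration of how a CWT turns a 1D behavioral signal into a 2D (scale x time) tensor that a convolutional stream can consume, here is a minimal NumPy approximation with a real Morlet wavelet; the sampling rate, scales, and wavelet are placeholders, not TCCT-Net's actual settings.

```python
import numpy as np

def morlet(t, scale, w0=6.0):
    """Real-valued Morlet wavelet stretched to the given scale."""
    u = t / scale
    return np.cos(w0 * u) * np.exp(-0.5 * u ** 2) / np.sqrt(scale)

fs = 128.0
t = np.arange(0, 4, 1 / fs)
# Toy behavioral-style signal whose frequency content changes halfway through.
signal = np.sin(2 * np.pi * 5 * t) + (t > 2) * np.sin(2 * np.pi * 20 * t)

scales = np.arange(1, 33)
support = np.arange(-64, 65) / fs
# One row per scale: convolve the signal with the scaled wavelet.
tf_tensor = np.stack([np.convolve(signal, morlet(support, s), mode="same") for s in scales])

print(tf_tensor.shape)   # (32, 512): a (scale x time) 2D tensor ready for a convolutional stream
```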
arXiv Detail & Related papers (2024-04-15T06:01:48Z)
- From Alexnet to Transformers: Measuring the Non-linearity of Deep Neural Networks with Affine Optimal Transport [32.39176908225668]
We introduce the concept of the non-linearity signature of DNNs, the first theoretically sound solution for measuring the non-linearity of deep neural networks.
We provide extensive experimental results that highlight the practical usefulness of the proposed non-linearity signature.
arXiv Detail & Related papers (2023-10-17T17:50:22Z)
- Understanding Deep Neural Networks via Linear Separability of Hidden Layers [68.23950220548417]
We first propose Minkowski difference based linear separability measures (MD-LSMs) to evaluate the linear separability degree of two point sets.
We demonstrate that there is a synchronicity between the linear separability degree of hidden layer outputs and the network training performance.
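The Minkowski-difference view of separability underlying MD-LSMs can be sketched as follows: two finite sets are strictly linearly separable exactly when some direction has positive inner product with every pairwise difference (equivalently, the origin lies outside the convex hull of the differences). The check below uses a small linear program and only illustrates that characterization; it is not the paper's measure, and the point clouds are arbitrary.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
A = rng.standard_normal((30, 2)) + 5.0           # point set 1 (shifted away from point set 2)
B = rng.standard_normal((30, 2))                 # point set 2

# Minkowski difference: all pairwise differences a - b.
D = (A[:, None, :] - B[None, :, :]).reshape(-1, 2)

# Strict linear separability of A and B  <=>  some w has w . d >= 1 for every difference d.
res = linprog(c=np.zeros(2), A_ub=-D, b_ub=-np.ones(len(D)), bounds=[(None, None)] * 2)
print("linearly separable:", res.success)
```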
arXiv Detail & Related papers (2023-07-26T05:29:29Z)
- Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural Networks [49.808194368781095]
We show that three-layer neural networks have provably richer feature learning capabilities than two-layer networks.
This work makes progress towards understanding the provable benefit of three-layer neural networks over two-layer networks in the feature learning regime.
arXiv Detail & Related papers (2023-05-11T17:19:30Z)
- When Deep Learning Meets Polyhedral Theory: A Survey [6.899761345257773]
In the past decade, deep learning became the prevalent methodology for predictive modeling thanks to the remarkable accuracy of deep neural networks.
Meanwhile, the structure of neural networks converged back to simpler representations based on piecewise linear functions.
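The survey's starting point is that ReLU networks compute piecewise linear functions. A small sketch of this fact (sizes arbitrary): fix the activation pattern at a point, read off the induced affine map, and verify it matches the network on a nearby input.

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((16, 4)), rng.standard_normal(16)
W2, b2 = rng.standard_normal((1, 16)), rng.standard_normal(1)

def net(x):
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

x0 = rng.standard_normal(4)
mask = (W1 @ x0 + b1 > 0).astype(float)          # activation pattern at x0

# Inside the polyhedral region where this pattern is constant, the network is this affine map.
A = W2 @ (mask[:, None] * W1)
c = W2 @ (mask * b1) + b2

x1 = x0 + 1e-3 * rng.standard_normal(4)          # a nearby point, almost surely in the same region
print(np.allclose(net(x1), A @ x1 + c))          # True: the network is locally affine
```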
arXiv Detail & Related papers (2023-04-29T11:46:53Z)
- Knowledge Distillation Circumvents Nonlinearity for Optical Convolutional Neural Networks [4.683612295430957]
We propose a Spectral CNN Linear Counterpart (SCLC) network architecture and develop a Knowledge Distillation (KD) approach to circumvent the need for a nonlinearity.
We show that the KD approach can achieve performance that easily surpasses the standard linear version of a CNN and could approach the performance of the nonlinear network.
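The distillation step can be pictured with the generic knowledge-distillation objective (temperature-softened teacher targets mixed with hard labels). The SCLC architecture and the paper's exact loss are not reproduced here; the temperature and mixing weight below are placeholders.

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Soft-target cross-entropy (temperature T) mixed with hard-label cross-entropy."""
    soft = -(softmax(teacher_logits, T) * np.log(softmax(student_logits, T))).sum(-1).mean() * T * T
    log_p = np.log(softmax(student_logits))
    hard = -log_p[np.arange(len(labels)), labels].mean()
    return alpha * soft + (1 - alpha) * hard

rng = np.random.default_rng(0)
teacher_logits = 3 * rng.standard_normal((8, 10))   # e.g. outputs of the nonlinear teacher CNN
student_logits = rng.standard_normal((8, 10))       # e.g. outputs of the linear, nonlinearity-free student
labels = rng.integers(0, 10, size=8)
print(kd_loss(student_logits, teacher_logits, labels))
```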
arXiv Detail & Related papers (2021-02-26T06:35:34Z)
- Connecting Weighted Automata, Tensor Networks and Recurrent Neural Networks through Spectral Learning [58.14930566993063]
We present connections between three models used in different research fields: weighted finite automata(WFA) from formal languages and linguistics, recurrent neural networks used in machine learning, and tensor networks.
We introduce the first provable learning algorithm for linear 2-RNNs defined over sequences of continuous input vectors.
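A linear 2-RNN (second-order RNN) updates its state bilinearly in the previous state and the current input through an order-3 tensor. Below is a minimal forward-pass sketch of that model class; the dimensions are arbitrary and the paper's spectral learning algorithm is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
d_h, d_x, d_y, T = 4, 3, 2, 6

alpha = rng.standard_normal(d_h)             # initial state
A = rng.standard_normal((d_h, d_x, d_h))     # order-3 transition tensor
Omega = rng.standard_normal((d_y, d_h))      # linear output map

def linear_2rnn(xs):
    """State update is bilinear in (current input, previous state); no nonlinearity."""
    h = alpha
    for x in xs:
        h = np.einsum("ijk,j,k->i", A, x, h)
    return Omega @ h

xs = rng.standard_normal((T, d_x))
print(linear_2rnn(xs))
```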
arXiv Detail & Related papers (2020-10-19T15:28:00Z)
- How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks [80.55378250013496]
We study how neural networks trained by gradient descent extrapolate what they learn outside the support of the training distribution.
Graph Neural Networks (GNNs) have shown some success in more complex tasks.
arXiv Detail & Related papers (2020-09-24T17:48:59Z)
- Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective that represents a network as a complete graph for analysis.
By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and adapts to larger search spaces and different tasks.
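The mechanism described, one learnable magnitude per directed edge of a complete graph over computational nodes, aggregated differentiably, can be sketched as a toy DAG forward pass. This is only an illustration of the idea; the node operations, gating function, and sizes are assumptions, not the paper's scheme.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_nodes, d = 5, 8                                    # computational nodes, feature width
theta = rng.standard_normal((n_nodes, n_nodes))      # one learnable parameter per directed edge
W = [rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(n_nodes)]

def forward(x0):
    """Nodes in topological order; each aggregates its predecessors with learnable edge weights."""
    feats = [x0]
    for i in range(1, n_nodes):
        gate = sigmoid(theta[i, :i])                      # connection magnitudes from earlier nodes
        agg = sum(g * f for g, f in zip(gate, feats))     # differentiable weighted aggregation
        feats.append(np.maximum(W[i] @ agg, 0.0))         # node operation (linear map + ReLU)
    return feats[-1]

print(forward(rng.standard_normal(d)).shape)
```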
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.