Approximation analysis of CNNs from a feature extraction view
- URL: http://arxiv.org/abs/2210.09041v2
- Date: Tue, 2 Jan 2024 14:12:23 GMT
- Title: Approximation analysis of CNNs from a feature extraction view
- Authors: Jianfei Li, Han Feng, Ding-Xuan Zhou
- Abstract summary: We establish an approximation analysis for linear feature extraction by deep multi-channel convolutional neural networks (CNNs).
We give an exact construction showing how linear feature extraction can be conducted efficiently with multi-channel CNNs.
Rates of function approximation by such deep multi-channel networks followed by fully-connected layers are investigated as well.
- Score: 8.94250977764275
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning based on deep neural networks has been very successful in many practical applications, but it still lacks a satisfactory theoretical understanding of its network architectures and structures. In this paper we establish an approximation analysis for linear feature extraction by deep multi-channel convolutional neural networks (CNNs), which demonstrates the power of deep learning over traditional linear transformations such as Fourier, wavelet, and redundant dictionary coding methods. Moreover, we give an exact construction showing how linear feature extraction can be conducted efficiently with multi-channel CNNs; it can be applied to lower the essential dimension for approximating a high-dimensional function. Rates of function approximation by such deep multi-channel networks followed by fully-connected layers are investigated as well. Harmonic analysis for factorizing linear features into multi-resolution convolutions plays an essential role in our work. In addition, a dedicated vectorization of matrices is constructed, which bridges 1D CNNs and 2D CNNs and allows us to carry out the corresponding 2D analysis.
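The construction can be pictured with a standard fact about convolutions: linear convolutions compose, so a deep CNN with linear activations and short filters realizes the same linear feature map as a single long filter, and conversely a long filter can be factorized into short ones. The NumPy sketch below is illustrative only (filter lengths and depth are arbitrary, and it is not the paper's exact construction); it checks the composition direction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Four short filters (length 3), each playing the role of one layer of a linear CNN.
short_filters = [rng.standard_normal(3) for _ in range(4)]

# Their composition is a single long filter of length 4*(3-1)+1 = 9.
long_filter = short_filters[0]
for f in short_filters[1:]:
    long_filter = np.convolve(long_filter, f)

x = rng.standard_normal(64)                   # 1D input signal

# "Deep linear CNN": apply the short filters layer by layer, with no nonlinearity.
h = x
for f in short_filters:
    h = np.convolve(h, f)

# Shallow linear feature extraction with the long filter in one step.
y = np.convolve(x, long_filter)

print(np.allclose(h, y))                      # True: the deep stack realizes the same linear map
```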
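The "vectorization of matrices" bridging 1D and 2D CNNs can be illustrated with the classical observation that a 2D convolution acts on the row-major vectorization of an image as a 1D convolution with a zero-padded filter, restricted to non-wrapping positions. The sketch below shows this principle only; it is not the paper's specific vectorization.

```python
import numpy as np
from scipy.signal import correlate, correlate2d

rng = np.random.default_rng(1)
H, W, kh, kw = 6, 8, 3, 3
img = rng.standard_normal((H, W))
ker = rng.standard_normal((kh, kw))

# The usual 2D CNN operation: 'valid' cross-correlation of the image with the kernel.
out2d = correlate2d(img, ker, mode="valid")              # shape (H-kh+1, W-kw+1)

# Row-major vectorization: embed the kernel rows into one long 1D filter, separated by zeros.
flat_ker = np.zeros((kh - 1) * W + kw)
for r in range(kh):
    flat_ker[r * W : r * W + kw] = ker[r]
flat_out = correlate(img.ravel(), flat_ker, mode="valid")

# Keep only the outputs whose sliding window does not wrap across an image row.
idx = np.array([[i * W + j for j in range(W - kw + 1)] for i in range(H - kh + 1)])
print(np.allclose(out2d, flat_out[idx]))                 # True: 2D conv realized as a 1D operation
```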
Related papers
- Theoretical characterisation of the Gauss-Newton conditioning in Neural Networks [5.851101657703105]
We take a first step towards theoretically characterizing the conditioning of the Gauss-Newton (GN) matrix in neural networks.
We establish tight bounds on the condition number of the GN matrix in deep linear networks of arbitrary depth and width.
We expand the analysis to further architectural components, such as residual connections and convolutional layers.
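As a rough illustration of the object being bounded (the paper's bounds are analytical and are not reproduced here), the sketch below forms the Gauss-Newton matrix of a two-layer linear network in closed form and reports the ratio of its extreme nonzero eigenvalues; the dimensions and tolerance are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hid, d_out, n = 5, 7, 3, 20

# Two-layer linear network f(x) = W2 @ W1 @ x, parameters theta = (W1.ravel(), W2.ravel()).
W1 = rng.standard_normal((d_hid, d_in))
W2 = rng.standard_normal((d_out, d_hid))
X = rng.standard_normal((n, d_in))

# Jacobian of the stacked outputs w.r.t. all weights (closed form, row-major vectorization):
# df/dW1 = kron(W2, x^T),  df/dW2 = kron(I, (W1 x)^T).
rows = []
for x in X:
    J1 = np.kron(W2, x[None, :])                      # (d_out, d_hid * d_in)
    J2 = np.kron(np.eye(d_out), (W1 @ x)[None, :])    # (d_out, d_out * d_hid)
    rows.append(np.hstack([J1, J2]))
J = np.vstack(rows)                                   # (n * d_out, n_params)

G = J.T @ J                                           # Gauss-Newton matrix for squared loss
s = np.linalg.svd(J, compute_uv=False)
s_nz = s[s > 1e-10 * s[0]]                            # G is singular (parameter symmetries), so
print((s_nz[0] / s_nz[-1]) ** 2)                      # report the ratio of extreme nonzero eigenvalues
```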
arXiv Detail & Related papers (2024-11-04T14:56:48Z)
- Convergence Analysis for Deep Sparse Coding via Convolutional Neural Networks [7.956678963695681]
We introduce a novel class of Deep Sparse Coding (DSC) models.
We derive convergence rates for CNNs in their ability to extract sparse features.
Inspired by the strong connection between sparse coding and CNNs, we explore training strategies to encourage neural networks to learn more sparse features.
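The sparse coding/CNN connection is commonly illustrated by unrolling ISTA, whose iterations are affine maps followed by a pointwise nonlinearity, i.e. network layers with tied weights (LISTA learns these weights). The sketch below is that generic construction, not the paper's DSC model; the dictionary size, sparsity level, and regularization weight are placeholders.

```python
import numpy as np

def soft_threshold(z, lam):
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

rng = np.random.default_rng(0)
m, k = 20, 50                                    # signal dimension, dictionary size
D = rng.standard_normal((m, k))
D /= np.linalg.norm(D, axis=0)                   # unit-norm atoms

z_true = np.zeros(k)                             # synthesize a 4-sparse code
z_true[rng.choice(k, 4, replace=False)] = rng.standard_normal(4)
x = D @ z_true

# ISTA for min_z 0.5*||x - D z||^2 + lam*||z||_1: each iteration is an affine map
# followed by a pointwise nonlinearity, i.e. one "layer" with tied weights.
L = np.linalg.norm(D, 2) ** 2                    # Lipschitz constant of the smooth part's gradient
lam = 0.05
z = np.zeros(k)
for _ in range(200):
    z = soft_threshold(z + (D.T @ (x - D @ z)) / L, lam / L)

print("recovered support:", np.flatnonzero(np.abs(z) > 1e-3))
print("true support:     ", np.flatnonzero(z_true))
```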
arXiv Detail & Related papers (2024-08-10T12:43:55Z)
- TCCT-Net: Two-Stream Network Architecture for Fast and Efficient Engagement Estimation via Behavioral Feature Signals [58.865901821451295]
We present a novel two-stream feature fusion "Tensor-Convolution and Convolution-Transformer Network" (TCCT-Net) architecture.
To better learn the meaningful patterns in the temporal-spatial domain, we design a "CT" stream that integrates a hybrid convolutional-transformer architecture.
In parallel, to efficiently extract rich patterns from the temporal-frequency domain, we introduce a "TC" stream that uses Continuous Wavelet Transform (CWT) to represent information in a 2D tensor form.
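As an illustration of how a CWT turns a 1D behavioral signal into a 2D (scale x time) tensor that a convolutional stream can consume, here is a minimal NumPy approximation with a real Morlet wavelet; the sampling rate, scales, and wavelet are placeholders, not TCCT-Net's actual settings.

```python
import numpy as np

def morlet(t, scale, w0=6.0):
    """Real-valued Morlet wavelet stretched to the given scale."""
    u = t / scale
    return np.cos(w0 * u) * np.exp(-0.5 * u ** 2) / np.sqrt(scale)

fs = 128.0
t = np.arange(0, 4, 1 / fs)
# Toy behavioral-style signal whose frequency content changes halfway through.
signal = np.sin(2 * np.pi * 5 * t) + (t > 2) * np.sin(2 * np.pi * 20 * t)

scales = np.arange(1, 33)
support = np.arange(-64, 65) / fs
# One row per scale: convolve the signal with the scaled wavelet.
tf_tensor = np.stack([np.convolve(signal, morlet(support, s), mode="same") for s in scales])

print(tf_tensor.shape)   # (32, 512): a (scale x time) 2D tensor ready for a convolutional stream
```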
arXiv Detail & Related papers (2024-04-15T06:01:48Z)
- From Alexnet to Transformers: Measuring the Non-linearity of Deep Neural Networks with Affine Optimal Transport [32.39176908225668]
We introduce the concept of the non-linearity signature of DNNs, the first theoretically sound solution for measuring the non-linearity of deep neural networks.
We provide extensive experimental results that highlight the practical usefulness of the proposed non-linearity signature.
arXiv Detail & Related papers (2023-10-17T17:50:22Z)
- Understanding Deep Neural Networks via Linear Separability of Hidden Layers [68.23950220548417]
We first propose Minkowski difference based linear separability measures (MD-LSMs) to evaluate the linear separability degree of two point sets.
We demonstrate that there is a synchronicity between the linear separability degree of hidden layer outputs and the network training performance.
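The Minkowski-difference view of separability underlying MD-LSMs can be sketched as follows: two finite sets are strictly linearly separable exactly when some direction has positive inner product with every pairwise difference (equivalently, the origin lies outside the convex hull of the differences). The check below uses a small linear program and only illustrates that characterization; it is not the paper's measure, and the point clouds are arbitrary.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
A = rng.standard_normal((30, 2)) + 5.0           # point set 1 (shifted away from point set 2)
B = rng.standard_normal((30, 2))                 # point set 2

# Minkowski difference: all pairwise differences a - b.
D = (A[:, None, :] - B[None, :, :]).reshape(-1, 2)

# Strict linear separability of A and B  <=>  some w has w . d >= 1 for every difference d.
res = linprog(c=np.zeros(2), A_ub=-D, b_ub=-np.ones(len(D)), bounds=[(None, None)] * 2)
print("linearly separable:", res.success)
```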
arXiv Detail & Related papers (2023-07-26T05:29:29Z)
- Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural Networks [49.808194368781095]
We show that three-layer neural networks have provably richer feature learning capabilities than two-layer networks.
This work makes progress towards understanding the provable benefit of three-layer neural networks over two-layer networks in the feature learning regime.
arXiv Detail & Related papers (2023-05-11T17:19:30Z)
- When Deep Learning Meets Polyhedral Theory: A Survey [6.899761345257773]
In the past decade, deep learning became the prevalent methodology for predictive modeling thanks to the remarkable accuracy of deep neural networks.
Meanwhile, the structure of neural networks converged back to simpler representations based on piecewise linear functions.
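The survey's starting point is that ReLU networks compute piecewise linear functions. A small sketch of this fact (sizes arbitrary): fix the activation pattern at a point, read off the induced affine map, and verify it matches the network on a nearby input.

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((16, 4)), rng.standard_normal(16)
W2, b2 = rng.standard_normal((1, 16)), rng.standard_normal(1)

def net(x):
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

x0 = rng.standard_normal(4)
mask = (W1 @ x0 + b1 > 0).astype(float)          # activation pattern at x0

# Inside the polyhedral region where this pattern is constant, the network is this affine map.
A = W2 @ (mask[:, None] * W1)
c = W2 @ (mask * b1) + b2

x1 = x0 + 1e-3 * rng.standard_normal(4)          # a nearby point, almost surely in the same region
print(np.allclose(net(x1), A @ x1 + c))          # True: the network is locally affine
```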
arXiv Detail & Related papers (2023-04-29T11:46:53Z)
- Knowledge Distillation Circumvents Nonlinearity for Optical Convolutional Neural Networks [4.683612295430957]
We propose a Spectral CNN Linear Counterpart (SCLC) network architecture and develop a Knowledge Distillation (KD) approach to circumvent the need for a nonlinearity.
We show that the KD approach can achieve performance that easily surpasses the standard linear version of a CNN and could approach the performance of the nonlinear network.
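The distillation step can be pictured with the generic knowledge-distillation objective (temperature-softened teacher targets mixed with hard labels). The SCLC architecture and the paper's exact loss are not reproduced here; the temperature and mixing weight below are placeholders.

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Soft-target cross-entropy (temperature T) mixed with hard-label cross-entropy."""
    soft = -(softmax(teacher_logits, T) * np.log(softmax(student_logits, T))).sum(-1).mean() * T * T
    log_p = np.log(softmax(student_logits))
    hard = -log_p[np.arange(len(labels)), labels].mean()
    return alpha * soft + (1 - alpha) * hard

rng = np.random.default_rng(0)
teacher_logits = 3 * rng.standard_normal((8, 10))   # e.g. outputs of the nonlinear teacher CNN
student_logits = rng.standard_normal((8, 10))       # e.g. outputs of the linear, nonlinearity-free student
labels = rng.integers(0, 10, size=8)
print(kd_loss(student_logits, teacher_logits, labels))
```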
arXiv Detail & Related papers (2021-02-26T06:35:34Z)
- Connecting Weighted Automata, Tensor Networks and Recurrent Neural Networks through Spectral Learning [58.14930566993063]
We present connections between three models used in different research fields: weighted finite automata(WFA) from formal languages and linguistics, recurrent neural networks used in machine learning, and tensor networks.
We introduce the first provable learning algorithm for linear 2-RNNs defined over sequences of continuous input vectors.
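A linear 2-RNN (second-order RNN) updates its state bilinearly in the previous state and the current input through an order-3 tensor. Below is a minimal forward-pass sketch of that model class; the dimensions are arbitrary and the paper's spectral learning algorithm is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
d_h, d_x, d_y, T = 4, 3, 2, 6

alpha = rng.standard_normal(d_h)             # initial state
A = rng.standard_normal((d_h, d_x, d_h))     # order-3 transition tensor
Omega = rng.standard_normal((d_y, d_h))      # linear output map

def linear_2rnn(xs):
    """State update is bilinear in (current input, previous state); no nonlinearity."""
    h = alpha
    for x in xs:
        h = np.einsum("ijk,j,k->i", A, x, h)
    return Omega @ h

xs = rng.standard_normal((T, d_x))
print(linear_2rnn(xs))
```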
arXiv Detail & Related papers (2020-10-19T15:28:00Z)
- How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks [80.55378250013496]
We study how neural networks trained by gradient descent extrapolate what they learn outside the support of the training distribution.
Graph Neural Networks (GNNs) have shown some success in more complex tasks.
arXiv Detail & Related papers (2020-09-24T17:48:59Z)
- Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective that represents a network as a complete graph for analysis.
By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and adapts to larger search spaces and different tasks.
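The mechanism described, one learnable magnitude per directed edge of a complete graph over computational nodes, aggregated differentiably, can be sketched as a toy DAG forward pass. This is only an illustration of the idea; the node operations, gating function, and sizes are assumptions, not the paper's scheme.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_nodes, d = 5, 8                                    # computational nodes, feature width
theta = rng.standard_normal((n_nodes, n_nodes))      # one learnable parameter per directed edge
W = [rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(n_nodes)]

def forward(x0):
    """Nodes in topological order; each aggregates its predecessors with learnable edge weights."""
    feats = [x0]
    for i in range(1, n_nodes):
        gate = sigmoid(theta[i, :i])                      # connection magnitudes from earlier nodes
        agg = sum(g * f for g, f in zip(gate, feats))     # differentiable weighted aggregation
        feats.append(np.maximum(W[i] @ agg, 0.0))         # node operation (linear map + ReLU)
    return feats[-1]

print(forward(rng.standard_normal(d)).shape)
```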
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.