On the Universal Approximation Property of Deep Fully Convolutional
Neural Networks
- URL: http://arxiv.org/abs/2211.14047v2
- Date: Thu, 18 May 2023 02:40:10 GMT
- Title: On the Universal Approximation Property of Deep Fully Convolutional
Neural Networks
- Authors: Ting Lin, Zuowei Shen, Qianxiao Li
- Abstract summary: We prove that deep residual fully convolutional networks and their continuous-layer counterpart can achieve universal approximation of symmetric functions at constant channel width.
We show that these requirements are necessary, in the sense that networks with fewer channels or smaller kernels fail to be universal approximators.
- Score: 15.716533830931766
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the approximation of shift-invariant or equivariant functions by
deep fully convolutional networks from the dynamical systems perspective. We
prove that deep residual fully convolutional networks and their
continuous-layer counterpart can achieve universal approximation of these
symmetric functions at constant channel width. Moreover, we show that the same
can be achieved by non-residual variants with at least 2 channels in each layer
and convolutional kernel size of at least 2. In addition, we show that these
requirements are necessary, in the sense that networks with fewer channels or
smaller kernels fail to be universal approximators.
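As a concrete (but not the paper's own) illustration of the architecture class in question, the sketch below builds a deep residual fully convolutional network on 1D sequences from circular convolutions, so every layer is equivariant to cyclic shifts, with constant channel width 2 and kernel size 2 throughout; all class names are made up for this example.

```python
# Minimal sketch (not the paper's construction) of a deep residual fully
# convolutional network: 1D cyclic convolutions, constant channel width,
# every layer equivariant to cyclic shifts of the input sequence.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ResidualConvBlock(nn.Module):
    def __init__(self, channels: int = 2, kernel_size: int = 2):
        super().__init__()
        self.k = kernel_size
        self.conv1 = nn.Conv1d(channels, channels, kernel_size)
        self.conv2 = nn.Conv1d(channels, channels, kernel_size)

    def _cconv(self, conv, x):
        # Circular padding + valid convolution = cyclic convolution,
        # which preserves the length and commutes with cyclic shifts.
        return conv(F.pad(x, (self.k - 1, 0), mode="circular"))

    def forward(self, x):
        return x + self._cconv(self.conv2, torch.relu(self._cconv(self.conv1, x)))


class DeepResidualFCN(nn.Module):
    """Residual blocks stacked at constant width; depth is the free resource."""

    def __init__(self, depth: int = 16, channels: int = 2, kernel_size: int = 2):
        super().__init__()
        self.blocks = nn.Sequential(
            *[ResidualConvBlock(channels, kernel_size) for _ in range(depth)]
        )

    def forward(self, x):  # x: (batch, channels, length)
        return self.blocks(x)


if __name__ == "__main__":
    net = DeepResidualFCN()
    x = torch.randn(4, 2, 32)
    # Equivariance check: cyclically shifting the input shifts the output.
    err = (torch.roll(net(x), 5, dims=-1) - net(torch.roll(x, 5, dims=-1))).abs().max()
    print(net(x).shape, float(err))  # error should be ~1e-7 (float32 round-off)
```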
Related papers
- A Hybrid Transformer-Mamba Network for Single Image Deraining [70.64069487982916]
Existing deraining Transformers employ self-attention mechanisms with fixed-range windows or along channel dimensions.
We introduce a novel dual-branch hybrid Transformer-Mamba network, denoted as TransMamba, aimed at effectively capturing long-range rain-related dependencies.
arXiv Detail & Related papers (2024-08-31T10:03:19Z) - Chiral excitation flows of multinode network based on synthetic gauge
fields [0.0]
Chiral excitation flows have drawn a lot of attention for their unique unidirectionality.
Such flows have been studied in three-node networks with synthetic gauge fields (SGFs).
We propose a scheme to achieve chiral flows in $n$-node networks, where an auxiliary node is introduced to govern the system.
arXiv Detail & Related papers (2023-12-04T16:31:02Z) - Tensor Programs VI: Feature Learning in Infinite-Depth Neural Networks [42.14352997147652]
We investigate the analogous classification for *depthwise parametrizations* of deep residual networks (resnets)
In resnets where each block has only one layer, we identify a unique optimal parametrization, called Depth-$\mu$P.
We find that Depth-$\mu$P can be characterized as maximizing both feature learning and feature diversity.
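A rough sketch of one ingredient commonly associated with Depth-$\mu$P in resnets with one-layer blocks: each residual branch is scaled by $1/\sqrt{\text{depth}}$ so that block contributions stay of comparable size as depth grows. The full parametrization also prescribes depth-dependent learning-rate scalings, omitted here; the class below is an illustrative stand-in.

```python
# Sketch of the 1/sqrt(depth) residual-branch multiplier often associated
# with Depth-muP; learning-rate scaling rules are not shown.
import math
import torch
import torch.nn as nn


class DepthScaledResNet(nn.Module):
    def __init__(self, width: int = 64, depth: int = 128):
        super().__init__()
        self.scale = 1.0 / math.sqrt(depth)   # depth-dependent branch multiplier
        self.layers = nn.ModuleList(
            nn.Linear(width, width, bias=False) for _ in range(depth)
        )

    def forward(self, x):
        for layer in self.layers:
            x = x + self.scale * torch.tanh(layer(x))
        return x


if __name__ == "__main__":
    for depth in (8, 128, 1024):
        out = DepthScaledResNet(depth=depth)(torch.randn(32, 64))
        # Output scale stays comparable across very different depths.
        print(depth, round(out.std().item(), 3))
```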
arXiv Detail & Related papers (2023-10-03T17:50:40Z) - On the Effective Number of Linear Regions in Shallow Univariate ReLU
Networks: Convergence Guarantees and Implicit Bias [50.84569563188485]
We show that gradient flow converges in direction when labels are determined by the sign of a target network with $r$ neurons.
Our result holds already for mild over-parameterization, where the width is $\tilde{\mathcal{O}}(r)$ and independent of the sample size.
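To make the object in the title concrete, the snippet below (illustrative only, not from the paper) evaluates a one-hidden-layer univariate ReLU network $f(x) = \sum_i v_i\,\mathrm{relu}(w_i x + b_i)$ and counts its effective linear regions by checking at which candidate breakpoints $-b_i/w_i$ the slope actually changes.

```python
# Illustration only: count effective linear regions of a shallow univariate
# ReLU network. Breakpoints can only sit at x = -b_i / w_i; a region boundary
# is "effective" when the slope actually changes across it.
import numpy as np

rng = np.random.default_rng(0)
r = 8                                    # number of hidden neurons
w, b, v = rng.normal(size=(3, r))

def f(x):                                # vectorized network evaluation
    return np.maximum(np.outer(x, w) + b, 0.0) @ v

breaks = np.sort(-b / w)                 # candidate breakpoints (w_i != 0 here)
# Probe one point inside each candidate region and compare slopes.
probes = np.concatenate(([breaks[0] - 1.0],
                         (breaks[:-1] + breaks[1:]) / 2.0,
                         [breaks[-1] + 1.0]))
eps = 1e-6
slopes = (f(probes + eps) - f(probes)) / eps
# With generic random weights nearly every breakpoint changes the slope.
effective = 1 + int((np.abs(np.diff(slopes)) > 1e-8).sum())
print("candidate regions:", r + 1, "| effective regions:", effective)
```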
arXiv Detail & Related papers (2022-05-18T16:57:10Z) - The Sample Complexity of One-Hidden-Layer Neural Networks [57.6421258363243]
We study a class of scalar-valued one-hidden-layer networks, and inputs bounded in Euclidean norm.
We prove that controlling the spectral norm of the hidden layer weight matrix is insufficient to get uniform convergence guarantees.
We analyze two important settings where a mere spectral norm control turns out to be sufficient.
arXiv Detail & Related papers (2022-02-13T07:12:02Z) - Channel redundancy and overlap in convolutional neural networks with
channel-wise NNK graphs [36.479195100553085]
Feature spaces in the deep layers of convolutional neural networks (CNNs) are often very high-dimensional and difficult to interpret.
We theoretically analyze channel-wise non-negative kernel (CW-NNK) regression graphs to quantify the overlap between channels.
We find that redundancy between channels is significant and varies with the layer depth and the level of regularization.
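As a toy proxy for channel overlap (not the CW-NNK graph construction itself), the sketch below regresses each channel's flattened activation map on the remaining channels with non-negative least squares; a small relative residual indicates the channel is largely explained by the others. The activations here are synthetic.

```python
# Toy proxy for channel redundancy (not the CW-NNK construction): regress
# each channel on the others with non-negative least squares and report the
# relative residual; near-zero residual means the channel is redundant.
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
C, H, W = 8, 14, 14
acts = rng.normal(size=(C, H * W))
acts[3] = 0.7 * acts[1] + 0.3 * acts[5]      # plant an obviously redundant channel

for c in range(C):
    others = np.delete(acts, c, axis=0).T    # (H*W, C-1) design matrix
    coef, resid = nnls(others, acts[c])
    rel = resid / np.linalg.norm(acts[c])
    print(f"channel {c}: relative residual {rel:.3f}")
```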
arXiv Detail & Related papers (2021-10-18T22:50:07Z) - A Convergence Theory Towards Practical Over-parameterized Deep Neural
Networks [56.084798078072396]
We take a step towards closing the gap between theory and practice by significantly improving the known theoretical bounds on both the network width and the convergence time.
We show that convergence to a global minimum is guaranteed for networks whose width is quadratic in the sample size and linear in the depth, within training time logarithmic in both.
Our analysis and convergence bounds are derived via the construction of a surrogate network with fixed activation patterns that can be transformed at any time to an equivalent ReLU network of a reasonable size.
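The snippet below illustrates only the fixed-activation-pattern idea mentioned here (it is not the paper's surrogate construction): freezing the ReLU gates to the pattern induced at initialization yields a model that is linear in the weights and agrees with the true ReLU network wherever the pattern is unchanged.

```python
# Illustration only: a "fixed activation pattern" surrogate of a one-hidden-
# layer ReLU network. The gates are frozen to the pattern at initialization,
# making the surrogate linear in the weights.
import numpy as np

rng = np.random.default_rng(0)
d, m, n = 5, 64, 32                      # input dim, width, sample size
X = rng.normal(size=(n, d))
W0 = rng.normal(size=(m, d))             # weights at initialization
v = rng.choice([-1.0, 1.0], size=m)

mask = (X @ W0.T > 0).astype(float)      # frozen activation pattern per sample

def relu_net(W):                         # the actual ReLU network
    return np.maximum(X @ W.T, 0.0) @ v

def surrogate(W):                        # linear in W once the mask is fixed
    return (mask * (X @ W.T)) @ v

print(np.allclose(relu_net(W0), surrogate(W0)))   # True: they agree at init
W = W0 + 0.01 * rng.normal(size=W0.shape)         # small perturbation
print(np.abs(relu_net(W) - surrogate(W)).max())   # small discrepancy
```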
arXiv Detail & Related papers (2021-01-12T00:40:45Z) - Recursive Multi-model Complementary Deep Fusion for Robust Salient Object
Detection via Parallel Sub Networks [62.26677215668959]
Fully convolutional networks have shown outstanding performance in the salient object detection (SOD) field.
This paper proposes a "wider" network architecture that consists of parallel sub-networks with totally different architectures.
Experiments on several famous benchmarks clearly demonstrate the superior performance, good generalization, and powerful learning ability of the proposed wider framework.
arXiv Detail & Related papers (2020-08-07T10:39:11Z) - Volumetric Transformer Networks [88.85542905676712]
We introduce a learnable module, the volumetric transformer network (VTN)
VTN predicts channel-wise warping fields to reconfigure intermediate CNN features spatially and channel-wise.
Our experiments show that VTN consistently boosts the features' representation power and consequently the networks' accuracy on fine-grained image recognition and instance-level image retrieval.
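A rough sketch of the idea as summarized here, warping each channel of a feature map with its own predicted field via grid sampling; the actual VTN module that predicts the warping fields is more elaborate, and the layer below is a made-up stand-in.

```python
# Rough sketch of per-channel spatial warping (not the actual VTN module).
import torch
import torch.nn as nn
import torch.nn.functional as F


class ChannelwiseWarp(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Predict a 2-vector offset per channel and per spatial location.
        self.offset = nn.Conv2d(channels, 2 * channels, kernel_size=3, padding=1)

    def forward(self, x):                            # x: (N, C, H, W)
        n, c, h, w = x.shape
        offsets = 0.1 * torch.tanh(self.offset(x))   # small offsets in [-0.1, 0.1]
        offsets = offsets.view(n * c, 2, h, w).permute(0, 2, 3, 1)  # (N*C, H, W, 2)
        # Base sampling grid in normalized [-1, 1] coordinates.
        ys, xs = torch.meshgrid(
            torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij"
        )
        base = torch.stack((xs, ys), dim=-1).expand(n * c, h, w, 2)
        # Warp each channel independently with its own field.
        warped = F.grid_sample(
            x.reshape(n * c, 1, h, w), base + offsets, align_corners=True
        )
        return warped.reshape(n, c, h, w)


if __name__ == "__main__":
    feat = torch.randn(2, 16, 20, 20)
    print(ChannelwiseWarp(16)(feat).shape)   # torch.Size([2, 16, 20, 20])
```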
arXiv Detail & Related papers (2020-07-18T14:00:12Z) - Complex networks with tuneable dimensions as a universality playground [0.0]
We discuss the role of a fundamental network parameter for universality, the spectral dimension.
By explicit computation we prove that the spectral dimension for this model can be tuned continuously from $1$ to infinity.
We propose our model as a tool to probe universal behaviour on inhomogeneous structures and comment on the possibility that the universal behaviour of correlated models on such networks mimics the one of continuous field theories in fractional Euclidean dimensions.
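Not the paper's model, but a standard way to probe the spectral dimension numerically: the mean return probability of a random walk decays as $P(t) \sim t^{-d_s/2}$, so $d_s$ can be read off a log-log slope. The sketch below does this for a periodic 2D lattice, where the estimate should come out close to 2.

```python
# Estimate the spectral dimension d_s of a 2D periodic lattice from the
# random-walk return probability P(t) ~ t^(-d_s/2). The mean return
# probability equals trace(P^t)/n, i.e. the mean of the eigenvalues^t.
import numpy as np

L = 200                                   # 200 x 200 periodic square lattice
ks = 2 * np.pi * np.arange(L) / L
lam = (np.cos(ks)[:, None] + np.cos(ks)[None, :]) / 2.0   # RW eigenvalues
ts = np.arange(10, 201, 10)               # even times only (lattice is bipartite)
ret = np.array([(lam ** t).mean() for t in ts])           # mean return probability
slope = np.polyfit(np.log(ts), np.log(ret), 1)[0]
print("estimated spectral dimension:", -2 * slope)        # expect close to 2
```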
arXiv Detail & Related papers (2020-06-18T10:56:41Z) - Quasi-Equivalence of Width and Depth of Neural Networks [10.365556153676538]
We investigate whether the design of artificial neural networks should have a directional preference.
Inspired by the De Morgan law, we establish a quasi-equivalence between the width and depth of ReLU networks.
Based on our findings, a deep network has a wide equivalent, subject to an arbitrarily small error.
arXiv Detail & Related papers (2020-02-06T21:17:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.