Modelling Long Range Dependencies in $N$D: From Task-Specific to a
General Purpose CNN
- URL: http://arxiv.org/abs/2301.10540v2
- Date: Sun, 16 Apr 2023 08:55:36 GMT
- Title: Modelling Long Range Dependencies in $N$D: From Task-Specific to a
General Purpose CNN
- Authors: David M. Knigge, David W. Romero, Albert Gu, Efstratios Gavves, Erik
J. Bekkers, Jakub M. Tomczak, Mark Hoogendoorn, Jan-Jakob Sonke
- Abstract summary: We present the Continuous Convolutional Neural Network (CCNN), a single CNN able to process data of arbitrary resolution, dimensionality and length without any structural changes.
Its key components are its continuous convolutional kernels, which model long-range dependencies at every layer.
Our CCNN matches and often outperforms the current state-of-the-art across all tasks considered.
- Score: 47.205463459723056
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Performant Convolutional Neural Network (CNN) architectures must be tailored
to specific tasks in order to consider the length, resolution, and
dimensionality of the input data. In this work, we tackle the need for
problem-specific CNN architectures. We present the Continuous Convolutional
Neural Network (CCNN): a single CNN able to process data of arbitrary
resolution, dimensionality and length without any structural changes. Its key
components are its continuous convolutional kernels, which model long-range
dependencies at every layer and thus remove the need for the task-dependent
downsampling and depths of current CNN architectures. We showcase the
generality of our method by using the same architecture for tasks on sequential
($1{\rm D}$), visual ($2{\rm D}$) and point-cloud ($3{\rm D}$) data. Our CCNN
matches and often outperforms the current state-of-the-art across all tasks
considered.
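To make the mechanism concrete, here is a minimal sketch (not the authors' released code; module and parameter names are illustrative) of a continuous convolutional kernel: a small MLP maps relative positions to kernel values, so the same module can emit a kernel of any length and, by changing the coordinate dimension, of any dimensionality.

```python
# Minimal sketch (not the authors' code): a continuous convolutional kernel
# parameterized by a small MLP over relative positions. Because the kernel is
# a function of coordinates, the same module can produce a 1D, 2D, or 3D
# kernel of any size, which is the property the abstract describes.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ContinuousKernelConv1d(nn.Module):
    """Illustrative 1D case: the kernel spans the full sequence length."""

    def __init__(self, in_channels: int, out_channels: int, hidden: int = 32):
        super().__init__()
        self.in_channels = in_channels
        self.out_channels = out_channels
        # MLP mapping a scalar relative position to all kernel entries.
        self.kernel_net = nn.Sequential(
            nn.Linear(1, hidden), nn.GELU(),
            nn.Linear(hidden, out_channels * in_channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_channels, length)
        length = x.shape[-1]
        # Sample the continuous kernel at `length` relative positions in
        # [-1, 1], so the receptive field covers the whole input at this layer.
        positions = torch.linspace(-1.0, 1.0, length, device=x.device).unsqueeze(-1)
        kernel = self.kernel_net(positions)              # (length, out * in)
        kernel = kernel.view(length, self.out_channels, self.in_channels)
        kernel = kernel.permute(1, 2, 0)                 # (out, in, length)
        # Roughly "same" convolution with a kernel as long as the input.
        return F.conv1d(x, kernel, padding=length // 2)[..., :length]
```

For 2D or 3D data, the only change in this sketch would be feeding 2- or 3-dimensional coordinates to the kernel MLP and using `F.conv2d` or `F.conv3d`, which is what makes a single architecture applicable across dimensionalities.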
Related papers
- Model Parallel Training and Transfer Learning for Convolutional Neural Networks by Domain Decomposition [0.0]
Deep convolutional neural networks (CNNs) have been shown to be very successful in a wide range of image processing applications.
Due to the increasing number of model parameters and the increasing availability of large amounts of training data, parallelization strategies for efficiently training complex CNNs are necessary.
arXiv Detail & Related papers (2024-08-26T17:35:01Z) - Transferability of Convolutional Neural Networks in Stationary Learning
Tasks [96.00428692404354]
We introduce a novel framework for efficient training of convolutional neural networks (CNNs) for large-scale spatial problems.
We show that a CNN trained on small windows of such signals achieves nearly the same performance on much larger windows without retraining.
Our results show that the CNN is able to tackle problems with many hundreds of agents after being trained with fewer than ten.
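This transferability rests on the fact that a fully convolutional network has no layers tied to a fixed input size; the toy sketch below (not the paper's framework, and with illustrative shapes) shows weights applied unchanged to a much larger window.

```python
# Illustration only (not the paper's framework): a fully convolutional network
# has no size-dependent layers, so weights fit on small windows can be applied
# to much larger windows without retraining.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, kernel_size=1),  # per-location prediction
)

small = torch.randn(1, 1, 32, 32)    # training-sized window
large = torch.randn(1, 1, 256, 256)  # much larger window at test time
print(model(small).shape)  # torch.Size([1, 1, 32, 32])
print(model(large).shape)  # torch.Size([1, 1, 256, 256])
```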
arXiv Detail & Related papers (2023-07-21T13:51:45Z) - A Domain Decomposition-Based CNN-DNN Architecture for Model Parallel Training Applied to Image Recognition Problems [0.0]
A novel CNN-DNN architecture is proposed that naturally supports a model parallel training strategy.
The proposed approach can significantly accelerate the required training time compared to the global model.
Results show that the proposed approach can also help to improve the accuracy of the underlying classification problem.
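As a rough illustration of the domain-decomposition idea (sizes and layer choices are assumptions, not the paper's architecture): the image is split into subdomains, each handled by its own local CNN whose training can be parallelized across devices, and a dense head combines the local features into the final classification.

```python
# Hedged sketch of a CNN-DNN decomposition (details differ from the paper):
# each subdomain of the image gets its own small CNN, and a dense network
# combines the per-subdomain feature vectors.
import torch
import torch.nn as nn

def make_local_cnn() -> nn.Module:
    return nn.Sequential(
        nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),  # one 8-dim vector per subdomain
    )

n_subdomains, n_classes = 4, 10
local_cnns = nn.ModuleList([make_local_cnn() for _ in range(n_subdomains)])
head = nn.Sequential(nn.Linear(8 * n_subdomains, 64), nn.ReLU(),
                     nn.Linear(64, n_classes))

image = torch.randn(2, 3, 32, 32)                        # batch of 32x32 images
patches = [image[:, :, :16, :16], image[:, :, :16, 16:],  # 2x2 decomposition
           image[:, :, 16:, :16], image[:, :, 16:, 16:]]
features = torch.cat([cnn(p) for cnn, p in zip(local_cnns, patches)], dim=1)
logits = head(features)                                   # shape (2, 10)
```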
arXiv Detail & Related papers (2023-02-13T18:06:59Z) - What Can Be Learnt With Wide Convolutional Neural Networks? [69.55323565255631]
We study infinitely-wide deep CNNs in the kernel regime.
We prove that deep CNNs adapt to the spatial scale of the target function.
We conclude by computing the generalisation error of a deep CNN trained on the output of another deep CNN.
arXiv Detail & Related papers (2022-08-01T17:19:32Z) - Towards a General Purpose CNN for Long Range Dependencies in
$\mathrm{N}$D [49.57261544331683]
We propose a single CNN architecture equipped with continuous convolutional kernels for tasks on arbitrary resolution, dimensionality and length without structural changes.
We show the generality of our approach by applying the same CCNN to a wide set of tasks on sequential ($1\mathrm{D}$) and visual data ($2\mathrm{D}$).
Our CCNN performs competitively and often outperforms the current state-of-the-art across all tasks considered.
arXiv Detail & Related papers (2022-06-07T15:48:02Z) - The Mind's Eye: Visualizing Class-Agnostic Features of CNNs [92.39082696657874]
We propose an approach to visually interpret CNN features given a set of images by creating corresponding images that depict the most informative features of a specific layer.
Our method uses a dual-objective activation and distance loss, without requiring a generator network or modifications to the original model.
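A generic, generator-free visualization loop in this spirit might look like the sketch below; the paper's exact activation and distance objectives are not reproduced, and the model, reference set, and loss weighting are illustrative assumptions.

```python
# Generic sketch of generator-free feature visualization (not the paper's
# exact losses): optimize the pixels of an image directly so that (a) a chosen
# layer's activations are pushed up and (b) the image stays close, in
# activation space, to a reference set of images.
import torch
import torch.nn as nn

cnn = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())  # stand-in model
for p in cnn.parameters():
    p.requires_grad_(False)  # only the image is optimized

reference = torch.randn(8, 3, 64, 64)        # hypothetical reference images
with torch.no_grad():
    ref_feats = cnn(reference).mean(dim=0)   # target statistics of the layer

img = torch.zeros(1, 3, 64, 64, requires_grad=True)
opt = torch.optim.Adam([img], lr=0.05)
for _ in range(200):
    feats = cnn(img)
    activation_loss = -feats.mean()                                # maximize activations
    distance_loss = (feats.squeeze(0) - ref_feats).pow(2).mean()   # stay near references
    loss = activation_loss + 0.1 * distance_loss
    opt.zero_grad(); loss.backward(); opt.step()
```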
arXiv Detail & Related papers (2021-01-29T07:46:39Z) - Cyclic orthogonal convolutions for long-range integration of features [3.309593266039024]
We propose a novel architecture that allows flexible information flow between features $z$ and locations $(x,y)$ across the entire image.
This architecture uses a cycle of three convolutions, not only in $(x,y)$ coordinates, but also in $(x,z)$ and $(y,z)$ coordinates.
Our model obtains competitive results at image classification on CIFAR-10 and ImageNet datasets, when compared to CNNs of similar size.
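A rough sketch of such a cycle (layer sizes illustrative, not the paper's implementation): three 2D convolutions applied to the $(x,y)$, $(x,z)$ and $(y,z)$ planes of the activation tensor, obtained by permuting which axis plays the role of channels.

```python
# Rough sketch: a cycle of three 2D convolutions over the (x,y), (x,z) and
# (y,z) planes, where z denotes the feature axis.
import torch
import torch.nn as nn

class CyclicOrthogonalBlock(nn.Module):
    def __init__(self, c: int, h: int, w: int):
        super().__init__()
        self.conv_xy = nn.Conv2d(c, c, 3, padding=1)  # channels = z, plane = (x, y)
        self.conv_xz = nn.Conv2d(h, h, 3, padding=1)  # channels = y, plane = (x, z)
        self.conv_yz = nn.Conv2d(w, w, 3, padding=1)  # channels = x, plane = (y, z)

    def forward(self, t: torch.Tensor) -> torch.Tensor:
        # t: (batch, z, y, x)
        t = torch.relu(self.conv_xy(t))
        t = t.permute(0, 2, 1, 3)                     # (batch, y, z, x)
        t = torch.relu(self.conv_xz(t))
        t = t.permute(0, 3, 2, 1)                     # (batch, x, z, y)
        t = torch.relu(self.conv_yz(t))
        return t.permute(0, 2, 3, 1)                  # back to (batch, z, y, x)

block = CyclicOrthogonalBlock(16, 32, 32)
out = block(torch.randn(2, 16, 32, 32))               # shape (2, 16, 32, 32)
```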
arXiv Detail & Related papers (2020-12-11T16:33:48Z) - Approximation and Non-parametric Estimation of ResNet-type Convolutional
Neural Networks [52.972605601174955]
We show a ResNet-type CNN can attain the minimax optimal error rates in important function classes.
We derive approximation and estimation error rates of the aforementioned type of CNNs for the Barron and Hölder classes.
arXiv Detail & Related papers (2019-03-24T19:42:39Z)