Cyclic orthogonal convolutions for long-range integration of features
- URL: http://arxiv.org/abs/2012.06462v1
- Date: Fri, 11 Dec 2020 16:33:48 GMT
- Title: Cyclic orthogonal convolutions for long-range integration of features
- Authors: Federica Freddi, Jezabel R Garcia, Michael Bromberg, Sepehr Jalali,
Da-Shan Shiu, Alvin Chua, Alberto Bernacchia
- Abstract summary: We propose a novel architecture that allows flexible information flow between features $z$ and locations $(x,y)$ across the entire image.
This architecture uses a cycle of three convolutions, not only in $(x,y)$ coordinates, but also in $(x,z)$ and $(y,z)$ coordinates.
Our model obtains competitive results at image classification on CIFAR-10 and ImageNet datasets, when compared to CNNs of similar size.
- Score: 3.309593266039024
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In Convolutional Neural Networks (CNNs) information flows across a small
neighbourhood of each pixel of an image, preventing long-range integration of
features before reaching deep layers in the network. We propose a novel
architecture that allows flexible information flow between features $z$ and
locations $(x,y)$ across the entire image with a small number of layers. This
architecture uses a cycle of three orthogonal convolutions, not only in $(x,y)$
coordinates, but also in $(x,z)$ and $(y,z)$ coordinates. We stack a sequence
of such cycles to obtain our deep network, named CycleNet. As this only
requires a permutation of the axes of a standard convolution, its performance
can be directly compared to a CNN. Our model obtains competitive results at
image classification on CIFAR-10 and ImageNet datasets, when compared to CNNs
of similar size. We hypothesise that long-range integration favours recognition
of objects by shape rather than texture, and we show that CycleNet transfers
better than CNNs to stylised images. On the Pathfinder challenge, where
integration of distant features is crucial, CycleNet outperforms CNNs by a
large margin. We also show that even when employing a small convolutional
kernel, the size of receptive fields of CycleNet reaches its maximum after one
cycle, while conventional CNNs require a large number of layers.
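To make the cycle concrete, here is a minimal PyTorch sketch assembled from this abstract alone; the module name `Cycle`, the ReLU placement, and the kernel size are assumptions, not the authors' reference implementation. The key point is that a convolution is dense along its channel axis, so each of the three steps fully mixes one coordinate ($z$, then $y$, then $x$), which is why a single cycle already yields a maximal receptive field.

```python
# A minimal sketch of one CycleNet-style cycle (assumed reading of the
# abstract, not the official code). The feature tensor has shape
# (batch, z, x, y); a standard 2-D convolution is applied three times,
# permuting the axes between steps so the kernel slides over the
# (x, y), (x, z) and (y, z) planes in turn.
import torch
import torch.nn as nn

class Cycle(nn.Module):
    def __init__(self, z, x, y, k=3):
        super().__init__()
        # The "channel" axis of each conv is the coordinate NOT convolved
        # over; channel mixing is dense, so each step integrates one
        # coordinate globally even with a small kernel k.
        self.conv_xy = nn.Conv2d(z, z, k, padding=k // 2)  # mixes all z
        self.conv_xz = nn.Conv2d(y, y, k, padding=k // 2)  # mixes all y
        self.conv_yz = nn.Conv2d(x, x, k, padding=k // 2)  # mixes all x

    def forward(self, t):                  # t: (n, z, x, y)
        t = torch.relu(self.conv_xy(t))    # convolve over (x, y)
        t = t.permute(0, 3, 2, 1)          # -> (n, y, x, z)
        t = torch.relu(self.conv_xz(t))    # convolve over (x, z)
        t = t.permute(0, 2, 1, 3)          # -> (n, x, y, z)
        t = torch.relu(self.conv_yz(t))    # convolve over (y, z)
        return t.permute(0, 3, 1, 2)       # back to (n, z, x, y)

cycle = Cycle(z=64, x=32, y=32)
out = cycle(torch.randn(2, 64, 32, 32))   # -> torch.Size([2, 64, 32, 32])
```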
Related papers
- Modelling Long Range Dependencies in $N$D: From Task-Specific to a General Purpose CNN [47.205463459723056]
We present the Continuous Convolutional Neural Network (CCNN), a single CNN able to process data of arbitrary resolution, dimensionality and length without any structural changes.
Its key components are its continuous convolutional kernels, which model long-range dependencies at every layer.
Our CCNN matches and often outperforms the current state-of-the-art across all tasks considered.
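The phrase "continuous convolutional kernels" suggests a kernel parameterised as a function of position rather than a fixed weight grid. Below is a hedged 1-D sketch of that idea (illustrative names and sizes, not the CCNN reference code): a small MLP maps relative positions in $[-1, 1]$ to kernel values, so the same parameters define a kernel at any input length.

```python
# Hedged sketch of a continuous convolutional kernel (illustrative names and
# sizes, not the CCNN reference code): a small MLP maps relative positions
# in [-1, 1] to kernel values, so the same parameters define a kernel at
# any resolution or input length.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContinuousKernelConv1d(nn.Module):
    def __init__(self, in_ch, out_ch, hidden=32):
        super().__init__()
        self.in_ch, self.out_ch = in_ch, out_ch
        self.kernel_net = nn.Sequential(      # position -> kernel weights
            nn.Linear(1, hidden), nn.GELU(),
            nn.Linear(hidden, out_ch * in_ch),
        )

    def forward(self, x):                     # x: (n, in_ch, length)
        length = x.shape[-1]
        pos = torch.linspace(-1.0, 1.0, length, device=x.device).unsqueeze(-1)
        w = self.kernel_net(pos)              # (length, out_ch * in_ch)
        w = w.permute(1, 0).reshape(self.out_ch, self.in_ch, length)
        # The kernel spans the whole input: long-range dependencies in one layer.
        return F.conv1d(x, w, padding="same")

conv = ContinuousKernelConv1d(8, 16)
y = conv(torch.randn(2, 8, 100))              # any length works, no retraining
```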
arXiv Detail & Related papers (2023-01-25T12:12:47Z)
- Towards a General Purpose CNN for Long Range Dependencies in $\mathrm{N}$D [49.57261544331683]
We propose a single CNN architecture equipped with continuous convolutional kernels for tasks on arbitrary resolution, dimensionality and length without structural changes.
We show the generality of our approach by applying the same CCNN to a wide set of tasks on sequential ($1\mathrm{D}$) and visual ($2\mathrm{D}$) data.
Our CCNN performs competitively and often outperforms the current state-of-the-art across all tasks considered.
arXiv Detail & Related papers (2022-06-07T15:48:02Z)
- Deep ensembles in bioimage segmentation [74.01883650587321]
In this work, we propose an ensemble of convolutional neural networks (CNNs).
In ensemble methods, many different models are trained and then used for classification; the ensemble aggregates the outputs of the individual classifiers.
The proposed ensemble is implemented by combining different backbone networks within the DeepLabV3+ and HarDNet architectures.
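The aggregation step described here is generic; a minimal sketch follows (plain probability averaging, not the paper's exact DeepLabV3+/HarDNet setup):

```python
# Minimal sketch of the aggregation step (plain probability averaging, not
# the paper's exact DeepLabV3+/HarDNet setup): each trained model predicts
# independently and the ensemble averages their per-class probabilities.
import torch

@torch.no_grad()
def ensemble_predict(models, x):
    # x: (n, c, h, w); each model returns per-pixel class logits.
    probs = [m(x).softmax(dim=1) for m in models]  # one prediction per model
    avg = torch.stack(probs).mean(dim=0)           # aggregate the outputs
    return avg.argmax(dim=1)                       # final per-pixel labels
```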
arXiv Detail & Related papers (2021-12-24T05:54:21Z)
- CoTr: Efficiently Bridging CNN and Transformer for 3D Medical Image Segmentation [95.51455777713092]
Convolutional neural networks (CNNs) have been the de facto standard for 3D medical image segmentation.
We propose a novel framework that efficiently bridges a Convolutional neural network and a Transformer (CoTr) for accurate 3D medical image segmentation.
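A hedged sketch of the bridging pattern, in 2-D for brevity (CoTr itself is 3-D and uses a deformable attention variant; names and sizes here are illustrative): a small CNN extracts local features, which are flattened into tokens and refined by a Transformer encoder for long-range context.

```python
# Hedged sketch of a CNN-to-Transformer bridge, in 2-D for brevity (CoTr
# itself is 3-D and uses a deformable attention variant; names and sizes
# here are illustrative): a small CNN extracts local features, which are
# flattened into tokens and refined by a Transformer encoder.
import torch
import torch.nn as nn

class ConvTransformerBridge(nn.Module):
    def __init__(self, dim=64, heads=4, layers=2):
        super().__init__()
        self.cnn = nn.Sequential(              # local feature extractor
            nn.Conv2d(3, dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        enc = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.transformer = nn.TransformerEncoder(enc, num_layers=layers)

    def forward(self, img):                    # img: (n, 3, h, w)
        f = self.cnn(img)                      # (n, dim, h/4, w/4)
        n, d, h, w = f.shape
        tokens = f.flatten(2).transpose(1, 2)  # (n, h*w/16, dim)
        tokens = self.transformer(tokens)      # global token mixing
        return tokens.transpose(1, 2).reshape(n, d, h, w)
```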
arXiv Detail & Related papers (2021-03-04T13:34:22Z)
- Convolution-Free Medical Image Segmentation using Transformers [8.130670465411239]
We show that a different method, based entirely on self-attention between neighboring image patches, can achieve competitive or better results.
On three datasets, the proposed model reaches segmentation accuracies better than state-of-the-art CNNs.
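A hedged sketch of the convolution-free idea (illustrative patch size and width, not the paper's exact model): the image is cut into non-overlapping patches, each patch is linearly embedded, and self-attention mixes information between all patches with no convolution anywhere.

```python
# Hedged sketch of convolution-free patch attention (illustrative patch
# size and width, not the paper's exact model): patches are linearly
# embedded and self-attention mixes information between all of them.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PatchAttention(nn.Module):
    def __init__(self, patch=16, dim=256, heads=8):
        super().__init__()
        self.patch = patch
        self.embed = nn.Linear(3 * patch * patch, dim)  # raw pixels -> token
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, img):                         # img: (n, 3, h, w)
        p = self.patch
        x = F.unfold(img, kernel_size=p, stride=p)  # (n, 3*p*p, n_patches)
        x = self.embed(x.transpose(1, 2))           # (n, n_patches, dim)
        out, _ = self.attn(x, x, x)                 # all-pairs patch mixing
        return out
```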
arXiv Detail & Related papers (2021-02-26T18:49:13Z)
- MGIC: Multigrid-in-Channels Neural Network Architectures [8.459177309094688]
We present a multigrid-in-channels approach that tackles the quadratic growth of the number of parameters with respect to the number of channels in standard convolutional neural networks (CNNs).
Our approach addresses the redundancy in CNNs that is also exposed by the recent success of lightweight CNNs.
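Back-of-the-envelope arithmetic for the quadratic growth being tackled (generic grouped-convolution numbers, not the MGIC method itself): a $k \times k$ convolution with $c$ input and $c$ output channels costs $k^2 c^2$ weights, so doubling the channels quadruples the parameters; splitting the channels into $g$ groups divides that cost by $g$.

```python
# Back-of-the-envelope check of the quadratic growth (generic grouped-conv
# arithmetic, not the MGIC method itself): a k x k conv with c input and c
# output channels costs k*k*c*c weights, so doubling c quadruples the
# parameters; splitting the channels into g groups divides the cost by g.
import torch.nn as nn

def n_params(m):
    return sum(p.numel() for p in m.parameters() if p.requires_grad)

dense = nn.Conv2d(256, 256, kernel_size=3, bias=False)
grouped = nn.Conv2d(256, 256, kernel_size=3, groups=8, bias=False)
print(n_params(dense))    # 3*3*256*256     = 589824
print(n_params(grouped))  # 3*3*256*256 / 8 =  73728
```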
arXiv Detail & Related papers (2020-11-17T11:29:10Z)
- Dynamic Graph: Learning Instance-aware Connectivity for Neural Networks [78.65792427542672]
Dynamic Graph Network (DG-Net) is a complete directed acyclic graph, where the nodes represent convolutional blocks and the edges represent connection paths.
Instead of using a fixed network path, DG-Net aggregates features dynamically at each node, giving the network greater representational capacity.
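A hedged sketch of instance-aware aggregation (a generic reading of this summary, not the DG-Net code): a gating network looks at the input and produces per-edge weights, so each sample combines the candidate paths with its own mixture.

```python
# Hedged sketch of instance-aware aggregation (a generic reading of the
# summary, not the DG-Net code): a gating network looks at the input and
# produces per-edge weights, so each sample mixes its incoming paths
# differently.
import torch
import torch.nn as nn

class DynamicNode(nn.Module):
    def __init__(self, ch, n_edges):
        super().__init__()
        self.edges = nn.ModuleList(
            [nn.Conv2d(ch, ch, 3, padding=1) for _ in range(n_edges)]
        )
        self.gate = nn.Sequential(             # input-dependent edge weights
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(ch, n_edges), nn.Softmax(dim=1),
        )

    def forward(self, x):                      # x: (n, ch, h, w)
        w = self.gate(x)                       # (n, n_edges), per instance
        outs = torch.stack([e(x) for e in self.edges], dim=1)
        return (w[:, :, None, None, None] * outs).sum(dim=1)
```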
arXiv Detail & Related papers (2020-10-02T16:50:26Z)
- Improved Residual Networks for Image and Video Recognition [98.10703825716142]
Residual networks (ResNets) represent a powerful type of convolutional neural network (CNN) architecture.
We show consistent improvements in accuracy and learning convergence over the baseline.
Our proposed approach allows us to train extremely deep networks, while the baseline shows severe optimization issues.
arXiv Detail & Related papers (2020-04-10T11:09:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.