Depthwise-STFT based separable Convolutional Neural Networks
- URL: http://arxiv.org/abs/2001.09912v1
- Date: Mon, 27 Jan 2020 17:07:08 GMT
- Title: Depthwise-STFT based separable Convolutional Neural Networks
- Authors: Sudhakar Kumawat, Shanmuganathan Raman
- Abstract summary: We propose a new convolutional layer called the Depthwise-STFT Separable layer.
It can serve as an alternative to the standard depthwise separable convolutional layer.
We show that the proposed layer outperforms the standard depthwise separable layer-based models on the CIFAR-10 and CIFAR-100 image classification datasets.
- Score: 35.636461829966095
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a new convolutional layer called Depthwise-STFT
Separable layer that can serve as an alternative to the standard depthwise
separable convolutional layer. The construction of the proposed layer is
inspired by the fact that the Fourier coefficients can accurately represent
important features such as edges in an image. It utilizes the Fourier
coefficients computed (channelwise) in the 2D local neighborhood (e.g., 3x3) of
each position of the input map to obtain the feature maps. The Fourier
coefficients are computed using 2D Short Term Fourier Transform (STFT) at
multiple fixed low frequency points in the 2D local neighborhood at each
position. These feature maps at different frequency points are then linearly
combined using trainable pointwise (1x1) convolutions. We show that the
proposed layer outperforms the standard depthwise separable layer-based models
on the CIFAR-10 and CIFAR-100 image classification datasets with reduced
space-time complexity.
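The two stages described in the abstract (fixed channelwise STFT filtering in a 3x3 neighborhood, then a trainable 1x1 combination) can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation. The function names, the four frequency points, the zero padding, and the sign convention of the complex exponential are assumptions made here; the abstract only says that multiple fixed low-frequency points are used.

```python
import numpy as np


def stft_kernels(freqs, size=3):
    """Build fixed (non-trainable) real and imaginary STFT basis kernels.

    freqs: list of (vy, vx) frequency points (hypothetical choice, see below).
    Returns an array of shape (2 * len(freqs), size, size).
    """
    r = size // 2
    ys, xs = np.meshgrid(np.arange(-r, r + 1), np.arange(-r, r + 1), indexing="ij")
    kernels = []
    for vy, vx in freqs:
        phase = 2.0 * np.pi * (vy * ys + vx * xs)
        kernels.append(np.cos(phase))    # real part of exp(-j * phase)
        kernels.append(-np.sin(phase))   # imaginary part of exp(-j * phase)
    return np.stack(kernels)


def depthwise_stft(x, kernels):
    """Apply every fixed kernel to every channel independently (depthwise).

    x: (C, H, W) input map. Returns (C * K, H, W), using zero padding so the
    spatial size is preserved (the padding mode is an assumption).
    """
    C, H, W = x.shape
    K, s, _ = kernels.shape
    r = s // 2
    xp = np.pad(x, ((0, 0), (r, r), (r, r)))
    out = np.zeros((C, K, H, W))
    for dy in range(s):
        for dx in range(s):
            patch = xp[:, dy:dy + H, dx:dx + W]            # (C, H, W)
            weight = kernels[None, :, dy, dx, None, None]  # (1, K, 1, 1)
            out += patch[:, None] * weight
    return out.reshape(C * K, H, W)


def pointwise(feats, weights):
    """Trainable 1x1 convolution: a linear mix across the channel axis.

    feats: (C_in, H, W), weights: (C_out, C_in). Returns (C_out, H, W).
    """
    return np.einsum("oc,chw->ohw", weights, feats)


# Assumed low-frequency points for a 3x3 window (illustrative only).
freqs = [(1 / 3, 0.0), (0.0, 1 / 3), (1 / 3, 1 / 3), (1 / 3, -1 / 3)]
kernels = stft_kernels(freqs)                        # 8 fixed kernels

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8, 8))                   # 4 input channels
feats = depthwise_stft(x, kernels)                   # 4 * 8 = 32 feature maps
w = 0.1 * rng.standard_normal((16, feats.shape[0]))  # stand-in for learned weights
y = pointwise(feats, w)                              # 16 output channels
```

In this sketch only `w` would be learned; the spatial kernels stay fixed, which is the point of contrast with a standard depthwise separable layer, where the 3x3 depthwise kernels are also trained.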
Related papers
- MDNF: Multi-Diffusion-Nets for Neural Fields on Meshes [5.284425534494986]
We propose a novel framework for representing neural fields on triangle meshes that is multi-resolution across both spatial and frequency domains.
Inspired by the Neural Fourier Filter Bank (NFFB), our architecture decomposes the frequencies and frequency domains by associating finer resolution levels with higher frequency bands.
We demonstrate the effectiveness of our approach through its application to diverse neural fields, such as synthetic RGB functions, UV texture coordinates, and normals.
arXiv Detail & Related papers (2024-09-04T19:08:13Z)
- Neural Fourier Filter Bank [18.52741992605852]
We present a novel method to provide efficient and highly detailed reconstructions.
Inspired by wavelets, we learn a neural field that decomposes the signal both spatially and frequency-wise.
arXiv Detail & Related papers (2022-12-04T03:45:08Z)
- Transform Once: Efficient Operator Learning in Frequency Domain [69.74509540521397]
We study deep neural networks designed to harness the structure in frequency domain for efficient learning of long-range correlations in space or time.
This work introduces a blueprint for frequency domain learning through a single transform: transform once (T1).
arXiv Detail & Related papers (2022-11-26T01:56:05Z)
- Deep Fourier Up-Sampling [100.59885545206744]
Up-sampling in the Fourier domain is more challenging as it does not follow such a local property.
We propose a theoretically sound Deep Fourier Up-Sampling (FourierUp) to solve these issues.
arXiv Detail & Related papers (2022-10-11T06:17:31Z)
- Block Walsh-Hadamard Transform Based Binary Layers in Deep Neural Networks [7.906608953906891]
Convolution has been the core operation of modern deep neural networks.
We propose to use binary block Walsh-Hadamard transform (WHT) instead of the Fourier transform.
We use WHT-based binary layers to replace some of the regular convolution layers in deep neural networks.
arXiv Detail & Related papers (2022-01-07T23:52:41Z)
- Learnable Fourier Features for Multi-Dimensional Spatial Positional Encoding [96.9752763607738]
We propose a novel positional encoding method based on learnable Fourier features.
Our experiments show that our learnable feature representation for multi-dimensional positional encoding outperforms existing methods.
arXiv Detail & Related papers (2021-06-05T04:40:18Z)
- Depthwise Spatio-Temporal STFT Convolutional Neural Networks for Human Action Recognition [42.400429835080416]
Conventional 3D convolutional neural networks (CNNs) are computationally expensive, memory intensive, prone to overfitting and most importantly, there is a need to improve their feature learning capabilities.
We propose a new class of convolutional blocks that can serve as an alternative to the 3D convolutional layer and its variants in 3D CNNs.
Our evaluation on seven action recognition datasets, including Something-something v1 and v2, Jester, Diving-48, Kinetics-400, UCF 101, and HMDB 51, demonstrates that STFT block-based 3D CNNs achieve on par or even better performance compared to the state-of
arXiv Detail & Related papers (2020-07-22T12:26:04Z)
- DO-Conv: Depthwise Over-parameterized Convolutional Layer [66.46704754669169]
We propose to augment a convolutional layer with an additional depthwise convolution, where each input channel is convolved with a different 2D kernel.
We show with extensive experiments that the mere replacement of conventional convolutional layers with DO-Conv layers boosts the performance of CNNs.
arXiv Detail & Related papers (2020-06-22T06:57:10Z)
- Region adaptive graph fourier transform for 3d point clouds [51.193111325231165]
We introduce the Region Adaptive Graph Fourier Transform (RA-GFT) for compression of 3D point cloud attributes.
The RA-GFT achieves better complexity-performance trade-offs than previous approaches.
arXiv Detail & Related papers (2020-03-04T02:47:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.