Efficient Higher-order Convolution for Small Kernels in Deep Learning
- URL: http://arxiv.org/abs/2404.16380v1
- Date: Thu, 25 Apr 2024 07:42:48 GMT
- Title: Efficient Higher-order Convolution for Small Kernels in Deep Learning
- Authors: Zuocheng Wen, Lingzhong Guo
- Abstract summary: We propose a novel method to perform higher-order Volterra filtering with lower memory and computational costs.
Based on the proposed method, a new attention module called Higher-order Local Attention Block (HLA) is proposed and tested.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep convolutional neural networks (DCNNs) are a class of artificial neural networks used primarily for computer vision tasks such as segmentation and classification. Many nonlinear operations, such as activation functions and pooling strategies, are used in DCNNs to enhance their ability to process different signals for different tasks. Conventional convolution, a linear filter, is the essential component of DCNNs, while nonlinear convolution is generally implemented as a higher-order Volterra filter. However, the significant memory and computational costs of Volterra filtering are the primary limitation to its widespread use in DCNNs. In this study, we propose a novel method to perform higher-order Volterra filtering with lower memory and computational costs in both the forward and backward passes of DCNN training. The proposed method demonstrates computational advantages over conventional Volterra filter implementations. Furthermore, based on the proposed method, a new attention module called the Higher-order Local Attention Block (HLA) is proposed and tested on the CIFAR-100 dataset, where it shows a competitive improvement on the classification task. Source code is available at: https://github.com/WinterWen666/Efficient-High-Order-Volterra-Convolution.git
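For reference, the following is a minimal PyTorch sketch of the conventional, memory-heavy second-order Volterra convolution that the abstract contrasts against; the layer name, initialization, and unfold-based patch handling are illustrative assumptions, and the paper's optimized method (which avoids materializing the full second-order kernel) is in the linked repository.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NaiveVolterraConv2d(nn.Module):
    """Naive second-order Volterra convolution: y = w1 . p + p^T W2 p for
    every k x k patch p. Storing and contracting the full W2 is what makes
    the conventional implementation costly."""
    def __init__(self, in_ch, out_ch, k):
        super().__init__()
        d = in_ch * k * k                                         # flattened patch length
        self.k = k
        self.w1 = nn.Parameter(torch.randn(out_ch, d) * 0.01)     # first-order kernel
        self.w2 = nn.Parameter(torch.randn(out_ch, d, d) * 0.01)  # second-order kernel, O(d^2) per output channel

    def forward(self, x):
        b, _, h, w = x.shape
        p = F.unfold(x, self.k, padding=self.k // 2)              # (b, d, L), L = h*w for odd k
        lin = torch.einsum('od,bdl->bol', self.w1, p)             # linear (ordinary convolution) term
        quad = torch.einsum('ode,bdl,bel->bol', self.w2, p, p)    # quadratic Volterra term
        return (lin + quad).view(b, -1, h, w)
```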
Related papers
- As large as it gets: Learning infinitely large Filters via Neural Implicit Functions in the Fourier Domain [22.512062422338914]
Recent work in neural networks for image classification has seen a strong tendency towards increasing the spatial context.
We propose a module for studying the effective filter size of convolutional neural networks.
Our analysis shows that, although the proposed networks could learn very large convolution kernels, the learned filters are well localized and relatively small in practice.
arXiv Detail & Related papers (2023-07-19T14:21:11Z)
- Rank-R FNN: A Tensor-Based Learning Model for High-Order Data Classification [69.26747803963907]
Rank-R Feedforward Neural Network (FNN) is a tensor-based nonlinear learning model that imposes Canonical/Polyadic decomposition on its parameters.
It handles inputs as multilinear arrays, bypassing the need for vectorization, and can thus fully exploit the structural information along every data dimension.
We establish the universal approximation and learnability properties of Rank-R FNN, and we validate its performance on real-world hyperspectral datasets.
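A minimal sketch of what a rank-R, CP-constrained hidden layer can look like for a three-mode input; the layer name, initialization, and use of successive contractions are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class RankRHiddenLayer(nn.Module):
    """Hidden layer whose per-neuron weight tensor is constrained to a rank-R
    CP (Canonical/Polyadic) form over a 3-mode input."""
    def __init__(self, dims, hidden, R):
        super().__init__()
        i1, i2, i3 = dims
        self.a = nn.Parameter(torch.randn(hidden, R, i1) * 0.1)
        self.b = nn.Parameter(torch.randn(hidden, R, i2) * 0.1)
        self.c = nn.Parameter(torch.randn(hidden, R, i3) * 0.1)

    def forward(self, x):                                  # x: (batch, i1, i2, i3), no vectorization
        t = torch.einsum('hri,bijk->bhrjk', self.a, x)     # contract mode 1
        t = torch.einsum('hrj,bhrjk->bhrk', self.b, t)     # contract mode 2
        t = torch.einsum('hrk,bhrk->bh', self.c, t)        # contract mode 3, sum over rank
        return torch.sigmoid(t)
```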
arXiv Detail & Related papers (2021-04-11T16:37:32Z)
- Kernel Based Progressive Distillation for Adder Neural Networks [71.731127378807]
Adder Neural Networks (ANNs) which only contain additions bring us a new way of developing deep neural networks with low energy consumption.
There is an accuracy drop when replacing all convolution filters by adder filters.
We present a novel method for further improving the performance of ANNs without increasing the trainable parameters.
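For context, the adder filter whose accuracy gap the distillation targets replaces dot products with negative L1 distances; below is a hypothetical, memory-heavy sketch of that response (the distillation method itself is not shown).

```python
import torch
import torch.nn.functional as F

def adder_response(x, weight, stride=1, padding=0):
    """Adder-style filter response: the negative L1 distance between each
    filter and each input patch, instead of a dot product."""
    b, _, h, w = x.shape
    o, _, k, _ = weight.shape
    patches = F.unfold(x, k, stride=stride, padding=padding)                  # (b, c*k*k, L)
    flat = weight.view(o, -1)                                                 # (o, c*k*k)
    out = -(patches.unsqueeze(1) - flat[None, :, :, None]).abs().sum(dim=2)   # (b, o, L)
    h_out = (h + 2 * padding - k) // stride + 1
    w_out = (w + 2 * padding - k) // stride + 1
    return out.view(b, o, h_out, w_out)
```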
arXiv Detail & Related papers (2020-09-28T03:29:19Z)
- Deep Networks with Fast Retraining [0.0]
This paper proposes a novel MP inverse-based fast retraining strategy for deep convolutional neural network (DCNN) learning.
During each training pass, a random learning strategy is first used to control the number of convolutional layers updated in the backward pass.
Then, an MP inverse-based batch-by-batch learning strategy, which enables the network to be implemented without access to industrial-scale computational resources, is developed.
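A minimal sketch of a Moore-Penrose (MP) inverse-style, batch-by-batch refit applied to a linear output layer; the function name, ridge regularization, and restriction to the last layer are assumptions for illustration.

```python
import torch

def mp_refit_output_layer(feature_batches, target_batches, ridge=1e-3):
    """Closed-form refit of a linear output layer. The Gram matrix is
    accumulated batch by batch, so no single pass needs all features in memory."""
    hth, htt = None, None
    for h, t in zip(feature_batches, target_batches):   # h: (n, d) features, t: (n, c) one-hot targets
        hth = h.T @ h if hth is None else hth + h.T @ h
        htt = h.T @ t if htt is None else htt + h.T @ t
    d = hth.shape[0]
    # ridge-regularized pseudo-inverse solution: (H^T H + lambda*I)^{-1} H^T T
    return torch.linalg.solve(hth + ridge * torch.eye(d), htt)
```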
arXiv Detail & Related papers (2020-08-13T15:17:38Z)
- Learning Sparse Filters in Deep Convolutional Neural Networks with a l1/l2 Pseudo-Norm [5.3791844634527495]
Deep neural networks (DNNs) have proven to be efficient for numerous tasks, but come at a high memory and computation cost.
Recent research has shown that their structure can be more compact without compromising their performance.
We present a sparsity-inducing regularization term based on the ratio l1/l2 pseudo-norm defined on the filter coefficients.
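The regularizer itself is compact; here is a sketch of a per-filter l1/l2 ratio penalty (the function name, epsilon, and summed reduction are assumptions).

```python
import torch

def l1_over_l2_penalty(conv_weight, eps=1e-8):
    """Sparsity-inducing l1/l2 ratio, computed per filter and summed.
    conv_weight: (out_channels, in_channels, k, k)."""
    w = conv_weight.flatten(1)                            # one row per filter
    return (w.abs().sum(dim=1) / (w.norm(dim=1) + eps)).sum()
```

Such a term would typically be added to the task loss with a small weighting factor.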
arXiv Detail & Related papers (2020-07-20T11:56:12Z)
- A Neural Network Approach for Online Nonlinear Neyman-Pearson Classification [3.6144103736375857]
We propose a novel Neyman-Pearson (NP) classifier that is both online and nonlinear, for the first time in the literature.
The proposed classifier operates on a binary labeled data stream in an online manner and maximizes the detection power while keeping the false positive rate at a user-specified, controllable level.
Our algorithm is suitable for large-scale data applications and provides decent false-positive-rate controllability with real-time processing.
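One simple way to picture online control of the false positive rate is a running-FPR threshold update; the sketch below is a hypothetical controller for intuition only, not the paper's algorithm.

```python
def np_threshold_step(score, label, threshold, fpr_est, target_fpr, lr=0.01, beta=0.99):
    """Hypothetical online Neyman-Pearson-style update: track a running false
    positive rate on negative examples and nudge the decision threshold toward
    the user-specified target rate."""
    predicted_positive = score > threshold
    if label == 0:                                    # negative example observed
        fpr_est = beta * fpr_est + (1 - beta) * float(predicted_positive)
        threshold += lr * (fpr_est - target_fpr)      # too many false alarms -> raise threshold
    return threshold, fpr_est
```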
arXiv Detail & Related papers (2020-06-14T20:00:25Z)
- Dependency Aware Filter Pruning [74.69495455411987]
Pruning a proportion of unimportant filters is an efficient way to mitigate the inference cost.
Previous work prunes filters according to their weight norms or the corresponding batch-norm scaling factors.
We propose a novel mechanism to dynamically control the sparsity-inducing regularization so as to achieve the desired sparsity.
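Dynamically controlling the sparsity-inducing regularization can be pictured as a feedback rule on the penalty weight; the sketch below is a hypothetical proportional controller, not the paper's mechanism.

```python
def update_reg_strength(lam, current_sparsity, target_sparsity, step=0.01):
    """Hypothetical controller: strengthen the sparsity-inducing penalty while
    the network is less sparse than desired, relax it once the target is met."""
    return max(0.0, lam + step * (target_sparsity - current_sparsity))
```

The returned value would then scale, for example, an L1 penalty on batch-norm scaling factors during the next training interval.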
arXiv Detail & Related papers (2020-05-06T07:41:22Z)
- Learning Low-rank Deep Neural Networks via Singular Vector Orthogonality Regularization and Singular Value Sparsification [53.50708351813565]
We propose SVD training, the first method to explicitly achieve low-rank DNNs during training without applying SVD on every step.
We empirically show that SVD training can significantly reduce the rank of DNN layers and achieve a greater reduction in computational load at the same accuracy.
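A minimal sketch of a layer kept in SVD form during training, with orthogonality and singular-value-sparsity penalties; the class name, initialization, and penalty weights are assumptions for illustration.

```python
import torch
import torch.nn as nn

class SVDLinear(nn.Module):
    """Linear layer parameterized as U diag(s) V^T so the rank can be reduced
    by sparsifying s during training."""
    def __init__(self, in_f, out_f, rank):
        super().__init__()
        self.u = nn.Parameter(torch.randn(out_f, rank) * 0.1)
        self.s = nn.Parameter(torch.ones(rank))
        self.v = nn.Parameter(torch.randn(in_f, rank) * 0.1)

    def forward(self, x):                                 # x: (batch, in_f)
        return x @ self.v @ torch.diag(self.s) @ self.u.T

    def regularizer(self, ortho_w=1.0, sparse_w=1e-3):
        eye = torch.eye(self.s.numel(), device=self.s.device)
        ortho = ((self.u.T @ self.u - eye) ** 2).sum() + ((self.v.T @ self.v - eye) ** 2).sum()
        return ortho_w * ortho + sparse_w * self.s.abs().sum()   # orthogonality + singular value sparsity
```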
arXiv Detail & Related papers (2020-04-20T02:40:43Z)
- Gradient Centralization: A New Optimization Technique for Deep Neural Networks [74.935141515523]
Gradient centralization (GC) operates directly on gradients by centralizing the gradient vectors to have zero mean.
GC can be viewed as a projected gradient descent method with a constrained loss function.
GC is very simple to implement and can be easily embedded into existing gradient based DNNs with only one line of code.
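The "one line" amounts to a per-output-channel (or per-row) mean subtraction on each weight gradient; a sketch of that operation is below (applying it just before the optimizer step is an assumed usage pattern).

```python
import torch

def centralize_gradient(grad: torch.Tensor) -> torch.Tensor:
    """Gradient Centralization: subtract the mean of the gradient over all
    dimensions except the first (output) dimension."""
    if grad.dim() > 1:
        grad = grad - grad.mean(dim=tuple(range(1, grad.dim())), keepdim=True)
    return grad

# typical use, just before optimizer.step():
# for p in model.parameters():
#     if p.grad is not None and p.grad.dim() > 1:
#         p.grad = centralize_gradient(p.grad)
```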
arXiv Detail & Related papers (2020-04-03T10:25:00Z)
- Volterra Neural Networks (VNNs) [24.12314339259243]
We propose a Volterra filter-inspired Network architecture to reduce the complexity of Convolutional Neural Networks.
We show an efficient parallel implementation of this Volterra Neural Network (VNN) along with its remarkable performance.
The proposed approach is evaluated on the UCF-101 and HMDB-51 datasets for action recognition, and is shown to outperform state-of-the-art CNN approaches.
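One way to obtain an efficient parallel implementation of a quadratic Volterra layer is to factor the second-order kernel into a few products of ordinary convolutions; the sketch below is an illustrative low-rank variant, not necessarily the authors' exact layer.

```python
import torch.nn as nn

class LowRankVolterra2d(nn.Module):
    """Quadratic Volterra response approximated by q rank-1 pairs of ordinary
    convolutions, avoiding the full second-order kernel."""
    def __init__(self, in_ch, out_ch, k, q=4):
        super().__init__()
        self.linear = nn.Conv2d(in_ch, out_ch, k, padding=k // 2)
        self.a = nn.ModuleList([nn.Conv2d(in_ch, out_ch, k, padding=k // 2) for _ in range(q)])
        self.b = nn.ModuleList([nn.Conv2d(in_ch, out_ch, k, padding=k // 2) for _ in range(q)])

    def forward(self, x):
        y = self.linear(x)                    # first-order term
        for a, b in zip(self.a, self.b):
            y = y + a(x) * b(x)               # one rank-1 quadratic term per pair
        return y
```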
arXiv Detail & Related papers (2019-10-21T19:22:38Z)
- Gabor Convolutional Networks [103.87356592690669]
We propose a new deep model, termed Gabor Convolutional Networks (GCNs), which incorporates Gabor filters into deep convolutional neural networks (DCNNs).
GCNs can be easily implemented and are compatible with any popular deep learning architecture.
Experimental results demonstrate the strong capability of our algorithm in recognizing objects whose scale and rotation change frequently.
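Incorporating Gabor filters typically means modulating learned kernels with fixed Gabor orientation filters; below is a sketch of generating one such filter and modulating a learned weight tensor with it (the parameter values and helper names are assumptions).

```python
import math
import torch

def gabor_kernel(k, theta, sigma=2.0, lambd=4.0, gamma=0.5):
    """Real part of a k x k Gabor filter at orientation theta (radians)."""
    half = (k - 1) / 2
    y, x = torch.meshgrid(torch.arange(k) - half, torch.arange(k) - half, indexing='ij')
    xr = x * math.cos(theta) + y * math.sin(theta)
    yr = -x * math.sin(theta) + y * math.cos(theta)
    envelope = torch.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
    return envelope * torch.cos(2 * math.pi * xr / lambd)

def modulate(weight, theta):
    """Element-wise modulation of a learned conv weight (out_ch, in_ch, k, k)
    by a fixed Gabor filter at one orientation."""
    return weight * gabor_kernel(weight.shape[-1], theta).to(weight)
```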
arXiv Detail & Related papers (2017-05-03T14:37:55Z)