Related papers: Depth-wise Decomposition for Accelerating Separable Convolutions in Efficient Convolutional Neural Networks

Depth-wise Decomposition for Accelerating Separable Convolutions in Efficient Convolutional Neural Networks

URL: http://arxiv.org/abs/1910.09455v3
Date: Sat, 23 Sep 2023 05:28:20 GMT
Title: Depth-wise Decomposition for Accelerating Separable Convolutions in Efficient Convolutional Neural Networks
Authors: Yihui He, Jianing Qian, Jianren Wang, Cindy X. Le, Congrui Hetang, Qi Lyu, Wenping Wang, Tianwei Yue
Abstract summary: Deep convolutional neural networks (CNNs) have been established as the primary methods for many computer vision tasks. Recently, depth-wise separable convolution has been proposed for image recognition tasks on computationally limited platforms. We propose a novel decomposition approach based on SVD, namely depth-wise decomposition, for expanding regular convolutions into depthwise separable convolutions.
Score: 36.64158994999578
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Very deep convolutional neural networks (CNNs) have been firmly established as the primary methods for many computer vision tasks. However, most state-of-the-art CNNs are large, which results in high inference latency. Recently, depth-wise separable convolution has been proposed for image recognition tasks on computationally limited platforms such as robotics and self-driving cars. Though it is much faster than its counterpart, regular convolution, accuracy is sacrificed. In this paper, we propose a novel decomposition approach based on SVD, namely depth-wise decomposition, for expanding regular convolutions into depthwise separable convolutions while maintaining high accuracy. We show our approach can be further generalized to the multi-channel and multi-layer cases, based on Generalized Singular Value Decomposition (GSVD) [59]. We conduct thorough experiments with the latest ShuffleNet V2 model [47] on both random synthesized dataset and a large-scale image recognition dataset: ImageNet [10]. Our approach outperforms channel decomposition [73] on all datasets. More importantly, our approach improves the Top-1 accuracy of ShuffleNet V2 by ~2%.

Related papers

The R2D2 deep neural network series paradigm for fast precision imaging in radio astronomy [1.7249361224827533]
Recent image reconstruction techniques have remarkable capability for imaging precision, well beyond CLEAN's capability. We introduce a novel deep learning approach, dubbed "Residual-to-Residual DNN series for high-Dynamic range imaging" R2D2's capability to deliver high precision is demonstrated in simulation, across a variety image observation settings using the Very Large Array (VLA)
arXiv Detail & Related papers (2024-03-08T16:57:54Z)
Revisiting Deformable Convolution for Depth Completion [40.45231083385708]
Depth completion aims to generate high-quality dense depth maps from sparse depth maps. Previous work usually employs RGB images as guidance, and introduces iterative spatial propagation to refine estimated coarse depth maps. We propose an effective architecture that leverages deformable kernel convolution as a single-pass refinement module.
arXiv Detail & Related papers (2023-08-03T17:59:06Z)
Global-Local Path Networks for Monocular Depth Estimation with Vertical CutDepth [24.897377434844266]
We propose a novel structure and training strategy for monocular depth estimation. We deploy a hierarchical transformer encoder to capture and convey the global context, and design a lightweight yet powerful decoder. Our network achieves state-of-the-art performance over the challenging depth dataset NYU Depth V2.
arXiv Detail & Related papers (2022-01-19T06:37:21Z)
Towards Comprehensive Monocular Depth Estimation: Multiple Heads Are Better Than One [32.01675089157679]
We propose to integrate the strengths of multiple weak depth predictor to build a comprehensive and accurate depth predictor. Specifically, we construct multiple base (weak) depth predictors by utilizing different Transformer-based and convolutional neural network (CNN)-based architectures. The resultant model, which we refer to as Transformer-assisted depth ensembles (TEDepth), achieves better results than previous state-of-the-art approaches.
arXiv Detail & Related papers (2021-11-16T09:09:05Z)
FuSeConv: Fully Separable Convolutions for Fast Inference on Systolic Arrays [2.8583189395674653]
We propose FuSeConv as a drop-in replacement for depth-wise separable convolution. FuSeConv generalizes the decomposition of convolutions fully to separable 1D convolutions along spatial and depth dimensions. We achieve a significant speed-up of 3x-7x with the MobileNet family of networks on a systolic array of size 64x64, with comparable accuracy on the ImageNet dataset.
arXiv Detail & Related papers (2021-05-27T20:19:39Z)
Adaptive Context-Aware Multi-Modal Network for Depth Completion [107.15344488719322]
We propose to adopt the graph propagation to capture the observed spatial contexts. We then apply the attention mechanism on the propagation, which encourages the network to model the contextual information adaptively. Finally, we introduce the symmetric gated fusion strategy to exploit the extracted multi-modal features effectively. Our model, named Adaptive Context-Aware Multi-Modal Network (ACMNet), achieves the state-of-the-art performance on two benchmarks.
arXiv Detail & Related papers (2020-08-25T06:00:06Z)
DO-Conv: Depthwise Over-parameterized Convolutional Layer [66.46704754669169]
We propose to augment a convolutional layer with an additional depthwise convolution, where each input channel is convolved with a different 2D kernel. We show with extensive experiments that the mere replacement of conventional convolutional layers with DO-Conv layers boosts the performance of CNNs.
arXiv Detail & Related papers (2020-06-22T06:57:10Z)
Binarizing MobileNet via Evolution-based Searching [66.94247681870125]
We propose a use of evolutionary search to facilitate the construction and training scheme when binarizing MobileNet. Inspired by one-shot architecture search frameworks, we manipulate the idea of group convolution to design efficient 1-Bit Convolutional Neural Networks (CNNs) Our objective is to come up with a tiny yet efficient binary neural architecture by exploring the best candidates of the group convolution.
arXiv Detail & Related papers (2020-05-13T13:25:51Z)
Training Binary Neural Networks with Real-to-Binary Convolutions [52.91164959767517]
We show how to train binary networks to within a few percent points of the full precision counterpart. We show how to build a strong baseline, which already achieves state-of-the-art accuracy. We show that, when putting all of our improvements together, the proposed model beats the current state of the art by more than 5% top-1 accuracy on ImageNet.
arXiv Detail & Related papers (2020-03-25T17:54:38Z)
Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction. We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data. Compared to the most competitive baselines, we show significant improvements in classification accuracy, under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)

This list is automatically generated from the titles and abstracts of the papers in this site.