SMOF: Squeezing More Out of Filters Yields Hardware-Friendly CNN Pruning
- URL: http://arxiv.org/abs/2110.10842v1
- Date: Thu, 21 Oct 2021 00:58:20 GMT
- Title: SMOF: Squeezing More Out of Filters Yields Hardware-Friendly CNN Pruning
- Authors: Yanli Liu, Bochen Guan, Qinwen Xu, Weiyi Li, and Shuxue Quan
- Abstract summary: We develop a CNN pruning framework called SMOF, which Squeezes More Out of Filters by reducing both kernel size and the number of filter channels.
SMOF is friendly to standard hardware devices without any customized low-level implementations.
We also support these claims via extensive experiments on various CNN structures and general-purpose processors for mobile devices.
- Score: 2.1481785388161536
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: For many years, the family of convolutional neural networks (CNNs) has been a
workhorse in deep learning. Recently, many novel CNN structures have been
designed to address increasingly challenging tasks. To make them work
efficiently on edge devices, researchers have proposed various structured
network pruning strategies to reduce their memory and computational cost.
However, most of them only focus on reducing the number of filter channels per
layer without considering the redundancy within individual filter channels. In
this work, we explore pruning from another dimension, the kernel size. We
develop a CNN pruning framework called SMOF, which Squeezes More Out of Filters
by reducing both kernel size and the number of filter channels. Notably, SMOF
is friendly to standard hardware devices without any customized low-level
implementations, and the pruning effort by kernel size reduction does not
suffer from the fixed-size width constraint in SIMD units of general-purpose
processors. The pruned networks can be deployed effortlessly with significant
running time reduction. We also support these claims via extensive experiments
on various CNN structures and general-purpose processors for mobile devices.
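The paper itself ships no code here, but the kernel-size dimension of pruning is easy to illustrate. Below is a minimal PyTorch sketch, not the authors' implementation: SMOF decides during training which kernel positions to drop, whereas this hypothetical `shrink_kernel` helper simply crops an existing kernel to a smaller centered window.

```python
import torch
import torch.nn as nn

def shrink_kernel(conv: nn.Conv2d, keep: int) -> nn.Conv2d:
    """Crop a KxK convolution to a centered (2*keep+1)x(2*keep+1) kernel.

    Pruning along the kernel-size dimension discards border weight
    positions entirely, so the result is an ordinary smaller convolution
    that standard hardware runs without customized low-level code.
    """
    k = conv.kernel_size[0]
    c = k // 2                                  # center index of the kernel
    lo, hi = c - keep, c + keep + 1
    new_k = hi - lo
    pruned = nn.Conv2d(conv.in_channels, conv.out_channels, new_k,
                       stride=conv.stride, padding=new_k // 2,
                       bias=conv.bias is not None)
    with torch.no_grad():
        pruned.weight.copy_(conv.weight[:, :, lo:hi, lo:hi])
        if conv.bias is not None:
            pruned.bias.copy_(conv.bias)
    return pruned

conv = nn.Conv2d(16, 32, kernel_size=3, padding=1)
pruned = shrink_kernel(conv, keep=0)        # 3x3 -> 1x1: ~9x fewer weight FLOPs
x = torch.randn(1, 16, 28, 28)
print(conv(x).shape, pruned(x).shape)       # both (1, 32, 28, 28)
```

Because the pruned layer is a plain smaller convolution, it sidesteps the fixed-width SIMD constraint mentioned in the abstract: no masked or sparse kernels are needed at inference time.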
Related papers
- Deep Network Pruning: A Comparative Study on CNNs in Face Recognition [47.114282145442616]
We study methods for deep network compression applied to face recognition.
The methods are tested on three networks based on the small SqueezeNet (1.24M parameters), the popular MobileNetv2 (3.5M), and ResNet50 (23.5M).
We observe that a substantial percentage of filters can be removed with minimal performance loss.
arXiv Detail & Related papers (2024-05-28T15:57:58Z)
- Interspace Pruning: Using Adaptive Filter Representations to Improve Training of Sparse CNNs [69.3939291118954]
Unstructured pruning is well suited to reducing the memory footprint of convolutional neural networks (CNNs).
Standard unstructured pruning (SP) does so by setting individual filter elements to zero.
We introduce interspace pruning (IP), a general tool to improve existing pruning methods.
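For contrast with structured schemes like SMOF, the standard unstructured pruning (SP) summarized above can be sketched in a few lines of PyTorch (the sparsity level is an illustrative assumption). Note that the weight tensor keeps its dense shape, which is why SP helps memory only under sparse storage and gives no direct speedup on SIMD hardware.

```python
import torch
import torch.nn as nn

def magnitude_prune(conv: nn.Conv2d, sparsity: float) -> torch.Tensor:
    """Standard unstructured pruning: zero the smallest-magnitude filter
    elements. Returns the binary mask so it can be re-applied after each
    training step to keep the pruned positions at zero."""
    w = conv.weight.data
    k = int(w.numel() * sparsity)               # number of elements to zero
    threshold = w.abs().flatten().kthvalue(k).values
    mask = (w.abs() > threshold).float()
    w.mul_(mask)
    return mask

conv = nn.Conv2d(8, 16, kernel_size=3)
mask = magnitude_prune(conv, sparsity=0.9)
print(f"nonzero fraction: {mask.mean().item():.2f}")  # ~0.10
```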
arXiv Detail & Related papers (2022-03-15T11:50:45Z)
- An Adaptive Device-Edge Co-Inference Framework Based on Soft Actor-Critic [72.35307086274912]
High-dimensional parameter models and large-scale mathematical calculations restrict execution efficiency, especially on Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL) method, Soft Actor-Critic for discrete actions (SAC-d), which generates the exit point and compressing bits by soft policy iterations.
Based on a latency- and accuracy-aware reward design, the computation adapts well to complex environments such as dynamic wireless channels and arbitrary processing, and is capable of supporting 5G URLLC.
arXiv Detail & Related papers (2022-01-09T09:31:50Z)
- Manipulating Identical Filter Redundancy for Efficient Pruning on Deep and Complicated CNN [126.88224745942456]
We propose a novel Centripetal SGD (C-SGD) to make some filters identical, resulting in ideal redundancy patterns.
C-SGD delivers better performance than existing methods because the redundancy is better organized.
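A simplified rendering of the centripetal idea (cluster assignments, learning rate, and the pull strength `eps` below are illustrative assumptions, not the paper's settings): filters in the same cluster receive their cluster-averaged gradient plus a pull toward the cluster mean, so their pairwise differences shrink by a factor of (1 - eps) per step until they are identical and can be merged.

```python
import torch

def centripetal_step(weights, grads, clusters, lr=0.01, eps=0.003):
    """One simplified centripetal-SGD-style update.

    weights, grads: (num_filters, filter_dim) tensors;
    clusters: list of lists of filter indices to be made identical."""
    with torch.no_grad():
        for idx in clusters:
            g_mean = grads[idx].mean(dim=0)     # shared gradient direction
            w_mean = weights[idx].mean(dim=0)   # cluster center
            weights[idx] -= lr * g_mean + eps * (weights[idx] - w_mean)

# toy example: 4 filters, each a flattened 3x3 kernel
w, g = torch.randn(4, 9), torch.randn(4, 9)
for _ in range(1000):
    centripetal_step(w, g, clusters=[[0, 1], [2, 3]])
print(torch.norm(w[0] - w[1]))                  # ~0: filters 0 and 1 now merge
```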
arXiv Detail & Related papers (2021-07-30T06:18:19Z)
- Spectral Leakage and Rethinking the Kernel Size in CNNs [10.432041176720842]
We show that the small size of CNN kernels makes them susceptible to spectral leakage.
We demonstrate improved classification accuracy over baselines with conventional $3\times 3$ kernels.
We also show that CNNs employing the Hamming window display increased robustness against certain types of adversarial attacks.
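The windowing idea is simple to sketch: taper each kernel with a 2D Hamming window (the outer product of two 1D windows) so the kernel's sharp spatial truncation leaks less energy across the spectrum. Applying the window once to a built layer, as below, is an illustrative simplification; in the paper's setting the window shapes the kernels throughout training.

```python
import numpy as np
import torch
import torch.nn as nn

def apply_hamming_window(conv: nn.Conv2d) -> None:
    """Taper the layer's kernels with a 2D Hamming window, attenuating
    edge weights to suppress spectral leakage."""
    k = conv.kernel_size[0]
    w1d = np.hamming(k)                          # 1D window of length k
    w2d = torch.tensor(np.outer(w1d, w1d), dtype=conv.weight.dtype)
    with torch.no_grad():
        conv.weight.mul_(w2d)                    # broadcasts over (out, in, k, k)

conv = nn.Conv2d(3, 8, kernel_size=5, padding=2)
apply_hamming_window(conv)
```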
arXiv Detail & Related papers (2021-01-25T14:49:29Z)
- CNNPruner: Pruning Convolutional Neural Networks with Visual Analytics [13.38218193857018]
Convolutional neural networks (CNNs) have demonstrated extraordinarily good performance in many computer vision tasks.
CNNPruner allows users to interactively create pruning plans according to a desired goal on model size or accuracy.
arXiv Detail & Related papers (2020-09-08T02:08:20Z)
- Cluster Pruning: An Efficient Filter Pruning Method for Edge AI Vision Applications [13.197955183748796]
A novel greedy approach called cluster pruning has been proposed, which provides a structured way of removing filters in a CNN.
A low-cost IoT hardware setup consisting of an Intel Movidius-NCS is proposed to deploy an edge-AI application using our proposed pruning methodology.
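A sketch of the cluster-pruning idea (the choice of k-means and the number of retained filters are assumptions for illustration, not necessarily the paper's exact procedure): group similar filters and keep only the member closest to each cluster centroid, treating the rest as redundant.

```python
import torch.nn as nn
from sklearn.cluster import KMeans

def cluster_prune_indices(conv: nn.Conv2d, n_keep: int):
    """Cluster flattened filters with k-means and return, per cluster,
    the index of the filter closest to the centroid; all other filters
    are candidates for structured removal."""
    w = conv.weight.detach().reshape(conv.out_channels, -1).numpy()
    km = KMeans(n_clusters=n_keep, n_init=10).fit(w)
    keep = []
    for c in range(n_keep):
        members = (km.labels_ == c).nonzero()[0]
        d = ((w[members] - km.cluster_centers_[c]) ** 2).sum(axis=1)
        keep.append(int(members[d.argmin()]))    # representative filter
    return sorted(keep)

conv = nn.Conv2d(3, 16, kernel_size=3)
print(cluster_prune_indices(conv, n_keep=8))     # indices of filters to retain
```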
arXiv Detail & Related papers (2020-03-05T06:20:09Z)
- Pruning CNN's with linear filter ensembles [0.0]
We use pruning to reduce the network size and -- implicitly -- the number of floating point operations (FLOPs)
We develop a novel filter importance norm that is based on the change in the empirical loss caused by the presence or removal of a component from the network architecture.
We evaluate our method on a fully connected network, as well as on the ResNet architecture trained on the CIFAR-10 dataset.
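A simplified rendering of the loss-change criterion (the `loss_fn` helper and the toy model are assumptions for illustration; the paper builds linear filter ensembles on top of this signal): zero one filter at a time, measure how much the empirical loss moves, and restore the filter.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def filter_importance(model, conv, loss_fn):
    """Score each filter by |loss(without filter) - loss(with filter)|:
    temporarily zero the filter, re-evaluate on held-out data, restore."""
    base = loss_fn(model)
    scores = []
    with torch.no_grad():
        for i in range(conv.out_channels):
            saved = conv.weight[i].clone()
            conv.weight[i].zero_()               # simulate removing filter i
            scores.append(abs(loss_fn(model) - base))
            conv.weight[i].copy_(saved)          # restore it
    return scores

# toy usage: a tiny model and one fixed batch standing in for a dataset
model = nn.Sequential(nn.Conv2d(1, 4, 3), nn.Flatten(), nn.LazyLinear(2))
x, y = torch.randn(8, 1, 8, 8), torch.randint(0, 2, (8,))
model(x)                                         # materialize the lazy layer
loss_fn = lambda m: F.cross_entropy(m(x), y).item()
print(filter_importance(model, model[0], loss_fn))
```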
arXiv Detail & Related papers (2020-01-22T16:52:06Z)
- Filter Grafting for Deep Neural Networks [71.39169475500324]
Filter grafting aims to improve the representation capability of Deep Neural Networks (DNNs)
We develop an entropy-based criterion to measure the information of filters and an adaptive weighting strategy for balancing the grafted information among networks.
For example, the grafted MobileNetV2 outperforms the non-grafted MobileNetV2 by about 7 percent on the CIFAR-100 dataset.
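The entropy criterion can be sketched directly (the histogram bin count is an illustrative assumption): a filter whose weights are nearly constant has low Shannon entropy, carries little information, and is the natural target to be grafted over with a filter from a peer network.

```python
import torch

def filter_entropy(weight: torch.Tensor, bins: int = 10) -> torch.Tensor:
    """Entropy-based information measure per filter: histogram each
    filter's weights and compute the Shannon entropy of the bin
    distribution. weight has shape (out_ch, in_ch, k, k)."""
    scores = []
    for f in weight:
        hist = torch.histc(f, bins=bins)
        p = hist / hist.sum()
        p = p[p > 0]                             # drop empty bins (0*log 0 = 0)
        scores.append(-(p * p.log()).sum())
    return torch.stack(scores)

w = torch.randn(16, 8, 3, 3)
print(filter_entropy(w).argsort()[:3])           # three least-informative filters
```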
arXiv Detail & Related papers (2020-01-15T03:18:57Z)
- Approximation and Non-parametric Estimation of ResNet-type Convolutional Neural Networks [52.972605601174955]
We show a ResNet-type CNN can attain the minimax optimal error rates in important function classes.
We derive approximation and estimation error rates of the aforementioned type of CNNs for the Barron and Hölder classes.
arXiv Detail & Related papers (2019-03-24T19:42:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.