EAPruning: Evolutionary Pruning for Vision Transformers and CNNs
- URL: http://arxiv.org/abs/2210.00181v1
- Date: Sat, 1 Oct 2022 03:38:56 GMT
- Title: EAPruning: Evolutionary Pruning for Vision Transformers and CNNs
- Authors: Qingyuan Li, Bo Zhang, Xiangxiang Chu
- Abstract summary: We undertake a simple and effective approach that can be easily applied to both vision transformers and convolutional neural networks.
We achieve a 50% FLOPs reduction for ResNet50 and MobileNetV1, leading to 1.37x and 1.34x speedups, respectively.
- Score: 11.994217333212736
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Structured pruning greatly eases the deployment of large neural networks in
resource-constrained environments. However, current methods either involve
strong domain expertise, require extra hyperparameter tuning, or are restricted
only to a specific type of network, which prevents pervasive industrial
applications. In this paper, we undertake a simple and effective approach that
can be easily applied to both vision transformers and convolutional neural
networks. Specifically, we consider pruning as an evolution process of
sub-network structures that inherit weights through reconstruction techniques.
We achieve a 50% FLOPs reduction for ResNet50 and MobileNetV1, leading to 1.37x
and 1.34x speedups, respectively. For DeiT-Base, we reach nearly 40% FLOPs
reduction and 1.4x speedup. Our code will be made available.
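To make the evolutionary view of pruning above concrete, here is a minimal sketch that evolves per-layer keep-ratios under a FLOPs budget. The encoding, the mutation/crossover operators, and the helper names (`fitness_fn`, `flops_fn`) are assumptions for illustration only; weight inheritance through reconstruction is abstracted away inside `fitness_fn` and is not shown.

```python
import random

KEEP_RATIOS = (0.25, 0.5, 0.75, 1.0)   # candidate keep-ratios per prunable layer (assumed)
NUM_LAYERS = 16
FLOPS_BUDGET = 0.5                      # e.g. keep at most 50% of the dense model's FLOPs


def random_subnet():
    return [random.choice(KEEP_RATIOS) for _ in range(NUM_LAYERS)]


def mutate(subnet, p=0.2):
    return [random.choice(KEEP_RATIOS) if random.random() < p else r for r in subnet]


def crossover(a, b):
    return [random.choice(pair) for pair in zip(a, b)]


def evolve(fitness_fn, flops_fn, generations=20, pop_size=50, topk=10):
    """fitness_fn(subnet): inherit weights (e.g. via reconstruction) and return
    validation accuracy; flops_fn(subnet): relative FLOPs of the sub-network."""
    parents = [random_subnet() for _ in range(topk)]
    population = list(parents)
    for _ in range(generations):
        feasible = [s for s in population if flops_fn(s) <= FLOPS_BUDGET]
        parents = sorted(feasible, key=fitness_fn, reverse=True)[:topk] or parents
        population = [mutate(crossover(random.choice(parents), random.choice(parents)))
                      for _ in range(pop_size)]
    return max(parents, key=fitness_fn)
```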
Related papers
- Auto-Train-Once: Controller Network Guided Automatic Network Pruning from Scratch [72.26822499434446]
Auto-Train-Once (ATO) is an innovative network pruning algorithm designed to automatically reduce the computational and storage costs of DNNs.
We provide a comprehensive convergence analysis as well as extensive experiments, and the results show that our approach achieves state-of-the-art performance across various model architectures.
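The summary above only states that a controller network guides pruning while the model trains from scratch. As a heavily simplified sketch of that general idea (not ATO's actual mechanism), one can attach a learnable soft gate to each layer's channels and penalise it towards sparsity during training; every name and design choice below is an assumption.

```python
import torch
import torch.nn as nn

class ChannelGate(nn.Module):
    """Hypothetical per-layer controller output: one learnable soft gate per channel."""
    def __init__(self, num_channels):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(num_channels))

    def forward(self, x):                                  # x: (batch, channels, H, W)
        gate = torch.sigmoid(self.logits).view(1, -1, 1, 1)
        return x * gate                                    # channels pushed to 0 can be pruned later

    def sparsity_loss(self):
        return torch.sigmoid(self.logits).mean()           # add to the training objective
```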
arXiv Detail & Related papers (2024-03-21T02:33:37Z)
- Pruning Very Deep Neural Network Channels for Efficient Inference [6.497816402045099]
Given a trained CNN model, we propose an iterative two-step algorithm to effectively prune each layer.
On VGG-16, it achieves state-of-the-art results with a 5x speed-up and only a 0.3% increase in error.
The method also accelerates modern networks such as ResNet and Xception, with only 1.4% and 1.0% accuracy loss, respectively, under a 2x speed-up.
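A common way to realise the reconstruction half of such a two-step channel-pruning procedure is a least-squares fit of the remaining weights against the original layer's output. The snippet below shows only that generic step, under assumed shapes; the paper's channel-selection criterion is not reproduced here.

```python
import numpy as np

def reconstruct_layer(X, W, keep):
    """X: (n_samples, c_in) sampled layer inputs; W: (c_in, c_out) original weights;
    keep: indices of retained input channels. Returns new weights for the kept
    channels that best reproduce the unpruned layer's output (least squares)."""
    Y = X @ W                                        # output of the unpruned layer
    W_new, *_ = np.linalg.lstsq(X[:, keep], Y, rcond=None)
    return W_new                                     # shape (len(keep), c_out)
```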
arXiv Detail & Related papers (2022-11-14T06:48:33Z)
- CHEX: CHannel EXploration for CNN Model Compression [47.3520447163165]
We propose a novel Channel Exploration methodology, dubbed CHEX, to rectify these problems.
CHEX repeatedly prunes and regrows channels throughout the training process, which reduces the risk of pruning important channels prematurely.
Results demonstrate that CHEX can effectively reduce the FLOPs of diverse CNN architectures on a variety of computer vision tasks.
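A bare-bones sketch of the prune-and-regrow idea: periodically zero out low-importance channels, but let a few pruned channels come back so early decisions are not final. The L2-norm importance score and random regrowth below are placeholders, not CHEX's actual criteria.

```python
import torch

def prune_and_regrow(weight, sparsity=0.5, regrow_frac=0.1):
    """weight: conv kernel of shape (c_out, c_in, k, k). Returns a boolean mask over
    output channels; apply it between training steps, e.g. weight.data[~mask] = 0."""
    importance = weight.flatten(1).norm(dim=1)               # per-output-channel L2 norm
    num_keep = int(weight.shape[0] * (1 - sparsity))
    mask = torch.zeros(weight.shape[0], dtype=torch.bool)
    mask[torch.topk(importance, num_keep).indices] = True
    pruned = (~mask).nonzero().flatten()
    if len(pruned) > 0:                                      # regrow a few pruned channels at random
        num_regrow = max(1, int(regrow_frac * len(pruned)))
        mask[pruned[torch.randperm(len(pruned))[:num_regrow]]] = True
    return mask
```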
arXiv Detail & Related papers (2022-03-29T17:52:41Z)
- FQ-ViT: Fully Quantized Vision Transformer without Retraining [13.82845665713633]
We present a systematic method to reduce the performance degradation and inference complexity of Quantized Transformers.
We are the first to achieve comparable accuracy (about 1% degradation) on fully quantized Vision Transformers.
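For reference, plain min-max uniform quantization looks like the sketch below. FQ-ViT's contribution is making full quantization work for the difficult parts of a ViT (e.g. LayerNorm and Softmax) without retraining, which this generic snippet does not capture.

```python
import torch

def fake_quantize(x, num_bits=8):
    """Min-max uniform quantization followed by dequantization ("fake quant")."""
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (x.max() - x.min()).clamp(min=1e-8) / (qmax - qmin)
    zero_point = qmin - torch.round(x.min() / scale)
    q = torch.clamp(torch.round(x / scale) + zero_point, qmin, qmax)
    return (q - zero_point) * scale
```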
arXiv Detail & Related papers (2021-11-27T06:20:53Z)
- DS-Net++: Dynamic Weight Slicing for Efficient Inference in CNNs and Transformers [105.74546828182834]
We show a hardware-efficient dynamic inference regime, named dynamic weight slicing, which adaptively slices a part of the network parameters for inputs of diverse difficulty levels.
We present the dynamic slimmable network (DS-Net) and the dynamic slice-able network (DS-Net++), which input-dependently adjust the number of filters in CNNs and multiple dimensions in both CNNs and transformers.
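A toy version of input-dependent slicing: a small gate looks at pooled features and decides how many of a convolution's output filters to run for the current input. The gate and the hard per-batch choice below are illustrative assumptions (suitable only for inference), not DS-Net++'s architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SlicableConv(nn.Module):
    def __init__(self, c_in, c_out, k=3, ratios=(0.25, 0.5, 1.0)):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, k, padding=k // 2)
        self.gate = nn.Linear(c_in, len(ratios))     # tiny router over slice ratios
        self.ratios = ratios

    def forward(self, x):                            # x: (batch, c_in, H, W)
        idx = self.gate(x.mean(dim=(2, 3))).mean(0).argmax().item()
        c = max(1, int(self.conv.out_channels * self.ratios[idx]))
        # Run only the first c filters of the shared weight tensor.
        return F.conv2d(x, self.conv.weight[:c], self.conv.bias[:c],
                        padding=self.conv.padding)
```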
arXiv Detail & Related papers (2021-09-21T09:57:21Z)
- Layer Folding: Neural Network Depth Reduction using Activation Linearization [0.0]
Modern devices exhibit a high level of parallelism, but real-time latency is still highly dependent on networks' depth.
We propose a method that learns whether non-linear activations can be removed, allowing consecutive linear layers to be folded into one.
We apply our method to networks pre-trained on CIFAR-10 and CIFAR-100 and find that they can all be transformed into shallower forms that share a similar depth.
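The folding step itself is elementary: once the activation between two affine layers is removed (i.e. it behaves as the identity), the composition collapses into a single affine layer. The check below verifies that identity; how the method learns which activations can be removed is the paper's contribution and is not shown.

```python
import numpy as np

def fold_linear_layers(W1, b1, W2, b2):
    """y = W2 @ (W1 @ x + b1) + b2  ==  (W2 @ W1) @ x + (W2 @ b1 + b2)."""
    return W2 @ W1, W2 @ b1 + b2

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((8, 4)), rng.standard_normal(8)
W2, b2 = rng.standard_normal((3, 8)), rng.standard_normal(3)
x = rng.standard_normal(4)
W, b = fold_linear_layers(W1, b1, W2, b2)
assert np.allclose(W @ x + b, W2 @ (W1 @ x + b1) + b2)
```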
arXiv Detail & Related papers (2021-06-17T08:22:46Z)
- Multi-Exit Semantic Segmentation Networks [78.44441236864057]
We propose a framework for converting state-of-the-art segmentation models to MESS networks:
specially trained CNNs that employ parametrised early exits along their depth to save computation during inference on easier samples.
We co-optimise the number, placement and architecture of the attached segmentation heads, along with the exit policy, to adapt to the device capabilities and application-specific requirements.
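A generic confidence-thresholded early-exit loop, just to make the mechanism concrete; the number, placement, and policy of MESS's exits are co-optimised as described above and are not reflected in this sketch, and `stages`/`heads` are assumed lists of modules.

```python
import torch.nn.functional as F

def early_exit_forward(stages, heads, x, threshold=0.9):
    """Run the backbone stage by stage; return the first head output whose
    maximum softmax confidence clears the threshold."""
    probs = None
    for stage, head in zip(stages, heads):
        x = stage(x)
        probs = F.softmax(head(x), dim=1)
        if probs.max().item() >= threshold:      # confident enough: stop early
            return probs
    return probs                                 # otherwise use the final exit
```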
arXiv Detail & Related papers (2021-06-07T11:37:03Z)
- Container: Context Aggregation Network [83.12004501984043]
A recent finding shows that a simple MLP-based solution without any traditional convolutional or Transformer components can produce effective visual representations.
We present CONTAINER (CONText AggregatIon NEtwoRk), a general-purpose building block for multi-head context aggregation.
In contrast to Transformer-based methods that do not scale well to downstream tasks that rely on larger input image resolutions, our efficient variant, CONTAINER-LIGHT, can be employed in object detection and instance segmentation networks.
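One way to read "context aggregation" is that a convolution uses a static, input-independent affinity between positions while attention computes a dynamic one, and the two can be mixed with learnable scalars. The block below sketches that reading; it is not the paper's exact design, and all shapes and names are assumptions.

```python
import torch
import torch.nn as nn

class ContextAggregation(nn.Module):
    def __init__(self, dim, num_tokens):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)
        self.static_affinity = nn.Parameter(torch.randn(num_tokens, num_tokens) * 0.02)
        self.alpha = nn.Parameter(torch.tensor(0.5))   # weight of the dynamic (attention) affinity
        self.beta = nn.Parameter(torch.tensor(0.5))    # weight of the static (conv-like) affinity

    def forward(self, x):                              # x: (batch, tokens, dim)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        dynamic = torch.softmax(q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5, dim=-1)
        affinity = self.alpha * dynamic + self.beta * self.static_affinity
        return affinity @ v
```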
arXiv Detail & Related papers (2021-06-02T18:09:11Z)
- IC Networks: Remodeling the Basic Unit for Convolutional Neural Networks [8.218732270970381]
"Inter-layer Collision" (IC) structure can be integrated into existing CNNs to improve their performance.
New training method, namely weak logit distillation (WLD), is proposed to speed up the training of IC networks.
In the ImageNet experiment, we integrate the IC structure into ResNet-50 and reduce the top-1 error from 22.38% to 21.75%.
arXiv Detail & Related papers (2021-02-06T03:15:43Z)
- MicroNet: Towards Image Recognition with Extremely Low FLOPs [117.96848315180407]
MicroNet is an efficient convolutional neural network with extremely low computational cost.
A family of MicroNets achieve a significant performance gain over the state-of-the-art in the low FLOP regime.
For instance, MicroNet-M1 achieves 61.1% top-1 accuracy on ImageNet classification with 12 MFLOPs, outperforming MobileNetV3 by 11.3%.
arXiv Detail & Related papers (2020-11-24T18:59:39Z)
- Rapid Structural Pruning of Neural Networks with Set-based Task-Adaptive Meta-Pruning [83.59005356327103]
A common limitation of most existing pruning techniques is that they require pre-training of the network at least once before pruning.
We propose STAMP, which task-adaptively prunes a network pretrained on a large reference dataset by generating a pruning mask on it as a function of the target dataset.
We validate STAMP against recent advanced pruning methods on benchmark datasets.
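A loose sketch of "a pruning mask as a function of the target dataset": summarise a handful of target examples in an order-invariant way and map the summary to per-channel keep probabilities. The encoder, pooling, and threshold below are placeholders, not STAMP's actual mask generator.

```python
import torch
import torch.nn as nn

class MaskGenerator(nn.Module):
    def __init__(self, feature_dim, num_channels):
        super().__init__()
        self.to_probs = nn.Sequential(nn.Linear(feature_dim, num_channels), nn.Sigmoid())

    def forward(self, target_features):             # (num_examples, feature_dim)
        summary = target_features.mean(dim=0)       # permutation-invariant set summary
        keep_prob = self.to_probs(summary)          # one keep-probability per channel
        return (keep_prob > 0.5).float()            # binary channel mask for this task
```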
arXiv Detail & Related papers (2020-06-22T10:57:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.