AntiDote: Attention-based Dynamic Optimization for Neural Network Runtime Efficiency
- URL: http://arxiv.org/abs/2008.06543v1
- Date: Fri, 14 Aug 2020 18:48:13 GMT
- Title: AntiDote: Attention-based Dynamic Optimization for Neural Network Runtime Efficiency
- Authors: Fuxun Yu, Chenchen Liu, Di Wang, Yanzhi Wang, Xiang Chen
- Abstract summary: We propose a dynamic CNN optimization framework based on the neural network attention mechanism.
Our method brings a 37.4% to 54.5% FLOPs reduction with a negligible accuracy drop on various test networks.
- Score: 42.00372941618975
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Convolutional Neural Networks (CNNs) achieve great cognitive performance at the expense of considerable computation load. To relieve this load, many optimization works reduce model redundancy by identifying and removing insignificant model components, through techniques such as weight sparsification and filter pruning. However, these works evaluate components' significance statically, using only internal parameter information and ignoring the components' dynamic interaction with external inputs. Because per-input feature activation can change a component's significance, such static methods achieve only sub-optimal results. We therefore propose a dynamic CNN optimization framework. Built on the neural network attention mechanism, it comprises (1) testing-phase channel and column feature-map pruning and (2) training-phase optimization by targeted dropout. This dynamic framework has several benefits: (1) by taking the model-input interaction into account, it accurately identifies and aggressively removes per-input feature redundancy; (2) thanks to its multi-dimension flexibility, it maximally removes feature-map redundancy across dimensions; (3) the training-testing co-optimization favors dynamic pruning and maintains model accuracy even at very high feature pruning ratios. Extensive experiments show that our method brings a 37.4% to 54.5% FLOPs reduction with a negligible accuracy drop on various test networks.
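To make the framework's two mechanisms concrete, below is a minimal PyTorch sketch under stated assumptions: an SE-style attention block stands in for the paper's attention mechanism (the abstract does not fix its architecture), and the names AttentionChannelGate, dynamic_channel_prune, and targeted_dropout, along with the keep/drop ratios, are illustrative. Column feature-map pruning would follow the same pattern along a spatial dimension.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionChannelGate(nn.Module):
    """Squeeze-and-excitation-style channel attention used as a per-input
    significance signal (an assumed stand-in for the paper's attention block)."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc1 = nn.Linear(channels, channels // reduction)
        self.fc2 = nn.Linear(channels // reduction, channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        s = x.mean(dim=(2, 3))                       # global average pool: (N, C)
        return torch.sigmoid(self.fc2(F.relu(self.fc1(s))))

def dynamic_channel_prune(x, scores, keep_ratio=0.6):
    """Testing-phase pruning: per input, keep only the top-scoring channels."""
    k = max(1, int(scores.size(1) * keep_ratio))
    topk = scores.topk(k, dim=1).indices             # significant channels per input
    mask = torch.zeros_like(scores).scatter_(1, topk, 1.0)
    return x * mask[:, :, None, None]

def targeted_dropout(x, scores, drop_ratio=0.4, p=0.5):
    """Training-phase targeted dropout (sketch): randomly drop channels drawn
    from the low-attention candidate set, so the network learns to tolerate
    their removal at test time."""
    k = int(scores.size(1) * drop_ratio)
    candidates = (-scores).topk(k, dim=1).indices    # least significant channels
    keep = (torch.rand_like(scores[:, :k]) >= p).float()
    mask = torch.ones_like(scores).scatter_(1, candidates, keep)
    return x * mask[:, :, None, None]

# Usage: prune at inference, target the same weak channels during training.
gate = AttentionChannelGate(64)
x = torch.randn(2, 64, 16, 16)
scores = gate(x)                                     # (2, 64), values in (0, 1)
pruned = dynamic_channel_prune(x, scores)            # inference path
noisy = targeted_dropout(x, scores)                  # training path
```

The coupling the abstract emphasizes is that training-phase targeted dropout hits exactly the low-attention channels that testing-phase pruning will later discard, which is what keeps accuracy stable at high pruning ratios.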
Related papers
- Optimizing Deep Neural Networks using Safety-Guided Self Compression [0.0]
This study introduces a novel safety-driven quantization framework that prunes and quantizes neural network weights.
The proposed methodology is rigorously evaluated on both a convolutional neural network (CNN) and an attention-based language model.
Experimental results reveal that our framework achieves up to a 2.5% enhancement in test accuracy relative to the original unquantized models.
arXiv Detail & Related papers (2025-05-01T06:50:30Z)
- Bayesian Optimization of a Lightweight and Accurate Neural Network for Aerodynamic Performance Prediction [0.0]
We propose a new approach to build efficient and accurate predictive models for aerodynamic performance prediction.
To clearly describe the interplay between design variables, hierarchical and categorical kernels are used in the Bayesian optimization (BO) formulation.
For the drag coefficient prediction task, the Mean Absolute Percentage Error (MAPE) of our optimized model drops from 0.1433% to 0.0163%.
Our model achieves a MAPE of 0.82% on a benchmark aircraft self-noise prediction problem, significantly outperforming existing models.
arXiv Detail & Related papers (2025-03-25T09:14:36Z)
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses the real-time demands of IoVT systems by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework that jointly optimizes the neural network architecture and its edge deployment.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- Model-Based Control with Sparse Neural Dynamics [23.961218902837807]
We propose a new framework for integrated model learning and predictive control.
We show that our framework can deliver better closed-loop performance than existing state-of-the-art methods.
arXiv Detail & Related papers (2023-12-20T06:25:02Z)
- Dynamically configured physics-informed neural network in topology optimization applications [4.403140515138818]
The physics-informed neural network (PINN) can avoid generating enormous amounts of data when solving forward problems.
A dynamically configured PINN-based topology optimization (DCPINN-TO) method is proposed.
The accuracy of the displacement prediction and optimization results indicate that the DCPINN-TO method is effective and efficient.
arXiv Detail & Related papers (2023-12-12T05:35:30Z)
- Explicit Foundation Model Optimization with Self-Attentive Feed-Forward Neural Units [4.807347156077897]
Iterative approximation methods using backpropagation enable the optimization of neural networks, but they remain computationally expensive when used at scale.
This paper presents an efficient alternative for optimizing neural networks that reduces the costs of scaling neural networks and provides high-efficiency optimizations for low-resource applications.
arXiv Detail & Related papers (2023-11-13T17:55:07Z)
- Towards Hyperparameter-Agnostic DNN Training via Dynamical System Insights [4.513581513983453]
We present ECCO-DNN, a first-order optimization method specialized for deep neural networks (DNNs).
This method models the optimization variable trajectory as a dynamical system and develops a discretization algorithm that adaptively selects step sizes based on the trajectory's shape.
arXiv Detail & Related papers (2023-10-21T03:45:13Z)
- Latency-aware Unified Dynamic Networks for Efficient Image Recognition [72.8951331472913]
LAUDNet is a framework to bridge the theoretical and practical efficiency gap in dynamic networks.
It integrates three primary dynamic paradigms: spatially adaptive computation, dynamic layer skipping, and dynamic channel skipping.
It can notably reduce the latency of models like ResNet by over 50% on platforms such as V100, 3090, and TX2 GPUs.
arXiv Detail & Related papers (2023-08-30T10:57:41Z)
- Efficient and Flexible Neural Network Training through Layer-wise Feedback Propagation [49.44309457870649]
We present Layer-wise Feedback Propagation (LFP), a novel training principle for neural network-like predictors.
LFP decomposes a reward to individual neurons based on their respective contributions to solving a given task.
Our method then implements a greedy approach reinforcing helpful parts of the network and weakening harmful ones.
arXiv Detail & Related papers (2023-08-23T10:48:28Z)
- Rate Distortion Characteristic Modeling for Neural Image Compression [59.25700168404325]
End-to-end optimization gives neural image compression (NIC) superior lossy compression performance.
However, distinct models must be trained to reach different points in the rate-distortion (R-D) space.
We formulate mathematical functions that describe the R-D behavior of NIC using deep networks and statistical modeling.
arXiv Detail & Related papers (2021-06-24T12:23:05Z)
- CNNPruner: Pruning Convolutional Neural Networks with Visual Analytics [13.38218193857018]
Convolutional neural networks (CNNs) have demonstrated extraordinarily good performance in many computer vision tasks.
CNNPruner allows users to interactively create pruning plans according to a desired goal on model size or accuracy.
arXiv Detail & Related papers (2020-09-08T02:08:20Z)
- Highly Efficient Salient Object Detection with 100K Parameters [137.74898755102387]
We propose a flexible convolutional module, namely generalized OctConv (gOctConv), to efficiently utilize both in-stage and cross-stage multi-scale features.
We build an extremely lightweight model, namely CSNet, which achieves performance comparable to large models with only about 0.2% of their parameters (100k) on popular salient object detection benchmarks.
arXiv Detail & Related papers (2020-03-12T07:00:46Z)
- Triple Wins: Boosting Accuracy, Robustness and Efficiency Together by Enabling Input-Adaptive Inference [119.19779637025444]
Deep networks have recently been suggested to face a trade-off between accuracy (on clean natural images) and robustness (on adversarially perturbed images).
This paper studies multi-exit networks with input-adaptive inference, showing their strong promise in achieving a "sweet point" that co-optimizes model accuracy, robustness, and efficiency (a minimal multi-exit sketch follows this list).
arXiv Detail & Related papers (2020-02-24T00:40:22Z)
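Two entries above, LAUDNet's dynamic skipping and the Triple Wins multi-exit study, share the input-adaptive inference idea that also underlies AntiDote's per-input pruning. As the sketch promised in the Triple Wins entry, below is a minimal multi-exit network in PyTorch: confident (easy) inputs leave at an intermediate classifier and skip the remaining computation. The two-stage backbone, the 0.9 confidence threshold, and the batch-size-1 simplification are illustrative assumptions, not details from either paper.

```python
import torch
import torch.nn as nn

class MultiExitNet(nn.Module):
    """Minimal multi-exit network: an intermediate classifier lets easy inputs
    exit early, spending the deeper stage's FLOPs only on hard inputs."""

    def __init__(self, num_classes=10, threshold=0.9):
        super().__init__()
        self.threshold = threshold
        self.stage1 = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
                                    nn.AdaptiveAvgPool2d(8))
        self.exit1 = nn.Linear(32 * 8 * 8, num_classes)
        self.stage2 = nn.Sequential(nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
                                    nn.AdaptiveAvgPool2d(4))
        self.exit2 = nn.Linear(64 * 4 * 4, num_classes)

    @torch.no_grad()
    def forward(self, x):                            # inference path, batch size 1
        h = self.stage1(x)
        logits = self.exit1(h.flatten(1))
        confidence = logits.softmax(dim=1).max().item()
        if confidence >= self.threshold:
            return logits                            # early exit: stage2 skipped
        return self.exit2(self.stage2(h).flatten(1))

# Usage: easy inputs return from exit1; hard ones pay for the full depth.
net = MultiExitNet().eval()
y = net(torch.randn(1, 3, 32, 32))
```

In training, both exits would be supervised jointly; the threshold then trades accuracy against average FLOPs at deployment time.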
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.