AntiDote: Attention-based Dynamic Optimization for Neural Network
Runtime Efficiency
- URL: http://arxiv.org/abs/2008.06543v1
- Date: Fri, 14 Aug 2020 18:48:13 GMT
- Title: AntiDote: Attention-based Dynamic Optimization for Neural Network
Runtime Efficiency
- Authors: Fuxun Yu, Chenchen Liu, Di Wang, Yanzhi Wang, Xiang Chen
- Abstract summary: We propose a dynamic CNN optimization framework based on the neural network attention mechanism.
Our method could bring 37.4% to 54.5% FLOPs reduction with negligible accuracy drop on various test networks.
- Score: 42.00372941618975
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Convolutional Neural Networks (CNNs) achieve great cognitive performance at
the expense of a considerable computation load. To relieve this load, many
optimization works have been developed to reduce model redundancy by
identifying and removing insignificant model components, through techniques
such as weight sparsification and filter pruning. However, these works evaluate
only the static significance of model components, using internal parameter
information and ignoring their dynamic interaction with external inputs. With
per-input feature activation, the significance of model components can change
dynamically, so static methods can achieve only sub-optimal results. Therefore,
in this work we propose a comprehensive dynamic CNN optimization framework
based on the neural network attention mechanism, comprising (1) testing-phase
channel and column feature map pruning, and (2) training-phase optimization by
targeted dropout. Such a dynamic optimization framework has several benefits:
(1) it can accurately identify and aggressively remove per-input feature
redundancy by taking the model-input interaction into account; (2) it can
maximally remove feature map redundancy across various dimensions thanks to
its multi-dimensional flexibility; (3) the training-testing co-optimization
favors dynamic pruning and helps maintain model accuracy even at very high
feature pruning ratios. Extensive experiments show that our method can bring
37.4% to 54.5% FLOPs reduction with a negligible accuracy drop on various
test networks.
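The testing-phase channel pruning described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the per-channel mean absolute activation below stands in for the paper's learned attention mechanism, and `dynamic_channel_prune` is a hypothetical name. The key property it demonstrates is that the pruning mask is recomputed per input rather than fixed at training time.

```python
def dynamic_channel_prune(feature_map, keep_ratio=0.5):
    """Zero out the least-active channels of one input's feature map.

    feature_map: list of C channels, each a 2-D list (H x W) of activations.
    keep_ratio:  fraction of channels to keep for this particular input.

    The per-channel "attention" score here is the mean absolute activation,
    a simple stand-in for a learned attention module: channels that respond
    weakly to THIS input are pruned, so the mask changes from input to input.
    """
    scores = [
        sum(abs(v) for row in ch for v in row) / sum(len(row) for row in ch)
        for ch in feature_map
    ]
    k = max(1, round(len(feature_map) * keep_ratio))
    keep = set(sorted(range(len(scores)), key=scores.__getitem__)[-k:])
    mask = [1.0 if i in keep else 0.0 for i in range(len(feature_map))]
    pruned = [
        [[v * mask[i] for v in row] for row in ch]
        for i, ch in enumerate(feature_map)
    ]
    return pruned, mask

# Example: an 8-channel 4x4 feature map; half the channels survive.
import random
random.seed(0)
fmap = [[[random.gauss(0, 1) for _ in range(4)] for _ in range(4)]
        for _ in range(8)]
pruned, mask = dynamic_channel_prune(fmap, keep_ratio=0.5)
```

A different input would generally produce a different mask, which is exactly the per-input redundancy that static filter pruning cannot exploit.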
Related papers
- DyCE: Dynamic Configurable Exiting for Deep Learning Compression and
Scaling [1.9686770963118378]
DyCE is a dynamic early-exit framework that decouples design considerations from each other and from the base model.
It significantly reduces computational complexity, by 23.5% for ResNet152 and 25.9% for ConvNextv2-tiny on ImageNet, with accuracy reductions of less than 0.5%.
arXiv Detail & Related papers (2024-03-04T03:09:28Z) - Model-Based Control with Sparse Neural Dynamics [23.961218902837807]
We propose a new framework for integrated model learning and predictive control.
We show that our framework can deliver better closed-loop performance than existing state-of-the-art methods.
arXiv Detail & Related papers (2023-12-20T06:25:02Z) - Dynamically configured physics-informed neural network in topology
optimization applications [4.403140515138818]
The physics-informed neural network (PINN) can avoid generating enormous amounts of data when solving forward problems.
A dynamically configured PINN-based topology optimization (DCPINN-TO) method is proposed.
The accuracy of the displacement prediction and optimization results indicate that the DCPINN-TO method is effective and efficient.
arXiv Detail & Related papers (2023-12-12T05:35:30Z) - Explicit Foundation Model Optimization with Self-Attentive Feed-Forward
Neural Units [4.807347156077897]
Iterative approximation methods using backpropagation enable the optimization of neural networks, but they remain computationally expensive when used at scale.
This paper presents an efficient alternative for optimizing neural networks that reduces the costs of scaling neural networks and provides high-efficiency optimizations for low-resource applications.
arXiv Detail & Related papers (2023-11-13T17:55:07Z) - Latency-aware Unified Dynamic Networks for Efficient Image Recognition [72.8951331472913]
LAUDNet is a framework to bridge the theoretical and practical efficiency gap in dynamic networks.
It integrates three primary dynamic paradigms: spatially adaptive computation, dynamic layer skipping, and dynamic channel skipping.
It can notably reduce the latency of models like ResNet by over 50% on platforms such as V100, 3090, and TX2 GPUs.
arXiv Detail & Related papers (2023-08-30T10:57:41Z) - Rate Distortion Characteristic Modeling for Neural Image Compression [59.25700168404325]
End-to-end optimization gives neural image compression (NIC) its superior lossy compression performance.
However, distinct models must be trained to reach different points in the rate-distortion (R-D) space.
We make efforts to formulate the essential mathematical functions to describe the R-D behavior of NIC using deep network and statistical modeling.
arXiv Detail & Related papers (2021-06-24T12:23:05Z) - Dynamic Slimmable Network [105.74546828182834]
We develop a dynamic network slimming regime named Dynamic Slimmable Network (DS-Net).
Our DS-Net is empowered with the ability of dynamic inference by the proposed double-headed dynamic gate.
It consistently outperforms its static counterparts as well as state-of-the-art static and dynamic model compression methods.
arXiv Detail & Related papers (2021-03-24T15:25:20Z) - CNNPruner: Pruning Convolutional Neural Networks with Visual Analytics [13.38218193857018]
Convolutional neural networks (CNNs) have demonstrated extraordinarily good performance in many computer vision tasks.
CNNPruner allows users to interactively create pruning plans according to a desired goal on model size or accuracy.
arXiv Detail & Related papers (2020-09-08T02:08:20Z) - Highly Efficient Salient Object Detection with 100K Parameters [137.74898755102387]
We propose a flexible convolutional module, namely generalized OctConv (gOctConv), to efficiently utilize both in-stage and cross-stages multi-scale features.
We build an extremely lightweight model, namely CSNet, which achieves comparable performance with only about 0.2% (100K) of the parameters of large models on popular salient object detection benchmarks.
arXiv Detail & Related papers (2020-03-12T07:00:46Z) - Triple Wins: Boosting Accuracy, Robustness and Efficiency Together by
Enabling Input-Adaptive Inference [119.19779637025444]
Deep networks have recently been suggested to face a trade-off between accuracy (on clean natural images) and robustness (on adversarially perturbed images).
This paper studies multi-exit networks with input-adaptive inference, showing their strong promise in achieving a "sweet point" in co-optimizing model accuracy, robustness, and efficiency.
arXiv Detail & Related papers (2020-02-24T00:40:22Z)
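Several of the related papers above (DyCE, Triple Wins) rely on input-adaptive early exiting. The control flow can be sketched in a few lines of pure Python; the stage and head callables below are hypothetical stand-ins for real network blocks, and the max-softmax confidence test is one common exit criterion, not necessarily the one those papers use.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def multi_exit_infer(x, stages, exit_heads, threshold=0.9):
    """Run stages in order; return from the first exit head whose
    confidence (max softmax probability) clears the threshold.

    stages:     list of callables, each mapping features -> features
    exit_heads: list of callables (one per stage), features -> logits
    Returns (probs, exit_index). Easy inputs leave at an early exit and
    skip the remaining computation; hard inputs pay for the full network.
    """
    h = x
    for i, (stage, head) in enumerate(zip(stages, exit_heads)):
        h = stage(h)
        probs = softmax(head(h))
        if max(probs) >= threshold or i == len(stages) - 1:
            return probs, i

# Toy model: two identity stages with hand-set logits at each exit.
stages = [lambda h: h, lambda h: h]
confident_heads = [lambda h: [8.0, 0.0], lambda h: [0.0, 8.0]]
unsure_heads = [lambda h: [0.0, 0.0], lambda h: [0.0, 8.0]]

_, early_exit = multi_exit_infer([1.0], stages, confident_heads)
_, late_exit = multi_exit_infer([1.0], stages, unsure_heads)
```

With the confident first head, inference stops at exit 0; with the unsure head (uniform logits, confidence 0.5), it falls through to the final exit. The average cost therefore depends on the input distribution, which is the efficiency lever these frameworks tune.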
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.