Auto Graph Encoder-Decoder for Neural Network Pruning
- URL: http://arxiv.org/abs/2011.12641v3
- Date: Tue, 9 Nov 2021 16:39:02 GMT
- Title: Auto Graph Encoder-Decoder for Neural Network Pruning
- Authors: Sixing Yu, Arya Mazaheri, Ali Jannesari
- Abstract summary: We propose an automatic graph encoder-decoder model compression (AGMC) method that combines graph neural networks (GNNs) and reinforcement learning (RL).
Results show that our learning-based DNN embedding achieves better performance and a higher compression ratio with fewer search steps.
- Score: 0.8164433158925593
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Model compression aims to deploy deep neural networks (DNN) on mobile devices
with limited computing and storage resources. However, most of the existing
model compression methods rely on manually defined rules, which require domain
expertise. DNNs are essentially computational graphs, which contain rich
structural information. In this paper, we aim to find a suitable compression
policy from DNNs' structural information. We propose an automatic graph
encoder-decoder model compression (AGMC) method that combines graph neural
networks (GNNs) and reinforcement learning (RL). We model the target DNN as a
graph and use GNN to learn the DNN's embeddings automatically. We compared our
method with rule-based DNN embedding model compression methods to show the
effectiveness of our method. Results show that our learning-based DNN embedding
achieves better performance and a higher compression ratio with fewer search
steps. We evaluated our method on over-parameterized and mobile-friendly DNNs
and compared our method with handcrafted and learning-based model compression
approaches. On over-parameterized DNNs, such as ResNet-56, our method
outperformed handcrafted and learning-based methods with $4.36\%$ and $2.56\%$
higher accuracy, respectively. Furthermore, on MobileNet-v2, we achieved a
higher compression ratio than state-of-the-art methods with just $0.93\%$
accuracy loss.
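As a rough illustration of the approach described above (a minimal sketch, not the authors' implementation; all module and variable names are invented for the example), the pipeline can be pictured as a small message-passing encoder over the DNN's layer graph whose per-layer embeddings are decoded into pruning ratios for an RL agent to score:

```python
# Sketch of the AGMC idea (illustrative only, not the authors' code):
# encode a DNN's computational graph with a GNN, then decode per-layer
# pruning ratios that an RL agent would refine against a reward such as
# accuracy under a FLOPs constraint.
import torch
import torch.nn as nn

class GraphEncoder(nn.Module):
    """One round of mean-aggregation message passing over the layer graph."""
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.lin_self = nn.Linear(in_dim, hid_dim)
        self.lin_neigh = nn.Linear(in_dim, hid_dim)

    def forward(self, x, adj):
        # x:   (num_layers, in_dim)  per-layer features (e.g., type, #channels)
        # adj: (num_layers, num_layers) row-normalized adjacency of the graph
        return torch.relu(self.lin_self(x) + self.lin_neigh(adj @ x))

class RatioDecoder(nn.Module):
    """Decode each node embedding into a pruning ratio in (0, 1)."""
    def __init__(self, hid_dim):
        super().__init__()
        self.head = nn.Linear(hid_dim, 1)

    def forward(self, h):
        return torch.sigmoid(self.head(h)).squeeze(-1)

# Toy DNN graph: 4 layers in a chain, 8 hand-made features per layer.
x = torch.randn(4, 8)
adj = torch.tensor([[0, 1, 0, 0],
                    [1, 0, 1, 0],
                    [0, 1, 0, 1],
                    [0, 0, 1, 0]], dtype=torch.float)
adj = adj / adj.sum(dim=1, keepdim=True)          # row-normalize

encoder, decoder = GraphEncoder(8, 16), RatioDecoder(16)
ratios = decoder(encoder(x, adj))                  # one ratio per layer
print(ratios)  # an RL loop would treat these as actions and score them
```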
Related papers
- Dynamic Semantic Compression for CNN Inference in Multi-access Edge Computing: A Graph Reinforcement Learning-based Autoencoder [82.8833476520429]
We propose a novel semantic compression method, an autoencoder-based CNN architecture (AECNN), for effective semantic extraction and compression in partial offloading.
In the semantic encoder, we introduce a feature compression module based on the channel attention mechanism in CNNs, to compress intermediate data by selecting the most informative features.
In the semantic decoder, we design a lightweight decoder to reconstruct the intermediate data through learning from the received compressed data to improve accuracy.
arXiv Detail & Related papers (2024-01-19T15:19:47Z)
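A minimal sketch of the channel-attention feature compression idea in the entry above, assuming a squeeze-and-excitation-style gate and a top-k selection rule (both assumptions; the paper's exact module may differ):

```python
# Illustrative sketch of channel-attention feature compression (assumed
# design, not the AECNN authors' code): score channels with a squeeze-and-
# excitation-style gate and transmit only the k highest-scoring ones.
import torch
import torch.nn as nn

class ChannelSelector(nn.Module):
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, feat, k):
        # feat: (batch, channels, H, W) intermediate CNN activations
        scores = self.gate(feat.mean(dim=(2, 3)))      # squeeze to (B, C)
        idx = scores.mean(dim=0).topk(k).indices       # k most informative
        return feat[:, idx], idx                       # compressed data + keys

feat = torch.randn(2, 64, 8, 8)
compressed, kept = ChannelSelector(64)(feat, k=16)
print(compressed.shape)  # torch.Size([2, 16, 8, 8]) -> 4x channel reduction
```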
- Graph Neural Network for Accurate and Low-complexity SAR ATR [2.9766397696234996]
We propose a graph neural network (GNN) model to achieve accurate and low-latency SAR ATR.
The proposed GNN model has low computational complexity and achieves comparably high accuracy.
Compared with the state-of-the-art CNNs, the proposed GNN model has only 1/3000 computation cost and 1/80 model size.
arXiv Detail & Related papers (2023-05-11T20:17:41Z)
- Learning Graph Neural Networks using Exact Compression [2.213723689024101]
We study exact compression as a way to reduce the memory requirements of learning GNNs on large graphs.
In particular, we adopt a formal approach to compression and propose a methodology that transforms GNN learning problems into provably equivalent compressed GNN learning problems.
arXiv Detail & Related papers (2023-04-28T12:04:28Z)
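One plausible reading of "exact compression" in the entry above is color refinement: nodes that message passing cannot distinguish can be merged into a provably equivalent smaller graph. A minimal sketch under that assumption (names invented, not the paper's algorithm):

```python
# Minimal color-refinement sketch of the "exact compression" idea (an
# assumed reading of the abstract): nodes that end up with identical
# colors are indistinguishable to message passing and can be merged into
# one node of a smaller, equivalent graph.
def color_refine(adjacency, features, rounds=3):
    # adjacency: dict node -> list of neighbor nodes
    # features:  dict node -> hashable initial label
    color = dict(features)
    for _ in range(rounds):
        color = {
            v: hash((color[v], tuple(sorted(color[u] for u in adjacency[v]))))
            for v in adjacency
        }
    return color

adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}   # a 4-node path
features = {0: "a", 1: "a", 2: "a", 3: "a"}
colors = color_refine(adjacency, features)
groups = {}
for v, c in colors.items():
    groups.setdefault(c, []).append(v)
print(list(groups.values()))  # [[0, 3], [1, 2]] -> merge each group
```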
- Attention-based Feature Compression for CNN Inference Offloading in Edge Computing [93.67044879636093]
This paper studies the computational offloading of CNN inference in device-edge co-inference systems.
We propose a novel autoencoder-based CNN architecture (AECNN) for effective feature extraction on the end device.
Experiments show that AECNN can compress the intermediate data by more than 256x with only about 4% accuracy loss.
arXiv Detail & Related papers (2022-11-24T18:10:01Z)
- Comparison Analysis of Traditional Machine Learning and Deep Learning Techniques for Data and Image Classification [62.997667081978825]
The purpose of the study is to analyse and compare the most common machine learning and deep learning techniques used for computer vision 2D object classification tasks.
Firstly, we will present the theoretical background of the Bag of Visual Words model and Deep Convolutional Neural Networks (DCNN).
Secondly, we will implement a Bag of Visual Words model and the VGG16 CNN architecture.
arXiv Detail & Related papers (2022-04-11T11:34:43Z)
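A compact sketch of the Bag of Visual Words pipeline discussed in the entry above, with raw image patches standing in for real local descriptors such as SIFT (an assumption made to keep the example self-contained):

```python
# Sketch of a Bag of Visual Words pipeline (illustrative: raw patches stand
# in for real local descriptors): cluster local descriptors into a visual
# vocabulary, then describe each image as a histogram of visual-word
# occurrences for a classical classifier such as an SVM.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
images = rng.random((10, 32, 32))                  # toy grayscale images

def patches(img, size=8, stride=8):
    return np.stack([img[i:i+size, j:j+size].ravel()
                     for i in range(0, img.shape[0] - size + 1, stride)
                     for j in range(0, img.shape[1] - size + 1, stride)])

all_desc = np.concatenate([patches(im) for im in images])
vocab = KMeans(n_clusters=16, n_init=10, random_state=0).fit(all_desc)

def bovw_histogram(img):
    words = vocab.predict(patches(img))
    hist = np.bincount(words, minlength=16).astype(float)
    return hist / hist.sum()                       # normalized word counts

print(bovw_histogram(images[0]))  # the feature vector fed to a classifier
```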
- MGDCF: Distance Learning via Markov Graph Diffusion for Neural Collaborative Filtering [96.65234340724237]
We show the equivalence between some state-of-the-art GNN-based CF models and a traditional 1-layer NRL model based on context encoding.
We present Markov Graph Diffusion Collaborative Filtering (MGDCF) to generalize some state-of-the-art GNN-based CF models.
arXiv Detail & Related papers (2022-04-05T17:24:32Z)
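The Markov graph diffusion in the entry above can be sketched as a personalized-PageRank-style iteration; the exact MGDCF operator may differ, and the symbols below are illustrative:

```python
# Sketch of Markov graph diffusion (APPNP-style iteration; MGDCF's exact
# operator may differ): repeatedly mix each node's embedding with its
# neighbors' while teleporting back to the initial embedding x0.
import numpy as np

def markov_diffusion(adj, x0, alpha=0.1, steps=10):
    # adj: (n, n) adjacency; x0: (n, d) initial user/item embeddings
    deg = adj.sum(axis=1, keepdims=True)
    p = adj / np.maximum(deg, 1)                   # row-stochastic transition
    x = x0.copy()
    for _ in range(steps):
        x = alpha * x0 + (1 - alpha) * p @ x       # teleport + diffuse
    return x

adj = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], dtype=float)
x0 = np.eye(3)
print(markov_diffusion(adj, x0))  # smoothed embeddings for ranking
```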
- Training Graph Neural Networks with 1000 Layers [133.84813995275988]
We study reversible connections, group convolutions, weight tying, and equilibrium models to advance the memory and parameter efficiency of GNNs.
To the best of our knowledge, RevGNN-Deep is the deepest GNN in the literature by one order of magnitude.
arXiv Detail & Related papers (2021-06-14T15:03:00Z)
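Reversible connections, one of the techniques in the entry above, can be sketched as a RevNet-style block whose inputs are recomputed from its outputs, so activations need not be stored; F and G below are stand-ins for the paper's GNN message functions:

```python
# Sketch of a reversible residual block (RevNet-style, as used to make very
# deep GNNs memory-efficient; F and G stand in for GNN message functions).
# Inputs are recoverable from outputs, so activations need not be stored.
import torch
import torch.nn as nn

class ReversibleBlock(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.F = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.G = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x1, x2):
        y1 = x1 + self.F(x2)
        y2 = x2 + self.G(y1)
        return y1, y2

    def inverse(self, y1, y2):
        x2 = y2 - self.G(y1)                       # recompute, don't store
        x1 = y1 - self.F(x2)
        return x1, x2

block = ReversibleBlock(8)
x1, x2 = torch.randn(5, 8), torch.randn(5, 8)
y1, y2 = block(x1, x2)
r1, r2 = block.inverse(y1, y2)
print(torch.allclose(x1, r1, atol=1e-6), torch.allclose(x2, r2, atol=1e-6))
```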
- ANNETTE: Accurate Neural Network Execution Time Estimation with Stacked Models [56.21470608621633]
We propose a time estimation framework to decouple the architectural search from the target hardware.
The proposed methodology extracts a set of models from micro-kernel and multi-layer benchmarks and generates a stacked model for mapping and network execution time estimation.
We compare estimation accuracy and fidelity of the generated mixed models, statistical models with the roofline model, and a refined roofline model for evaluation.
arXiv Detail & Related papers (2021-05-07T11:39:05Z)
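The stacked-model idea in the entry above can be sketched as per-layer latency models, fitted from benchmarks, that are summed over the network; the coefficients below are made up for illustration:

```python
# Toy sketch of stacked execution-time estimation (made-up coefficients;
# ANNETTE fits its per-layer models from micro-kernel and multi-layer
# benchmarks on the target hardware).
def conv_time_ms(flops):               # per-layer model fitted from benchmarks
    return 0.02 + flops / 5e9          # fixed overhead + throughput term

def dense_time_ms(flops):
    return 0.01 + flops / 2e9

LAYER_MODELS = {"conv": conv_time_ms, "dense": dense_time_ms}

def estimate_network_ms(layers):
    # layers: list of (layer_type, flops); the stacked model maps each layer
    # to its benchmark-derived estimate and sums over the network.
    return sum(LAYER_MODELS[t](f) for t, f in layers)

network = [("conv", 1.2e9), ("conv", 0.8e9), ("dense", 0.1e9)]
print(f"{estimate_network_ms(network):.3f} ms")
```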
- GNN-RL Compression: Topology-Aware Network Pruning using Multi-stage Graph Embedding and Reinforcement Learning [1.426627267770156]
We propose a novel multi-stage graph embedding technique based on graph neural networks (GNNs) to identify the DNNs' topology.
We performed resource-constrained (i.e., FLOPs) channel pruning and compared our approach with state-of-the-art compression methods.
Our method outperformed state-of-the-art methods and achieved a higher accuracy by up to 1.84% for ShuffleNet-v1.
arXiv Detail & Related papers (2021-02-05T14:59:32Z)
- Utilizing Explainable AI for Quantization and Pruning of Deep Neural Networks [0.495186171543858]
Recent efforts to understand and explain AI (Artificial Intelligence) methods have led to a new research area, termed explainable AI.
In this paper, we utilize explainable AI methods, mainly the DeepLIFT method.
arXiv Detail & Related papers (2020-08-20T16:52:58Z)
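A sketch of attribution-guided pruning in the spirit of the entry above, with gradient-times-activation as a rough stand-in for DeepLIFT contribution scores (the paper's exact criterion may differ):

```python
# Sketch of attribution-guided channel pruning (gradient x activation is a
# rough stand-in for DeepLIFT contribution scores; not the paper's exact
# criterion): channels whose outputs contribute least are pruned first.
import torch
import torch.nn as nn

conv = nn.Conv2d(3, 8, 3, padding=1)
x = torch.randn(4, 3, 16, 16)
act = conv(x)
act.retain_grad()                      # keep gradients on the activation
loss = act.pow(2).mean()               # stand-in objective
loss.backward()

# Per-channel importance: |gradient * activation| averaged over batch/space.
importance = (act.grad * act).abs().mean(dim=(0, 2, 3)).detach()
prune = importance.argsort()[:4]       # 4 least-important channels
print(prune)                           # candidates to zero out / remove
```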
- Sparsity Meets Robustness: Channel Pruning for the Feynman-Kac Formalism Principled Robust Deep Neural Nets [13.102014808597264]
This paper focuses on a co-design of efficient compression algorithms and sparse neural architectures for robust and accurate deep learning.
We leverage the relaxed augmented Lagrangian based algorithms to prune the weights of adversarially trained DNNs.
Using robust and sparse DNNs principled by the Feynman-Kac formalism, we can at least double the channel sparsity of the adversarially trained ResNet20 for CIFAR10 classification.
arXiv Detail & Related papers (2020-03-02T02:18:43Z)
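A much-simplified sketch of sparsity-promoting channel pruning in the spirit of the entry above, using a single group soft-threshold step rather than the paper's relaxed augmented Lagrangian algorithm:

```python
# Simplified sketch of sparsity-promoting channel pruning (a group soft-
# threshold step, loosely standing in for relaxed augmented Lagrangian
# updates): shrink whole output channels toward zero by their norm.
import torch
import torch.nn as nn

def group_soft_threshold(weight, lam):
    # weight: (out_channels, ...) conv kernel; shrink each channel's norm.
    flat = weight.flatten(1)
    norms = flat.norm(dim=1, keepdim=True)
    scale = torch.clamp(1 - lam / (norms + 1e-12), min=0.0)
    return (flat * scale).view_as(weight)

conv = nn.Conv2d(16, 32, 3)
with torch.no_grad():
    conv.weight.copy_(group_soft_threshold(conv.weight, lam=0.5))
zeroed = (conv.weight.flatten(1).norm(dim=1) == 0).sum().item()
print(f"{zeroed}/32 channels exactly zero")  # sparsity grows with lam
```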