STN: Scalable Tensorizing Networks via Structure-Aware Training and
Adaptive Compression
- URL: http://arxiv.org/abs/2205.15198v1
- Date: Mon, 30 May 2022 15:50:48 GMT
- Title: STN: Scalable Tensorizing Networks via Structure-Aware Training and
Adaptive Compression
- Authors: Chang Nie, Huan Wang, Lu Zhao
- Abstract summary: We propose Scalable Tensorizing Networks (STN), which adaptively adjust the model size and decomposition structure without retraining.
STN is compatible with arbitrary network architectures and achieves higher compression performance and flexibility than other tensorized variants.
- Score: 10.067082377396586
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks (DNNs) have delivered a remarkable performance in many
tasks of computer vision. However, over-parameterized representations of
popular architectures dramatically increase their computational complexity and
storage costs, and hinder their availability in edge devices with constrained
resources. Although many tensor decomposition (TD) methods have been
well studied for compressing DNNs into compact representations, they suffer
from non-negligible performance degradation in practice. In this paper, we
propose Scalable Tensorizing Networks (STN), which dynamically and adaptively
adjust the model size and decomposition structure without retraining. First, we
account for compression during training by adding a low-rank regularizer to
guarantee networks' desired low-rank characteristics in full tensor format.
Then, considering network layers exhibit various low-rank structures, STN is
obtained by a data-driven adaptive TD approach, for which the topological
structure of decomposition per layer is learned from the pre-trained model, and
the ranks are selected appropriately under specified storage constraints. As a
result, STN is compatible with arbitrary network architectures and achieves
higher compression performance and flexibility than other tensorized variants.
Comprehensive experiments on several popular architectures and benchmarks
substantiate the superiority of our model in improving parameter
efficiency.
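To make the recipe concrete, the following is a minimal PyTorch sketch of the two ingredients described above: a training-time low-rank regularizer and a rank-selection rule for the subsequent truncation. The nuclear-norm surrogate, the mode-1 unfolding, and the energy-based rank rule are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

def low_rank_penalty(model: nn.Module, strength: float = 1e-4):
    """Nuclear-norm surrogate encouraging low-rank weights during training.

    Illustrative stand-in for STN's low-rank regularizer: each conv/linear
    weight is matricized (mode-1 unfolding) and its nuclear norm is summed.
    """
    penalty = 0.0
    for m in model.modules():
        if isinstance(m, (nn.Linear, nn.Conv2d)):
            mat = m.weight.reshape(m.weight.shape[0], -1)
            penalty = penalty + torch.linalg.matrix_norm(mat, ord="nuc")
    return strength * penalty

def select_rank(singular_values: torch.Tensor, energy: float = 0.95) -> int:
    """Smallest rank retaining `energy` of the spectral mass (assumed rule);
    in STN the ranks are chosen jointly under a storage budget."""
    cum = (singular_values.cumsum(0) / singular_values.sum()).tolist()
    return next(i + 1 for i, c in enumerate(cum) if c >= energy)

# Training step: task loss plus the low-rank penalty, e.g.
#   loss = criterion(model(x), y) + low_rank_penalty(model)
```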
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses the resource constraints of such systems by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- Towards Efficient Deep Spiking Neural Networks Construction with Spiking Activity based Pruning [17.454100169491497]
We propose a structured pruning approach based on the activity levels of convolutional kernels, termed the Spiking Channel Activity-based (SCA) network pruning framework.
Inspired by synaptic plasticity mechanisms, our method dynamically adjusts the network's structure by pruning and regenerating convolutional kernels during training, enhancing the model's adaptation to the current target task.
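As a rough illustration of activity-based channel scoring (the paper's spiking-activity criterion and its kernel regeneration step are more involved), output channels of a convolution could be ranked by their mean activation magnitude:

```python
import torch
import torch.nn as nn

@torch.no_grad()
def channel_activity(model: nn.Module, layer: nn.Conv2d, loader):
    """Mean |activation| per output channel of `layer` over a loader.

    Hypothetical proxy for the paper's spiking-activity score; the
    lowest-scoring channels would be pruning candidates.
    """
    captured = {}
    handle = layer.register_forward_hook(
        lambda mod, inp, out: captured.update(out=out.detach()))
    total, batches = 0.0, 0
    for x, _ in loader:
        model(x)
        total = total + captured["out"].abs().mean(dim=(0, 2, 3))
        batches += 1
    handle.remove()
    return total / batches
```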
arXiv Detail & Related papers (2024-06-03T07:44:37Z)
- Structure-Preserving Network Compression Via Low-Rank Induced Training Through Linear Layers Composition [11.399520888150468]
We present a theoretically-justified technique termed Low-Rank Induced Training (LoRITa)
LoRITa promotes low-rankness through the composition of linear layers and compresses by using singular value truncation.
We demonstrate the effectiveness of our approach using MNIST on Fully Connected Networks, CIFAR10 on Vision Transformers, and CIFAR10/100 and ImageNet on Convolutional Neural Networks.
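A minimal sketch of the idea (the class name and two-factor split are assumptions; LoRITa composes general linear layers): train an overparameterized composition with no nonlinearity in between, then collapse it and truncate with an SVD.

```python
import torch
import torch.nn as nn

class ComposedLinear(nn.Module):
    """One linear layer overparameterized as a composition of two; the
    composition implicitly biases the effective weight toward low rank."""

    def __init__(self, d_in: int, d_out: int):
        super().__init__()
        self.a = nn.Linear(d_in, d_out, bias=False)
        self.b = nn.Linear(d_out, d_out, bias=False)

    def forward(self, x):
        return self.b(self.a(x))

    def compress(self, rank: int) -> nn.Sequential:
        """Collapse to two thin layers via singular value truncation."""
        w = self.b.weight @ self.a.weight                 # effective weight
        u, s, vh = torch.linalg.svd(w, full_matrices=False)
        first = nn.Linear(self.a.in_features, rank, bias=False)
        second = nn.Linear(rank, self.b.out_features, bias=False)
        first.weight.data.copy_(torch.diag(s[:rank].sqrt()) @ vh[:rank])
        second.weight.data.copy_(u[:, :rank] @ torch.diag(s[:rank].sqrt()))
        return nn.Sequential(first, second)
```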
arXiv Detail & Related papers (2024-05-06T00:58:23Z)
- Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution [91.3781512926942]
Image super-resolution (SR) has witnessed extensive neural network designs from CNN to transformer architectures.
This work investigates the potential of network pruning for super-resolution to take advantage of off-the-shelf network designs and reduce the underlying computational overhead.
We propose a novel Iterative Soft Shrinkage-Percentage (ISS-P) method that optimizes the sparse structure of a randomly initialized network at each iteration and tweaks unimportant weights on-the-fly by a small amount proportional to their magnitude.
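The shrinkage step might look like the sketch below, where the percentage and shrink factor are placeholders rather than the paper's schedule:

```python
import torch

@torch.no_grad()
def iss_shrink_(weight: torch.Tensor, prune_ratio: float = 0.3,
                shrink: float = 0.02) -> None:
    """One iterative soft-shrinkage step, in place (illustrative).

    Instead of hard-zeroing the smallest `prune_ratio` fraction of weights,
    shrink them proportionally to their magnitude so that they can recover
    in later iterations.
    """
    k = max(1, int(prune_ratio * weight.numel()))
    threshold = weight.abs().flatten().kthvalue(k).values
    unimportant = weight.abs() <= threshold
    weight[unimportant] *= 1.0 - shrink
```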
arXiv Detail & Related papers (2023-03-16T21:06:13Z)
- STD-NET: Search of Image Steganalytic Deep-learning Architecture via Hierarchical Tensor Decomposition [40.997546601209145]
STD-NET is an unsupervised deep-learning architecture search approach via hierarchical tensor decomposition for image steganalysis.
Our proposed strategy is more efficient and can remove more redundancy compared with previous steganalytic network compression methods.
arXiv Detail & Related papers (2022-06-12T03:46:08Z)
- Efficient Micro-Structured Weight Unification and Pruning for Neural Network Compression [56.83861738731913]
Deep Neural Network (DNN) models are essential for practical applications, especially for resource limited devices.
Previous unstructured or structured weight pruning methods rarely translate into real inference acceleration.
We propose a generalized weight unification framework at a hardware-compatible micro-structured level to achieve a high degree of compression and acceleration.
arXiv Detail & Related papers (2021-06-15T17:22:59Z)
- A Fully Tensorized Recurrent Neural Network [48.50376453324581]
We introduce a "fully tensorized" RNN architecture which jointly encodes the separate weight matrices within each recurrent cell.
This approach reduces model size by several orders of magnitude, while still maintaining similar or better performance compared to standard RNNs.
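For intuition, the tensor-train (TT) format that such tensorized RNNs typically build on can be computed by sequential truncated SVDs. The NumPy sketch below shows plain TT-SVD of a reshaped weight tensor; the paper's joint encoding across a cell's weight matrices adds structure beyond this.

```python
import numpy as np

def tt_svd(tensor: np.ndarray, max_rank: int) -> list:
    """Tensor-train decomposition by sequential truncated SVDs (TT-SVD).

    Returns 3-way cores G_k of shape (r_{k-1}, n_k, r_k). A weight matrix
    would first be reshaped into a higher-order tensor. Illustrative only.
    """
    dims = tensor.shape
    cores, r_prev, mat = [], 1, np.asarray(tensor)
    for n in dims[:-1]:
        mat = mat.reshape(r_prev * n, -1)
        u, s, vh = np.linalg.svd(mat, full_matrices=False)
        r = min(max_rank, s.size)
        cores.append(u[:, :r].reshape(r_prev, n, r))
        mat = np.diag(s[:r]) @ vh[:r]
        r_prev = r
    cores.append(mat.reshape(r_prev, dims[-1], 1))
    return cores

# e.g. a 256x1024 weight viewed as a 4x8x8x8x8x16 tensor:
#   cores = tt_svd(w.reshape(4, 8, 8, 8, 8, 16), max_rank=16)
```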
arXiv Detail & Related papers (2020-10-08T18:24:12Z)
- The Heterogeneity Hypothesis: Finding Layer-Wise Differentiated Network Architectures [179.66117325866585]
We investigate a design space that is usually overlooked, i.e. adjusting the channel configurations of predefined networks.
We find that this adjustment can be achieved by shrinking widened baseline networks and leads to superior performance.
Experiments are conducted on various networks and datasets for image classification, visual tracking and image restoration.
arXiv Detail & Related papers (2020-06-29T17:59:26Z)
- Structured Sparsification with Joint Optimization of Group Convolution and Channel Shuffle [117.95823660228537]
We propose a novel structured sparsification method for efficient network compression.
The proposed method automatically induces structured sparsity on the convolutional weights.
We also address the problem of inter-group communication with a learnable channel shuffle mechanism.
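For reference, the fixed (non-learnable) channel shuffle that the paper's learnable mechanism generalizes can be written in a few lines:

```python
import torch

def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    """Fixed channel shuffle (as in ShuffleNet): interleave channels across
    groups so that group convolutions can exchange information. The paper
    learns this permutation rather than fixing it."""
    b, c, h, w = x.shape
    return (x.reshape(b, groups, c // groups, h, w)
             .transpose(1, 2)
             .reshape(b, c, h, w))
```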
arXiv Detail & Related papers (2020-02-19T12:03:10Z)