Parameter Efficient Deep Neural Networks with Bilinear Projections
- URL: http://arxiv.org/abs/2011.01391v1
- Date: Tue, 3 Nov 2020 00:17:24 GMT
- Title: Parameter Efficient Deep Neural Networks with Bilinear Projections
- Authors: Litao Yu, Yongsheng Gao, Jun Zhou, Jian Zhang
- Abstract summary: We address the parameter redundancy problem in deep neural networks (DNNs) by replacing conventional full projections with bilinear projections.
For a fully-connected layer with $D$ input nodes and $D$ output nodes, applying bilinear projection can reduce the model space complexity from $\mathcal{O}(D^2)$ to $\mathcal{O}(2D)$.
Experiments on four benchmark datasets show that applying the proposed bilinear projection to deep neural networks can achieve even higher accuracies than conventional full DNNs.
- Score: 16.628045837101237
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent research on deep neural networks (DNNs) has primarily focused on
improving the model accuracy. Given a proper deep learning framework, it is
generally possible to increase the depth or layer width to achieve a higher
level of accuracy. However, the huge number of model parameters imposes more
computational and memory overhead and leads to parameter redundancy.
In this paper, we address the parameter redundancy problem in DNNs by replacing
conventional full projections with bilinear projections. For a fully-connected
layer with $D$ input nodes and $D$ output nodes, applying bilinear projection
can reduce the model space complexity from $\mathcal{O}(D^2)$ to
$\mathcal{O}(2D)$, achieving a deep model with a sub-linear layer size.
However, a structured projection has fewer degrees of freedom than a full
projection, which can cause under-fitting. We therefore scale up the
mapping size by increasing the number of output channels, which preserves and
can even boost the model accuracy. This makes it very parameter-efficient and
handy to deploy such deep models on mobile systems with memory limitations.
Experiments on four benchmark datasets show that applying the proposed bilinear
projection to deep neural networks can achieve even higher accuracies than
conventional full DNNs, while significantly reducing the model size.
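To make the idea concrete, here is a minimal sketch of a bilinear-projection layer, written independently of the authors' code: the class name, parameter names, and the `out_channels` argument are hypothetical illustrations. It reshapes a $D$-dimensional input into a $d_1 \times d_2$ matrix and applies two small square weight matrices instead of one $D \times D$ matrix, so with $d_1 = d_2 = \sqrt{D}$ the layer holds roughly $2D$ parameters instead of $D^2$; the extra output channels mirror the channel scaling the abstract uses to compensate for the reduced degrees of freedom.

```python
# Minimal sketch of a bilinear-projection layer (PyTorch), assuming the
# D-dimensional input is reshaped to a d1 x d2 matrix with d1 * d2 = D.
# Illustration of the idea in the abstract, not the authors' implementation.
import torch
import torch.nn as nn


class BilinearProjection(nn.Module):
    def __init__(self, d1: int, d2: int, out_channels: int = 1):
        super().__init__()
        # Two small square matrices per output channel replace one D x D weight:
        # d1^2 + d2^2 parameters per channel vs. (d1 * d2)^2 for a full projection.
        self.U = nn.Parameter(torch.randn(out_channels, d1, d1) / d1 ** 0.5)
        self.V = nn.Parameter(torch.randn(out_channels, d2, d2) / d2 ** 0.5)
        self.d1, self.d2 = d1, d2

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, d1 * d2)  ->  X: (batch, 1, d1, d2)
        X = x.view(x.shape[0], 1, self.d1, self.d2)
        # Bilinear map per channel c: Y_c = U_c @ X @ V_c^T, shape (d1, d2).
        Y = self.U @ X @ self.V.transpose(-1, -2)
        # Flatten channel and spatial dimensions back into a feature vector.
        return Y.flatten(start_dim=1)


# Example: D = 1024 = 32 * 32. A full 1024 x 1024 projection has ~1.05M
# parameters; this bilinear layer with 4 output channels has
# 4 * (32^2 + 32^2) = 8192 parameters.
layer = BilinearProjection(d1=32, d2=32, out_channels=4)
out = layer(torch.randn(8, 1024))  # -> shape (8, 4096)
```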
Related papers
- EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for
Mobile Vision Applications [68.35683849098105]
We introduce split depth-wise transpose attention (SDTA) encoder that splits input tensors into multiple channel groups.
Our EdgeNeXt model with 1.3M parameters achieves 71.2% top-1 accuracy on ImageNet-1K.
Our EdgeNeXt model with 5.6M parameters achieves 79.4% top-1 accuracy on ImageNet-1K.
arXiv Detail & Related papers (2022-06-21T17:59:56Z) - Investigating the Relationship Between Dropout Regularization and Model
Complexity in Neural Networks [0.0]
Dropout Regularization serves to reduce variance in Deep Learning models.
We explore the relationship between the dropout rate and model complexity by training 2,000 neural networks.
We build neural networks that predict the optimal dropout rate given the number of hidden units in each dense layer.
arXiv Detail & Related papers (2021-08-14T23:49:33Z) - Effective Model Sparsification by Scheduled Grow-and-Prune Methods [73.03533268740605]
We propose a novel scheduled grow-and-prune (GaP) methodology without pre-training the dense models.
Experiments have shown that such models can match or beat the quality of highly optimized dense models at 80% sparsity on a variety of tasks.
arXiv Detail & Related papers (2021-06-18T01:03:13Z) - ANNETTE: Accurate Neural Network Execution Time Estimation with Stacked
Models [56.21470608621633]
We propose a time estimation framework to decouple the architectural search from the target hardware.
The proposed methodology extracts a set of models from micro-kernel and multi-layer benchmarks and generates a stacked model for mapping and network execution time estimation.
We compare estimation accuracy and fidelity of the generated mixed models, statistical models with the roofline model, and a refined roofline model for evaluation.
arXiv Detail & Related papers (2021-05-07T11:39:05Z) - HR-Depth: High Resolution Self-Supervised Monocular Depth Estimation [14.81943833870932]
We present an improved DepthNet, HR-Depth, with two effective strategies.
Using ResNet-18 as the encoder, HR-Depth surpasses all previous state-of-the-art (SoTA) methods with the fewest parameters at both high and low resolution.
arXiv Detail & Related papers (2020-12-14T09:15:15Z) - Filter Pre-Pruning for Improved Fine-tuning of Quantized Deep Neural
Networks [0.0]
We propose a new pruning method called Pruning for Quantization (PfQ) which removes the filters that disturb the fine-tuning of the DNN.
Experiments using well-known models and datasets confirmed that the proposed method achieves higher performance with a similar model size.
arXiv Detail & Related papers (2020-11-13T04:12:54Z) - NodeSig: Random Walk Diffusion meets Hashing
for Scalable Graph Embeddings [7.025709586759654]
NodeSig is a scalable embedding model that computes binary node representations.
NodeSig exploits random walk diffusion probabilities via stable random projection hashing.
arXiv Detail & Related papers (2020-10-01T09:07:37Z) - Local Grid Rendering Networks for 3D Object Detection in Point Clouds [98.02655863113154]
CNNs are powerful, but directly applying convolutions to point data after voxelizing the entire point cloud into a dense regular 3D grid would be computationally costly.
We propose a novel and principled Local Grid Rendering (LGR) operation to render the small neighborhood of a subset of input points into a low-resolution 3D grid independently.
We validate LGR-Net for 3D object detection on the challenging ScanNet and SUN RGB-D datasets.
arXiv Detail & Related papers (2020-07-04T13:57:43Z) - The Heterogeneity Hypothesis: Finding Layer-Wise Differentiated Network
Architectures [179.66117325866585]
We investigate a design space that is usually overlooked, i.e. adjusting the channel configurations of predefined networks.
We find that this adjustment can be achieved by shrinking widened baseline networks and leads to superior performance.
Experiments are conducted on various networks and datasets for image classification, visual tracking and image restoration.
arXiv Detail & Related papers (2020-06-29T17:59:26Z) - Widening and Squeezing: Towards Accurate and Efficient QNNs [125.172220129257]
Quantized neural networks (QNNs) are very attractive to the industry because of their extremely cheap computation and storage overhead, but their performance is still worse than that of networks with full-precision parameters.
Most existing methods aim to enhance the performance of QNNs, especially binary neural networks, by exploiting more effective training techniques.
We address this problem by projecting features in original full-precision networks to high-dimensional quantization features.
arXiv Detail & Related papers (2020-02-03T04:11:13Z)