AutoLR: Layer-wise Pruning and Auto-tuning of Learning Rates in
Fine-tuning of Deep Networks
- URL: http://arxiv.org/abs/2002.06048v3
- Date: Mon, 4 Jan 2021 01:41:13 GMT
- Title: AutoLR: Layer-wise Pruning and Auto-tuning of Learning Rates in
Fine-tuning of Deep Networks
- Authors: Youngmin Ro, Jin Young Choi
- Abstract summary: Existing fine-tuning methods use a single learning rate over all layers.
We propose an algorithm that improves fine-tuning performance and reduces network complexity.
- Score: 13.761920032156082
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing fine-tuning methods use a single learning rate over all layers. In
this paper, we first discuss how the trends of layer-wise weight variation under
fine-tuning with a single learning rate do not match the well-known notion
that lower-level layers extract general features and higher-level layers
extract specific features. Based on this discussion, we propose an algorithm
that improves fine-tuning performance and reduces network complexity through
layer-wise pruning and auto-tuning of layer-wise learning rates. We verify the
effectiveness of the proposed algorithm by achieving state-of-the-art
performance on image retrieval benchmark datasets (CUB-200, Cars-196,
Stanford Online Products, and In-Shop). Code is available at
https://github.com/youngminPIL/AutoLR.
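The idea of comparing layer-wise weight variations and then adjusting layer-wise learning rates can be illustrated with a small sketch. This is not the authors' algorithm (see the repository above for that); it is a minimal, dependency-free illustration under stated assumptions. The function names `layer_variation`, `is_non_decreasing`, and `rescale_learning_rates`, the toy weights, the target variations, and the proportional rescaling rule are all hypothetical and chosen only to make the idea concrete. Pruning is not covered.

```python
import math


def layer_variation(before, after):
    """L2 norm of the weight change for one layer (weights given as flat lists)."""
    return math.sqrt(sum((a - b) ** 2 for b, a in zip(before, after)))


def is_non_decreasing(values):
    """True if each value is at least as large as the one before it."""
    return all(x <= y for x, y in zip(values, values[1:]))


def rescale_learning_rates(lrs, observed, targets, eps=1e-12):
    """Hypothetical heuristic: scale each layer's rate by target / observed variation."""
    return [lr * t / max(o, eps) for lr, o, t in zip(lrs, observed, targets)]


# Toy three-layer "network": weights before and after fine-tuning with one shared rate.
pretrained = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
finetuned = [[0.3, 0.4], [1.0, 1.1], [2.0, 2.2]]

variations = [layer_variation(b, a) for b, a in zip(pretrained, finetuned)]

# Here the lowest layer changed the most, which conflicts with the notion that
# lower layers hold general features and should change less than higher layers.
matches_notion = is_non_decreasing(variations)

# Assumed targets that grow with depth; rates are nudged toward them.
targets = [0.1, 0.2, 0.4]
new_lrs = rescale_learning_rates([0.01, 0.01, 0.01], variations, targets)
```

In this toy case the first layer's rate shrinks and the rates of the upper layers grow, which pushes the next round of weight variations toward an order that increases with depth.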
Related papers
- The Unreasonable Ineffectiveness of the Deeper Layers [5.984361440126354]
We study a simple layer-pruning strategy for popular families of open-weight pretrained LLMs.
We find minimal degradation of performance until after a large fraction of the layers are removed.
From a scientific perspective, the robustness of these LLMs to the deletion of layers implies either that current pretraining methods are not properly leveraging the parameters in the deeper layers of the network or that the shallow layers play a critical role in storing knowledge.
arXiv Detail & Related papers (2024-03-26T17:20:04Z)
- RankDNN: Learning to Rank for Few-shot Learning [70.49494297554537]
This paper introduces a new few-shot learning pipeline that casts relevance ranking for image retrieval as binary ranking relation classification.
It provides a new perspective on few-shot learning and is complementary to state-of-the-art methods.
arXiv Detail & Related papers (2022-11-28T13:59:31Z)
- Pushing the Efficiency Limit Using Structured Sparse Convolutions [82.31130122200578]
We propose Structured Sparse Convolution (SSC), which leverages the inherent structure in images to reduce the parameters in the convolutional filter.
We show that SSC is a generalization of commonly used layers (depthwise, groupwise and pointwise convolution) in efficient architectures.
Architectures based on SSC achieve state-of-the-art performance compared to baselines on CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet classification benchmarks.
arXiv Detail & Related papers (2022-10-23T18:37:22Z)
- Pruning-as-Search: Efficient Neural Architecture Search via Channel Pruning and Structural Reparameterization [50.50023451369742]
Pruning-as-Search (PaS) is an end-to-end channel pruning method to search out desired sub-network automatically and efficiently.
Our proposed architecture outperforms prior art by around 1.0% top-1 accuracy on the ImageNet-1000 classification task.
arXiv Detail & Related papers (2022-06-02T17:58:54Z)
- Exploiting Explainable Metrics for Augmented SGD [43.00691899858408]
There are several unanswered questions about how learning under optimization really works and why certain strategies are better than others.
We propose new explainability metrics that measure the redundant information in a network's layers.
We then exploit these metrics to augment Stochastic Gradient Descent (SGD) by adaptively adjusting the learning rate in each layer to improve generalization performance.
arXiv Detail & Related papers (2022-03-31T00:16:44Z)
- Train your classifier first: Cascade Neural Networks Training from upper layers to lower layers [54.47911829539919]
We develop a novel top-down training method which can be viewed as an algorithm for searching for high-quality classifiers.
We tested this method on automatic speech recognition (ASR) tasks and language modelling tasks.
The proposed method consistently improves recurrent neural network ASR models on Wall Street Journal, self-attention ASR models on Switchboard, and AWD-LSTM language models on WikiText-2.
arXiv Detail & Related papers (2021-02-09T08:19:49Z)
- Layer-adaptive sparsity for the Magnitude-based Pruning [88.37510230946478]
We propose a novel importance score for global pruning, coined layer-adaptive magnitude-based pruning (LAMP) score.
LAMP consistently outperforms popular existing schemes for layerwise sparsity selection.
arXiv Detail & Related papers (2020-10-15T09:14:02Z)
- Sparse Coding Driven Deep Decision Tree Ensembles for Nuclear Segmentation in Digital Pathology Images [15.236873250912062]
We propose an easily trained yet powerful representation learning approach with performance highly competitive to deep neural networks in a digital pathology image segmentation task.
The method, sparse coding driven deep decision tree ensembles (ScD2TE), provides a new perspective on representation learning.
arXiv Detail & Related papers (2020-08-13T02:59:31Z)
- Online Sequential Extreme Learning Machines: Features Combined From Hundreds of Midlayers [0.0]
In this paper, we develop an algorithm called the hierarchical online sequential learning algorithm (H-OS-ELM).
The algorithm can learn chunk by chunk with fixed or varying block size.
arXiv Detail & Related papers (2020-06-12T00:50:04Z)
- DHP: Differentiable Meta Pruning via HyperNetworks [158.69345612783198]
This paper introduces a differentiable pruning method via hypernetworks for automatic network pruning.
Latent vectors control the output channels of the convolutional layers in the backbone network and act as a handle for the pruning of the layers.
Experiments are conducted on various networks for image classification, single image super-resolution, and denoising.
arXiv Detail & Related papers (2020-03-30T17:59:18Z)