Hyperparameter Tuning is All You Need for LISTA
- URL: http://arxiv.org/abs/2110.15900v1
- Date: Fri, 29 Oct 2021 16:35:38 GMT
- Title: Hyperparameter Tuning is All You Need for LISTA
- Authors: Xiaohan Chen, Jialin Liu, Zhangyang Wang, Wotao Yin
- Abstract summary: Learned Iterative Shrinkage-Thresholding Algorithm (LISTA) introduces the concept of unrolling an iterative algorithm and training it like a neural network.
We show that adding momentum to intermediate variables in the LISTA network achieves a better convergence rate.
We call this new ultra-lightweight network HyperLISTA.
- Score: 92.7008234085887
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learned Iterative Shrinkage-Thresholding Algorithm (LISTA) introduces the
concept of unrolling an iterative algorithm and training it like a neural
network. It has had great success on sparse recovery. In this paper, we show
that adding momentum to intermediate variables in the LISTA network achieves a
better convergence rate and, in particular, the network with instance-optimal
parameters is superlinearly convergent. Moreover, our new theoretical results
lead to a practical approach of automatically and adaptively calculating the
parameters of a LISTA network layer based on its previous layers. Perhaps most
surprisingly, such an adaptive-parameter procedure reduces the training of
LISTA to tuning only three hyperparameters from data: a new record set in the
context of the recent advances in trimming down LISTA complexity. We call this
new ultra-lightweight network HyperLISTA. Compared to state-of-the-art LISTA
models, HyperLISTA achieves almost the same performance on seen data
distributions and performs better when tested on unseen distributions
(specifically, those with different sparsity levels and nonzero magnitudes).
Code is available: https://github.com/VITA-Group/HyperLISTA.
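To make the idea concrete, here is a minimal NumPy sketch of the classical ISTA iteration for sparse recovery with a Nesterov-style momentum term added to the intermediate variable, as the abstract describes. This is not the HyperLISTA implementation: the step size, threshold, and momentum weight are hand-set constants, and the names `soft_threshold` and `momentum_ista` are illustrative placeholders, whereas the paper computes its per-layer parameters adaptively and tunes only three hyperparameters from data.
```python
# Minimal sketch (not the paper's code): ISTA for sparse recovery with a
# Nesterov-style momentum term on the intermediate variable. Step size,
# threshold, and momentum weight are hand-picked here, not the adaptively
# computed per-layer parameters of HyperLISTA.
import numpy as np

def soft_threshold(v, theta):
    # Proximal operator of theta * ||.||_1, applied elementwise.
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def momentum_ista(A, b, num_layers=16, lam=0.1, beta=0.3):
    # Unrolled iteration: each loop pass plays the role of one network layer.
    n = A.shape[1]
    L = np.linalg.norm(A, 2) ** 2       # Lipschitz constant of the gradient
    mu, theta = 1.0 / L, lam / L        # step size and threshold
    x_prev = x = np.zeros(n)
    for _ in range(num_layers):
        z = x + beta * (x - x_prev)     # momentum on the intermediate variable
        x_prev = x
        x = soft_threshold(z - mu * A.T @ (A @ z - b), theta)
    return x

# Toy usage: recover an 8-sparse vector from 64 noisy linear measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((64, 128)) / np.sqrt(64)
x_true = np.zeros(128)
x_true[rng.choice(128, size=8, replace=False)] = rng.standard_normal(8)
b = A @ x_true + 0.01 * rng.standard_normal(64)
x_hat = momentum_ista(A, b, num_layers=200)
print("relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```
With `beta = 0` this reduces to plain ISTA; LISTA replaces the fixed step size and threshold (and the matrices built from `A`) with learned per-layer values, and HyperLISTA in turn reduces those to three data-tuned hyperparameters.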
Related papers
- Just How Flexible are Neural Networks in Practice? [89.80474583606242]
It is widely believed that a neural network can fit a training set containing at least as many samples as it has parameters.
In practice, however, we only find the solutions reachable through our training procedure, including the gradient-based optimizer and regularizers, which limits flexibility.
arXiv Detail & Related papers (2024-06-17T12:24:45Z)
- Low-Rank Representations Meets Deep Unfolding: A Generalized and Interpretable Network for Hyperspectral Anomaly Detection [41.50904949744355]
Current hyperspectral anomaly detection (HAD) benchmark datasets suffer from low resolution, simple backgrounds, and small detection-data size.
These factors also limit the robustness of the well-known low-rank representation (LRR) models.
We build a new set of HAD benchmark datasets, AIR-HAD for short, to improve the robustness of HAD algorithms in complex scenarios.
arXiv Detail & Related papers (2024-02-23T14:15:58Z)
- Parameter-efficient Tuning of Large-scale Multimodal Foundation Model [68.24510810095802]
We propose a graceful prompt framework for cross-modal transfer (Aurora) to overcome these challenges.
Considering the redundancy in existing architectures, we first use mode approximation to generate 0.1M trainable parameters for multimodal prompt tuning.
A thorough evaluation on six cross-modal benchmarks shows that it not only outperforms the state-of-the-art but even outperforms the full fine-tuning approach.
arXiv Detail & Related papers (2023-05-15T06:40:56Z)
- Hyperparameter Optimization through Neural Network Partitioning [11.6941692990626]
We propose a simple and efficient way to optimize hyperparameters in neural networks.
Our method partitions the training data and a neural network model into $K$ data shards and parameter partitions.
We demonstrate that the resulting objective can be used to optimize a variety of different hyperparameters in a single training run.
arXiv Detail & Related papers (2023-04-28T11:24:41Z)
- Hybrid ISTA: Unfolding ISTA With Convergence Guarantees Using Free-Form Deep Neural Networks [50.193061099112626]
It is promising to solve linear inverse problems by unfolding iterative algorithms as deep neural networks (DNNs) with learnable parameters.
Existing ISTA-based unfolded algorithms restrict the network architectures used for the iterative updates to a partial weight-coupling structure in order to guarantee convergence.
This paper is the first to provide a convergence-provable framework that enables free-form DNNs in ISTA-based unfolded algorithms (a generic unfolded layer with learnable weights is sketched after this list).
arXiv Detail & Related papers (2022-04-25T13:17:57Z)
- AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient Hyper-parameter Tuning [72.54359545547904]
We propose a gradient-based subset selection framework for hyperparameter tuning.
We show that using gradient-based data subsets for hyperparameter tuning achieves significantly faster turnaround times, with speedups of $3\times$-$30\times$.
arXiv Detail & Related papers (2022-03-15T19:25:01Z)
- Towards Robust and Automatic Hyper-Parameter Tunning [39.04604349338802]
We introduce a new class of HPO methods and explore how the low-rank factorization of the intermediate layers of a convolutional network can be used to define an analytical response surface.
We quantify how this surface behaves as a surrogate for model performance and show that it can be searched with a trust-region algorithm; we call the resulting method autoHyper.
arXiv Detail & Related papers (2021-11-28T05:27:34Z)
- Surrogate Model Based Hyperparameter Tuning for Deep Learning with SPOT [0.40611352512781856]
This article demonstrates how the architecture-level parameters of deep learning models implemented in Keras/TensorFlow can be optimized.
The tuning procedure is implemented 100% in R, the software environment for statistical computing.
arXiv Detail & Related papers (2021-05-30T21:16:51Z)
- Practical and sample efficient zero-shot HPO [8.41866793161234]
We provide an overview of available approaches and introduce two novel techniques to handle the problem.
The first is based on a surrogate model and adaptively chooses (dataset, configuration) pairs to query.
The second, for settings where finding, tuning, and testing a surrogate model is problematic, is a multi-fidelity technique that combines HyperBand with submodular optimization.
arXiv Detail & Related papers (2020-07-27T08:56:55Z)
- HyperSTAR: Task-Aware Hyperparameters for Deep Networks [52.50861379908611]
HyperSTAR is a task-aware method to warm-start HPO for deep neural networks.
It learns a dataset (task) representation along with the performance predictor directly from raw images.
It evaluates 50% fewer configurations than existing methods to achieve the best performance.
arXiv Detail & Related papers (2020-05-21T08:56:50Z)
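As a companion to the unrolling theme above (LISTA in the main abstract, Hybrid ISTA in the list), here is a minimal PyTorch sketch of one generic unfolded ISTA layer with learnable weights. It uses the standard LISTA-style parameterization rather than any specific construction from the listed papers, and the class name `UnfoldedISTALayer` is an illustrative placeholder.
```python
# A generic unfolded ISTA layer with learnable weights (standard LISTA-style
# parameterization), shown only to illustrate the "unrolling" idea; it is not
# the Hybrid ISTA or HyperLISTA construction.
import torch
import torch.nn as nn

class UnfoldedISTALayer(nn.Module):
    def __init__(self, A: torch.Tensor, step: float = 1.0):
        super().__init__()
        n = A.shape[1]
        # Initialize at the classical ISTA update, then let training adjust.
        self.W_b = nn.Parameter(step * A.t().clone())             # acts on measurements b
        self.W_x = nn.Parameter(torch.eye(n) - step * A.t() @ A)  # acts on the previous estimate
        self.theta = nn.Parameter(torch.tensor(0.1))              # learnable soft-threshold

    def forward(self, x, b):
        z = x @ self.W_x.t() + b @ self.W_b.t()
        return torch.sign(z) * torch.relu(z.abs() - self.theta)   # soft-thresholding

# Toy usage: stack a few layers and run a forward pass.
m, n, batch = 64, 128, 4
A = torch.randn(m, n) / m ** 0.5
L = torch.linalg.matrix_norm(A, ord=2).item() ** 2                # Lipschitz constant
layers = nn.ModuleList(UnfoldedISTALayer(A, step=1.0 / L) for _ in range(8))
x, b = torch.zeros(batch, n), torch.randn(batch, m)
for layer in layers:
    x = layer(x, b)
print(x.shape)  # torch.Size([4, 128])
```
Stacking a fixed number of such layers and training W_b, W_x, and theta end to end is what distinguishes a learned unrolled network from running plain ISTA with hand-set parameters.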
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.