Learning to Stop While Learning to Predict
- URL: http://arxiv.org/abs/2006.05082v1
- Date: Tue, 9 Jun 2020 07:22:01 GMT
- Title: Learning to Stop While Learning to Predict
- Authors: Xinshi Chen, Hanjun Dai, Yu Li, Xin Gao, Le Song
- Abstract summary: Many algorithm-inspired deep models are restricted to a "fixed depth" for all inputs.
As with algorithms, the optimal depth of a deep architecture may differ across input instances.
In this paper, we tackle this varying depth problem using a steerable architecture.
We show that the learned deep model, together with the stopping policy, improves performance on a diverse set of tasks.
- Score: 85.7136203122784
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There is a recent surge of interest in designing deep architectures based on
the update steps in traditional algorithms, or learning neural networks to
improve and replace traditional algorithms. While traditional algorithms have
certain stopping criteria for outputting results at different iterations, many
algorithm-inspired deep models are restricted to a "fixed depth" for all
inputs. As with algorithms, the optimal depth of a deep architecture may
differ across input instances, either to avoid "over-thinking" or because we
want to compute less for operations that have already converged. In this
paper, we tackle this varying depth problem using a steerable architecture,
where a feed-forward deep model and a variational stopping policy are learned
together to sequentially determine the optimal number of layers for each input
instance. Training such an architecture is very challenging. We provide a
variational Bayes perspective and design a novel and effective training
procedure which decomposes the task into an oracle model learning stage and an
imitation stage. Experimentally, we show that the learned deep model, together
with the stopping policy, improves performance on a diverse set of tasks,
including learning sparse recovery, few-shot meta learning, and computer vision
tasks.
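To make the mechanism concrete, here is a minimal sketch (not the authors' code) of a feed-forward model paired with a layer-wise stopping policy; the class name, the greedy 0.5 threshold, and the single-example `predict` path are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SteerableNet(nn.Module):
    """Feed-forward model plus a stopping policy evaluated after each layer."""
    def __init__(self, dim: int, max_depth: int):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU()) for _ in range(max_depth)
        )
        self.stop_policy = nn.Linear(dim, 1)  # maps hidden state to P(stop here)

    def forward(self, x: torch.Tensor):
        # For training: return every intermediate state with its stop
        # probability so per-depth losses can be weighted by the policy.
        states, stop_probs = [], []
        h = x
        for layer in self.layers:
            h = layer(h)
            states.append(h)
            stop_probs.append(torch.sigmoid(self.stop_policy(h)))
        return states, stop_probs

    @torch.no_grad()
    def predict(self, x: torch.Tensor, threshold: float = 0.5):
        # Greedy inference for a single example: halt at the first layer
        # whose stop probability clears the threshold.
        h = x
        for depth, layer in enumerate(self.layers, start=1):
            h = layer(h)
            if torch.sigmoid(self.stop_policy(h)).item() > threshold:
                return h, depth
        return h, len(self.layers)  # ran to maximum depth
```

At inference, easy inputs exit after a few layers while harder ones run to the full depth, which is the varying-depth behavior the abstract describes.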
Related papers
- Training Neural Networks with Internal State, Unconstrained Connectivity, and Discrete Activations [66.53734987585244]
True intelligence may require the ability of a machine learning model to manage internal state.
We show that we have not yet discovered the most effective algorithms for training such models.
We present one attempt to design such a training algorithm, applied to an architecture with binary activations and only a single matrix of weights.
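As a toy illustration of such an architecture, the sketch below keeps a recurrent internal state, uses binary activations, and has only a single weight matrix; the straight-through gradient estimator is a common workaround for discrete activations and is an assumption here, not necessarily the paper's training algorithm.

```python
import torch

class BinaryStep(torch.autograd.Function):
    """Hard threshold with a straight-through backward pass (assumed)."""
    @staticmethod
    def forward(ctx, x):
        return (x > 0).float()

    @staticmethod
    def backward(ctx, grad_out):
        return grad_out  # straight-through: pass gradients unchanged

def update_state(W: torch.Tensor, state: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
    # The single matrix W mixes the input with the internal state; the
    # binary activation produces the next internal state.
    return BinaryStep.apply(torch.cat([x, state], dim=-1) @ W)

# W has shape (in_dim + state_dim, state_dim); iterate update_state over a
# sequence so the model carries state across inputs.
```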
arXiv Detail & Related papers (2023-12-22T01:19:08Z)
- Multi-Objective Optimization for Sparse Deep Multi-Task Learning [0.0]
We present a Multi-Objective Optimization algorithm that uses a modified Weighted Chebyshev scalarization to train Deep Neural Networks (DNNs).
Our work aims to address the economic and ecological sustainability of DNN models, with a particular focus on deep multi-task models.
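For reference, a weighted Chebyshev scalarization collapses several objectives into one scalar via a weighted max-distance to an ideal point; the sketch below is the textbook form, not the paper's modified variant.

```python
import torch

def weighted_chebyshev(losses, weights, ideal):
    """Scalarize objectives as max_i w_i * (f_i - z_i*), the classic
    weighted Chebyshev form (the paper modifies this)."""
    losses = torch.stack(list(losses))
    terms = torch.tensor(weights) * (losses - torch.tensor(ideal))
    return terms.max()

# Example: trade a task loss against a sparsity penalty, then backprop:
# total = weighted_chebyshev([task_loss, l1_penalty], [0.7, 0.3], [0.0, 0.0])
# total.backward()
```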
arXiv Detail & Related papers (2023-08-23T16:42:27Z)
- A Generalist Neural Algorithmic Learner [18.425083543441776]
We build a single graph neural network processor capable of learning to execute a wide range of algorithms.
We show that it is possible to effectively learn algorithms in a multi-task manner, so long as we can learn to execute them well in a single-task regime.
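The shared "processor" in this line of work is typically a message-passing GNN reused across tasks; a minimal single step under that assumption (not the paper's exact architecture) looks like:

```python
import torch
import torch.nn as nn

class ProcessorStep(nn.Module):
    """One message-passing step of a task-agnostic GNN processor (sketch)."""
    def __init__(self, dim: int):
        super().__init__()
        self.msg = nn.Linear(2 * dim, dim)  # message from (sender, receiver)
        self.upd = nn.Linear(2 * dim, dim)  # update from (state, aggregate)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        n = h.size(0)
        send = h.unsqueeze(1).expand(n, n, -1)  # h_i broadcast over receivers
        recv = h.unsqueeze(0).expand(n, n, -1)  # h_j broadcast over senders
        messages = torch.relu(self.msg(torch.cat([send, recv], dim=-1)))
        # Zero out non-edges, then max-aggregate incoming messages per node.
        agg = (adj.unsqueeze(-1) * messages).max(dim=0).values
        return torch.relu(self.upd(torch.cat([h, agg], dim=-1)))
```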
arXiv Detail & Related papers (2022-09-22T16:41:33Z)
- Learning with Differentiable Algorithms [6.47243430672461]
This thesis explores combining classic algorithms and machine learning systems like neural networks.
The thesis formalizes the idea of algorithmic supervision, which allows a neural network to learn from or in conjunction with an algorithm.
In addition, this thesis proposes differentiable algorithms, such as differentiable sorting networks, differentiable sorting gates, and differentiable logic gate networks.
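Of these, differentiable sorting networks are easy to sketch: each hard compare-and-swap becomes a sigmoid-weighted soft swap so gradients flow through the sort. A minimal odd-even transposition version of that standard construction:

```python
import torch

def soft_sort(x: torch.Tensor, steepness: float = 10.0) -> torch.Tensor:
    """Differentiable odd-even transposition sort; `steepness` trades
    gradient smoothness against fidelity to a hard sort."""
    vals = list(x)  # 0-dim tensors; keeps the autograd graph intact
    n = len(vals)
    for rnd in range(n):
        for i in range(rnd % 2, n - 1, 2):
            a, b = vals[i], vals[i + 1]
            alpha = torch.sigmoid(steepness * (b - a))  # ~1 if already ordered
            vals[i] = alpha * a + (1 - alpha) * b
            vals[i + 1] = alpha * b + (1 - alpha) * a
    return torch.stack(vals)

# x = torch.tensor([3.0, 1.0, 2.0], requires_grad=True)
# soft_sort(x).sum().backward()  # gradients flow through the sort
```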
arXiv Detail & Related papers (2022-09-01T17:30:00Z)
- Model-Based Deep Learning: On the Intersection of Deep Learning and Optimization [101.32332941117271]
Decision making algorithms are used in a multitude of different applications.
Deep learning approaches that use highly parametric architectures tuned from data without relying on mathematical models are becoming increasingly popular.
Model-based optimization and data-centric deep learning are often considered to be distinct disciplines.
arXiv Detail & Related papers (2022-05-05T13:40:08Z)
- Learn to Adapt for Monocular Depth Estimation [17.887575611570394]
We propose an adversarial depth estimation task and train the model within a meta-learning pipeline.
Our method adapts well to new datasets after a few training steps at test time.
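That "few steps at test time" pattern is a meta-learned inner loop; a generic sketch (the adversarial task itself is omitted, and the loss, step count, and learning rate are placeholders):

```python
import copy
import torch

def adapt_and_predict(meta_model, images, adaptation_loss,
                      steps: int = 5, lr: float = 1e-4):
    """Fine-tune a copy of the meta-learned model on the new domain for a
    few steps, then predict. All hyperparameters are illustrative."""
    model = copy.deepcopy(meta_model)  # leave the meta-weights intact
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):             # the "few training steps"
        opt.zero_grad()
        adaptation_loss(model, images).backward()
        opt.step()
    with torch.no_grad():
        return model(images)           # depth prediction after adaptation
```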
arXiv Detail & Related papers (2022-03-26T06:49:22Z)
- Meta Navigator: Search for a Good Adaptation Policy for Few-shot Learning [113.05118113697111]
Few-shot learning aims to adapt knowledge learned from previous tasks to novel tasks with only a limited amount of labeled data.
The research literature on few-shot learning is highly diverse, and different algorithms often excel in different few-shot scenarios.
We present Meta Navigator, a framework that addresses this by searching for a good higher-level adaptation policy.
arXiv Detail & Related papers (2021-09-13T07:20:01Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
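Roughly, BAIT scores candidates with last-layer Fisher information; the greedy A-optimality sketch below is a heavily simplified stand-in for the actual algorithm, with `grad_embs` assumed to be per-example last-layer gradient embeddings.

```python
import numpy as np

def select_batch(grad_embs: np.ndarray, k: int, lam: float = 1.0):
    """Greedy A-optimality selection: pick points whose embeddings best
    cover the full-data Fisher information. Simplified; not BAIT itself."""
    n, d = grad_embs.shape
    fisher_all = grad_embs.T @ grad_embs / n   # full-data Fisher proxy
    chosen, f_sel = [], lam * np.eye(d)        # regularized selected Fisher
    for _ in range(k):
        best, best_score = None, np.inf
        for i in range(n):
            if i in chosen:
                continue
            v = grad_embs[i][:, None]
            score = np.trace(fisher_all @ np.linalg.inv(f_sel + v @ v.T))
            if score < best_score:
                best, best_score = i, score
        chosen.append(best)
        f_sel += grad_embs[best][:, None] @ grad_embs[best][None, :]
    return chosen
```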
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
- Evolving Reinforcement Learning Algorithms [186.62294652057062]
We propose a method for meta-learning reinforcement learning algorithms.
The learned algorithms are domain-agnostic and can generalize to new environments not seen during training.
We highlight two learned algorithms that generalize well to other classical control tasks, gridworld-type tasks, and Atari games.
arXiv Detail & Related papers (2021-01-08T18:55:07Z)
- Structure preserving deep learning [1.2263454117570958]
Deep learning has risen to the foreground as a topic of massive interest.
There are multiple challenging mathematical problems involved in applying deep learning.
There is a growing effort to mathematically understand the structure in existing deep learning methods.
arXiv Detail & Related papers (2020-06-05T10:59:09Z)