Differentiable Tree Search in Latent State Space
- URL: http://arxiv.org/abs/2401.11660v1
- Date: Mon, 22 Jan 2024 02:33:38 GMT
- Title: Differentiable Tree Search in Latent State Space
- Authors: Dixant Mittal and Wee Sun Lee
- Abstract summary: Differentiable Tree Search (DTS) is a novel neural network architecture that significantly strengthens the inductive bias.
We introduce a novel neural network architecture that significantly strengthens the inductive bias by embedding the algorithmic structure of a best-first online search algorithm.
We evaluate DTS in an offline-RL setting with limited training data on Procgen games and a grid navigation task.
- Score: 17.688341927195033
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In decision-making problems with limited training data, policy functions
approximated using deep neural networks often exhibit suboptimal performance.
An alternative approach involves learning a world model from the limited data
and determining actions through online search. However, the performance is
adversely affected by compounding errors arising from inaccuracies in the
learnt world model. While methods like TreeQN have attempted to address these
inaccuracies by incorporating algorithmic structural biases into their
architectures, the biases they introduce are often weak and insufficient for
complex decision-making tasks. In this work, we introduce Differentiable Tree
Search (DTS), a novel neural network architecture that significantly
strengthens the inductive bias by embedding the algorithmic structure of a
best-first online search algorithm. DTS employs a learnt world model to conduct
a fully differentiable online search in latent state space. The world model is
jointly optimised with the search algorithm, enabling the learning of a robust
world model and mitigating the effect of model inaccuracies. We address
potential Q-function discontinuities arising from naive incorporation of
best-first search by adopting a stochastic tree expansion policy, formulating
search tree expansion as a decision-making task, and introducing an effective
variance reduction technique for the gradient computation. We evaluate DTS in
an offline-RL setting with a limited training data scenario on Procgen games
and a grid navigation task, and demonstrate that DTS outperforms popular
model-free and model-based baselines.
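To make the idea concrete, below is a minimal, illustrative sketch of a differentiable best-first search over a learnt latent world model, written in PyTorch. The module names (encoder, transition, reward, value), the softmax-based stochastic expansion policy, and the soft backup are assumptions for illustration only, not the authors' implementation; in particular, the paper's REINFORCE-style gradient with variance reduction for the expansion decisions is only hinted at in the comments.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LatentTreeSearch(nn.Module):
    """Toy differentiable best-first search over a learnt latent world model."""

    def __init__(self, obs_dim, latent_dim, num_actions, expansions=8, gamma=0.99):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, latent_dim), nn.ReLU())
        self.transition = nn.Linear(latent_dim + num_actions, latent_dim)  # latent dynamics
        self.reward = nn.Linear(latent_dim + num_actions, 1)               # reward model
        self.value = nn.Linear(latent_dim, 1)                              # leaf value head
        self.num_actions = num_actions
        self.expansions = expansions
        self.gamma = gamma

    def forward(self, obs):
        # The root of the search tree is the encoded observation.
        # Each leaf stores (latent state, discounted return so far, discount factor).
        leaves = [(self.encoder(obs), torch.tensor(0.0), 1.0)]
        for _ in range(self.expansions):
            # Stochastic expansion policy: sample the leaf to expand from a softmax over
            # estimated leaf returns.  The sampling step is detached here; the full method
            # instead uses a REINFORCE-style gradient with variance reduction.
            scores = torch.stack([r + d * self.value(h).squeeze() for h, r, d in leaves])
            probs = F.softmax(scores, dim=0)
            h, r, d = leaves.pop(int(torch.multinomial(probs.detach(), 1)))
            for a in range(self.num_actions):
                a_onehot = F.one_hot(torch.tensor(a), self.num_actions).float()
                x = torch.cat([h, a_onehot], dim=-1)
                leaves.append((self.transition(x),
                               r + d * self.reward(x).squeeze(),
                               d * self.gamma))
        # Soft backup over all leaves keeps the root value differentiable in every module.
        returns = torch.stack([r + d * self.value(h).squeeze() for h, r, d in leaves])
        return torch.logsumexp(returns, dim=0)


# Example: a differentiable root-value estimate for a random 16-dimensional observation.
q_root = LatentTreeSearch(obs_dim=16, latent_dim=32, num_actions=4)(torch.randn(16))
q_root.backward()  # gradients flow into encoder, transition, reward and value jointly
```

In the full method the search yields Q-values for the root actions and the world model is optimised jointly with the search through the offline-RL training loss; this sketch returns a single soft root-value estimate only to keep the structure visible.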
Related papers
- Towards a Better Theoretical Understanding of Independent Subnetwork Training [56.24689348875711]
We take a closer theoretical look at Independent Subnetwork Training (IST), a recently proposed and highly effective technique for distributed training of large neural networks.
We identify fundamental differences between IST and alternative approaches, such as distributed methods with compressed communication.
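As a rough illustration of the IST idea (the partitioning scheme, helper names, and two-layer setup below are hypothetical simplifications, not the paper's algorithm), each worker trains only the subnetwork induced by its slice of hidden units and the trained slices are then written back:

```python
import torch
import torch.nn as nn


def ist_round(model, shards, data_per_worker, lr=1e-2):
    """One hypothetical IST round for model = Linear(d, h) -> ReLU -> Linear(h, 1).

    shards: one list/tensor of hidden-unit indices per worker (a partition of range(h)).
    data_per_worker: per-worker iterable of (x, y) mini-batches; the data never moves.
    """
    for worker, hidden_idx in enumerate(shards):
        # Build this worker's subnetwork from its slice of hidden units only.
        sub = nn.Sequential(nn.Linear(model[0].in_features, len(hidden_idx)), nn.ReLU(),
                            nn.Linear(len(hidden_idx), 1))
        sub[0].weight.data = model[0].weight.data[hidden_idx].clone()
        sub[0].bias.data = model[0].bias.data[hidden_idx].clone()
        sub[2].weight.data = model[2].weight.data[:, hidden_idx].clone()
        sub[2].bias.data = model[2].bias.data.clone()

        # Local training: no gradient or parameter exchange during these steps.
        opt = torch.optim.SGD(sub.parameters(), lr=lr)
        for x, y in data_per_worker[worker]:
            opt.zero_grad()
            nn.functional.mse_loss(sub(x), y).backward()
            opt.step()

        # Write the trained slice back into the full model (output-bias handling is
        # simplified here: the last worker's copy wins).
        model[0].weight.data[hidden_idx] = sub[0].weight.data
        model[0].bias.data[hidden_idx] = sub[0].bias.data
        model[2].weight.data[:, hidden_idx] = sub[2].weight.data
        model[2].bias.data = sub[2].bias.data
```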
arXiv Detail & Related papers (2023-06-28T18:14:22Z)
- Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have been shown to be effective in solving forward and inverse differential equation problems.
However, PINNs are prone to training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ the implicit stochastic gradient descent (ISGD) method to train PINNs, improving the stability of the training process.
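For intuition, here is a minimal sketch of an implicit (backward-Euler style) gradient step, assuming a generic objective rather than a PINN loss; the fixed-point inner loop and parameter names are illustrative only:

```python
import torch


def implicit_sgd_step(theta, grad_fn, lr=0.1, inner_iters=20):
    """Solve theta_next = theta - lr * grad_fn(theta_next) by fixed-point iteration."""
    theta_next = theta.clone()
    for _ in range(inner_iters):
        theta_next = theta - lr * grad_fn(theta_next)
    return theta_next


# Example: minimising f(theta) = 0.5 * ||theta||^2, whose gradient is theta itself.
theta = torch.tensor([2.0, -1.0])
theta = implicit_sgd_step(theta, lambda t: t)  # converges towards theta / (1 + lr)
```

The implicit update evaluates the gradient at the new iterate rather than the current one, which is what gives the extra stability at larger step sizes.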
arXiv Detail & Related papers (2023-03-03T08:17:47Z)
- Variational Inference for Infinitely Deep Neural Networks [0.4061135251278187]
We introduce the unbounded depth neural network (UDN), an infinitely deep probabilistic model that adapts its complexity to the training data.
We study the UDN on real and synthetic data.
arXiv Detail & Related papers (2022-09-21T03:54:34Z)
- Network Gradient Descent Algorithm for Decentralized Federated Learning [0.2867517731896504]
We study a fully decentralized federated learning algorithm, network gradient descent (NGD), a novel gradient descent algorithm executed over a communication-based network.
In the NGD method, only statistics (e.g., parameter estimates) need to be communicated, minimizing the risk of privacy leakage.
We find that both the learning rate and the network structure play significant roles in determining the NGD estimator's statistical efficiency.
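A minimal sketch of one such network-based update, assuming a row-stochastic mixing matrix over the communication graph; the function and variable names here are hypothetical and not taken from the paper:

```python
import numpy as np


def ngd_round(params, mixing, local_grads, lr=0.05):
    """One decentralized round: mix neighbours' parameter estimates, then step locally.

    params: (n_nodes, dim) array of per-node estimates.
    mixing: (n_nodes, n_nodes) row-stochastic matrix encoding the communication graph.
    local_grads: list of per-node gradient functions over local data (raw data never moves).
    """
    mixed = mixing @ params  # only statistics (parameter estimates) are exchanged
    return np.stack([mixed[i] - lr * local_grads[i](mixed[i]) for i in range(len(params))])


# Example: three nodes on a small graph, each minimising a simple local quadratic.
mixing = np.array([[0.5, 0.25, 0.25], [0.25, 0.5, 0.25], [0.25, 0.25, 0.5]])
params = np.random.randn(3, 4)
grads = [lambda w, c=c: w - c for c in (np.zeros(4), np.ones(4), 2 * np.ones(4))]
for _ in range(100):
    params = ngd_round(params, mixing, grads)
```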
arXiv Detail & Related papers (2022-05-06T02:53:31Z)
- Multigoal-oriented dual-weighted-residual error estimation using deep neural networks [0.0]
Deep learning is considered a powerful tool with high flexibility for approximating functions.
Our approach is based on a posteriori error estimation in which the adjoint problem is solved for the error localization.
An efficient and easy to implement algorithm is developed to obtain a posteriori error estimate for multiple goal functionals.
arXiv Detail & Related papers (2021-12-21T16:59:44Z)
- Online Limited Memory Neural-Linear Bandits with Likelihood Matching [53.18698496031658]
We study neural-linear bandits for solving problems where both exploration and representation learning play an important role.
We propose a likelihood matching algorithm that is resilient to catastrophic forgetting and is completely online.
arXiv Detail & Related papers (2021-02-07T14:19:07Z)
- A Novel Neural Network Training Framework with Data Assimilation [2.948167339160823]
A gradient-free training framework based on data assimilation is proposed to avoid the calculation of gradients.
The results show that the proposed training framework performs better than the gradient descent method.
arXiv Detail & Related papers (2020-10-06T11:12:23Z)
- VINNAS: Variational Inference-based Neural Network Architecture Search [2.685668802278155]
We present a differentiable variational inference-based NAS method for searching sparse convolutional neural networks.
Our method finds diverse network cells, while showing state-of-the-art accuracy with up to almost 2 times fewer non-zero parameters.
arXiv Detail & Related papers (2020-07-12T21:47:35Z)
- Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z) - Communication-Efficient Distributed Stochastic AUC Maximization with
Deep Neural Networks [50.42141893913188]
We study distributed stochastic AUC maximization for large-scale data with a deep neural network as the predictive model.
Our algorithm requires far fewer communication rounds in theory.
Experiments on several datasets demonstrate its effectiveness and confirm the theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z) - Rectified Linear Postsynaptic Potential Function for Backpropagation in
Deep Spiking Neural Networks [55.0627904986664]
Spiking Neural Networks (SNNs) use temporal spike patterns to represent and transmit information, which is not only biologically realistic but also suitable for ultra-low-power event-driven neuromorphic implementation.
This paper investigates the contribution of spike timing dynamics to information encoding, synaptic plasticity and decision making, providing a new perspective on the design of future deep SNNs and neuromorphic hardware systems.
arXiv Detail & Related papers (2020-03-26T11:13:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.