Online Limited Memory Neural-Linear Bandits with Likelihood Matching
- URL: http://arxiv.org/abs/2102.03799v1
- Date: Sun, 7 Feb 2021 14:19:07 GMT
- Title: Online Limited Memory Neural-Linear Bandits with Likelihood Matching
- Authors: Ofir Nabati, Tom Zahavy and Shie Mannor
- Abstract summary: We study neural-linear bandits for solving problems where both exploration and representation learning play an important role.
We propose a likelihood matching algorithm that is resilient to catastrophic forgetting and is completely online.
- Score: 53.18698496031658
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study neural-linear bandits for solving problems where both exploration
and representation learning play an important role. Neural-linear bandits
leverage the representation power of Deep Neural Networks (DNNs) and combine it
with efficient exploration mechanisms designed for linear contextual bandits on
top of the last hidden layer. A recent analysis of DNNs in the "infinite-width"
regime suggests that when these models are trained with gradient descent the
optimal solution is close to the initialization point and the DNN can be viewed
as a kernel machine. As a result, it is possible to exploit linear exploration
algorithms on top of a DNN via the kernel construction. The problem is that in
practice the kernel changes during the learning process and the agent's
performance degrades. This can be resolved by recomputing new uncertainty
estimations with stored data. Nevertheless, when the buffer's size is limited,
a phenomenon called catastrophic forgetting emerges. Instead, we propose a
likelihood matching algorithm that is resilient to catastrophic forgetting and
is completely online. We perform simulations on a variety of datasets and
observe that our algorithm achieves comparable performance to the unlimited
memory approach while exhibits resilience to catastrophic forgetting.
Related papers
- Linearity Grafting: Relaxed Neuron Pruning Helps Certifiable Robustness [172.61581010141978]
Certifiable robustness is a desirable property for adopting deep neural networks (DNNs) in safety-critical scenarios.
We propose a novel solution to strategically manipulate neurons, by "grafting" appropriate levels of linearity.
arXiv Detail & Related papers (2022-06-15T22:42:29Z) - FFNB: Forgetting-Free Neural Blocks for Deep Continual Visual Learning [14.924672048447338]
We devise a dynamic network architecture for continual learning based on a novel forgetting-free neural block (FFNB)
Training FFNB features on new tasks is achieved using a novel procedure that constrains the underlying parameters in the null-space of the previous tasks.
arXiv Detail & Related papers (2021-11-22T17:23:34Z) - Non-Gradient Manifold Neural Network [79.44066256794187]
Deep neural network (DNN) generally takes thousands of iterations to optimize via gradient descent.
We propose a novel manifold neural network based on non-gradient optimization.
arXiv Detail & Related papers (2021-06-15T06:39:13Z) - A unified framework for Hamiltonian deep neural networks [3.0934684265555052]
Training deep neural networks (DNNs) can be difficult due to vanishing/exploding gradients during weight optimization.
We propose a class of DNNs stemming from the time discretization of Hamiltonian systems.
The proposed Hamiltonian framework, besides encompassing existing networks inspired by marginally stable ODEs, allows one to derive new and more expressive architectures.
arXiv Detail & Related papers (2021-04-27T13:20:24Z) - Attribute-Guided Adversarial Training for Robustness to Natural
Perturbations [64.35805267250682]
We propose an adversarial training approach which learns to generate new samples so as to maximize exposure of the classifier to the attributes-space.
Our approach enables deep neural networks to be robust against a wide range of naturally occurring perturbations.
arXiv Detail & Related papers (2020-12-03T10:17:30Z) - A Novel Neural Network Training Framework with Data Assimilation [2.948167339160823]
A gradient-free training framework based on data assimilation is proposed to avoid the calculation of gradients.
The results show that the proposed training framework performed better than the gradient decent method.
arXiv Detail & Related papers (2020-10-06T11:12:23Z) - Modeling from Features: a Mean-field Framework for Over-parameterized
Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over- parameterized deep neural networks (DNNs)
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z) - Fast Learning of Graph Neural Networks with Guaranteed Generalizability:
One-hidden-layer Case [93.37576644429578]
Graph neural networks (GNNs) have made great progress recently on learning from graph-structured data in practice.
We provide a theoretically-grounded generalizability analysis of GNNs with one hidden layer for both regression and binary classification problems.
arXiv Detail & Related papers (2020-06-25T00:45:52Z) - A Neural Network Approach for Online Nonlinear Neyman-Pearson
Classification [3.6144103736375857]
We propose a novel Neyman-Pearson (NP) classifier that is both online and nonlinear as the first time in the literature.
The proposed classifier operates on a binary labeled data stream in an online manner, and maximizes the detection power about a user-specified and controllable false positive rate.
Our algorithm is appropriate for large scale data applications and provides a decent false positive rate controllability with real time processing.
arXiv Detail & Related papers (2020-06-14T20:00:25Z) - Fractional Deep Neural Network via Constrained Optimization [0.0]
This paper introduces a novel algorithmic framework for a deep neural network (DNN)
Fractional-DNN can be viewed as a time-discretization of a fractional in time nonlinear ordinary differential equation (ODE)
arXiv Detail & Related papers (2020-04-01T21:58:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.