Related papers: Towards Scalable and Robust Structured Bandits: A Meta-Learning Framework

Towards Scalable and Robust Structured Bandits: A Meta-Learning Framework

URL: http://arxiv.org/abs/2202.13227v1
Date: Sat, 26 Feb 2022 20:54:55 GMT
Title: Towards Scalable and Robust Structured Bandits: A Meta-Learning Framework
Authors: Runzhe Wan, Lin Ge, Rui Song
Abstract summary: We propose a unified meta-learning framework for a class of structured bandit problems where the parameter space can be factorized to item-level. The novel bandit algorithm is general to be applied to many popular problems,scalable to the huge parameter and action spaces, and robust to the specification of the generalization model.
Score: 11.778985277618354
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Online learning in large-scale structured bandits is known to be challenging due to the curse of dimensionality. In this paper, we propose a unified meta-learning framework for a general class of structured bandit problems where the parameter space can be factorized to item-level. The novel bandit algorithm is general to be applied to many popular problems,scalable to the huge parameter and action spaces, and robust to the specification of the generalization model. At the core of this framework is a Bayesian hierarchical model that allows information sharing among items via their features, upon which we design a meta Thompson sampling algorithm. Three representative examples are discussed thoroughly. Both theoretical analysis and numerical results support the usefulness of the proposed method.

Related papers

A unifying framework for generalised Bayesian online learning in non-stationary environments [44.30046484614902]
We call the framework BONE, which stands for generalised (B)ayesian (O)nline learning in (N)on-stationary (E)nvironments. BONE provides a common structure to tackle a variety of problems, including online continual learning, prequential forecasting, and contextual bandits.
arXiv Detail & Related papers (2024-11-15T12:52:02Z)
Discrete Choice Multi-Armed Bandits [0.0]
This paper establishes a connection between a category of discrete choice models and the realms of online learning and multiarmed bandit algorithms. We furnish sublinear regret bounds for a comprehensive family of algorithms, encompassing the Exp3 algorithm as a particular case. We introduce a novel family of adversarial multiarmed bandit algorithms, drawing inspiration from the generalized nested logit models.
arXiv Detail & Related papers (2023-10-01T03:41:04Z)
Knowledge Base Question Answering by Case-based Reasoning over Subgraphs [81.22050011503933]
We show that our model answers queries requiring complex reasoning patterns more effectively than existing KG completion algorithms. The proposed model outperforms or performs competitively with state-of-the-art models on several KBQA benchmarks.
arXiv Detail & Related papers (2022-02-22T01:34:35Z)
Metadata-based Multi-Task Bandits with Bayesian Hierarchical Models [7.458639397686894]
How to explore efficiently is a central problem in multi-armed bandits. We introduce the metadata-based multi-task bandit problem. We propose to capture task relations through the lens of Bayesian hierarchical models.
arXiv Detail & Related papers (2021-08-13T22:45:05Z)
A Simple Unified Framework for High Dimensional Bandit Problems [33.139925285802825]
We present a general analysis framework for the regret upper bound of our algorithm. We show that our algorithm can be applied to different high dimensional bandit problems.
arXiv Detail & Related papers (2021-02-18T21:35:32Z)
Captum: A unified and generic model interpretability library for PyTorch [49.72749684393332]
We introduce a novel, unified, open-source model interpretability library for PyTorch. The library contains generic implementations of a number of gradient and perturbation-based attribution algorithms. It can be used for both classification and non-classification models.
arXiv Detail & Related papers (2020-09-16T18:57:57Z)
Information Theoretic Meta Learning with Gaussian Processes [74.54485310507336]
We formulate meta learning using information theoretic concepts; namely, mutual information and the information bottleneck. By making use of variational approximations to the mutual information, we derive a general and tractable framework for meta learning.
arXiv Detail & Related papers (2020-09-07T16:47:30Z)
Multi-view Orthonormalized Partial Least Squares: Regularizations and Deep Extensions [8.846165479467324]
We establish a family of subspace-based learning method for multi-view learning using the least squares as the fundamental basis. We propose a unified multi-view learning framework to learn a classifier over a common latent space shared by all views.
arXiv Detail & Related papers (2020-07-09T19:00:39Z)
Influence Diagram Bandits: Variational Thompson Sampling for Structured Bandit Problems [40.957688390621385]
Our framework captures complex statistical dependencies between actions, latent variables, and observations. We develop novel online learning algorithms that learn to act efficiently in our models.
arXiv Detail & Related papers (2020-07-09T16:25:40Z)
A general framework for defining and optimizing robustness [74.67016173858497]
We propose a rigorous and flexible framework for defining different types of robustness properties for classifiers. Our concept is based on postulates that robustness of a classifier should be considered as a property that is independent of accuracy. We develop a very general robustness framework that is applicable to any type of classification model.
arXiv Detail & Related papers (2020-06-19T13:24:20Z)
An Integer Linear Programming Framework for Mining Constraints from Data [81.60135973848125]
We present a general framework for mining constraints from data. In particular, we consider the inference in structured output prediction as an integer linear programming (ILP) problem. We show that our approach can learn to solve 9x9 Sudoku puzzles and minimal spanning tree problems from examples without providing the underlying rules.
arXiv Detail & Related papers (2020-06-18T20:09:53Z)
Parameterizing Branch-and-Bound Search Trees to Learn Branching Policies [76.83991682238666]
Branch and Bound (B&B) is the exact tree search method typically used to solve Mixed-Integer Linear Programming problems (MILPs) We propose a novel imitation learning framework, and introduce new input features and architectures to represent branching.
arXiv Detail & Related papers (2020-02-12T17:43:23Z)

This list is automatically generated from the titles and abstracts of the papers in this site.