Thompson Sampling with a Mixture Prior
- URL: http://arxiv.org/abs/2106.05608v1
- Date: Thu, 10 Jun 2021 09:21:07 GMT
- Title: Thompson Sampling with a Mixture Prior
- Authors: Joey Hong, Branislav Kveton, Manzil Zaheer, Mohammad Ghavamzadeh,
Craig Boutilier
- Abstract summary: We study Thompson sampling (TS) in online decision-making problems where the uncertain environment is sampled from a mixture distribution.
We develop a novel, general technique for analyzing the regret of TS with such priors.
- Score: 59.211830005673896
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study Thompson sampling (TS) in online decision-making problems where the
uncertain environment is sampled from a mixture distribution. This is relevant
to multi-task settings, where a learning agent is faced with different classes
of problems. We incorporate this structure in a natural way by initializing TS
with a mixture prior -- dubbed MixTS -- and develop a novel, general technique
for analyzing the regret of TS with such priors. We apply this technique to
derive Bayes regret bounds for MixTS in both linear bandits and tabular Markov
decision processes (MDPs). Our regret bounds reflect the structure of the
problem and depend on the number of components and confidence width of each
component of the prior. Finally, we demonstrate the empirical effectiveness of
MixTS in both synthetic and real-world experiments.
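The following is a minimal sketch, not the authors' implementation, of the algorithmic idea: Thompson sampling initialized with a Gaussian mixture prior in a linear bandit with Gaussian rewards. The dimensions, number of components, and hyperparameters are illustrative assumptions. Each round samples a mixture component from the posterior mixture weights, samples a parameter vector from that component's conjugate posterior, and acts greedily with respect to the sample.
```python
# Minimal MixTS-style sketch (assumed setup, not the paper's code):
# linear bandit with Gaussian rewards and a Gaussian mixture prior over theta.
import numpy as np

rng = np.random.default_rng(0)
d, n_arms, horizon, noise_sd = 4, 10, 2000, 0.5
arms = rng.normal(size=(n_arms, d))               # fixed arm feature vectors

# Mixture prior over the unknown parameter: sum_k w_k * N(mu_k, Sigma_k)
K = 3
prior_w = np.full(K, 1.0 / K)
prior_mu = rng.normal(size=(K, d))
prior_cov = np.stack([np.eye(d) for _ in range(K)])

# The environment itself is drawn from the mixture.
true_k = rng.choice(K, p=prior_w)
theta_star = rng.multivariate_normal(prior_mu[true_k], prior_cov[true_k])

# Per-component Gaussian posteriors stay conjugate; mixture weights are
# reweighted by each component's marginal likelihood of the observed data.
mu, cov = prior_mu.copy(), prior_cov.copy()
log_w = np.log(prior_w)

for t in range(horizon):
    # 1) sample a component from the posterior mixture weights
    w = np.exp(log_w - log_w.max()); w /= w.sum()
    k = rng.choice(K, p=w)
    # 2) sample a parameter vector from that component's posterior
    theta = rng.multivariate_normal(mu[k], cov[k])
    # 3) act greedily with respect to the sampled parameter
    a = int(np.argmax(arms @ theta))
    x = arms[a]
    r = x @ theta_star + noise_sd * rng.normal()
    # 4) conjugate update of every component and its marginal likelihood
    for j in range(K):
        pred_mean = x @ mu[j]
        pred_var = x @ cov[j] @ x + noise_sd ** 2
        log_w[j] += -0.5 * (np.log(2 * np.pi * pred_var)
                            + (r - pred_mean) ** 2 / pred_var)
        gain = cov[j] @ x / pred_var
        mu[j] = mu[j] + gain * (r - pred_mean)
        cov[j] = cov[j] - np.outer(gain, x @ cov[j])

w = np.exp(log_w - log_w.max()); w /= w.sum()
print("posterior mixture weights:", np.round(w, 3), "true component:", true_k)
```
In this sketch the posterior weight of the component that actually generated the environment typically concentrates as data accumulates, which is the mechanism the regret bounds tie to the number of components and their confidence widths.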
Related papers
- ProxiMix: Enhancing Fairness with Proximity Samples in Subgroups [17.672299431705262]
Using linear mixup, a data augmentation technique, alone for bias mitigation can still retain biases in the dataset labels.
We propose a novel pre-processing strategy in which both an existing mixup method and our new bias mitigation algorithm can be utilized.
ProxiMix keeps both pairwise and proximity relationships for fairer data augmentation.
arXiv Detail & Related papers (2024-10-02T00:47:03Z) - SUMix: Mixup with Semantic and Uncertain Information [41.99721365685618]
Mixup data augmentation approaches have been applied to various deep learning tasks.
We propose a novel approach named SUMix to learn the mixing ratio as well as the uncertainty for the mixed samples during the training process.
arXiv Detail & Related papers (2024-07-10T16:25:26Z) - Fast Semisupervised Unmixing Using Nonconvex Optimization [80.11512905623417]
We introduce a novel model for semisupervised (library-based) unmixing.
We demonstrate the efficacy of the proposed alternating optimization methods for sparse unmixing.
arXiv Detail & Related papers (2024-01-23T10:07:41Z) - PowMix: A Versatile Regularizer for Multimodal Sentiment Analysis [71.8946280170493]
This paper introduces PowMix, a versatile embedding space regularizer that builds upon the strengths of unimodal mixing-based regularization approaches.
PowMix is integrated before the fusion stage of multimodal architectures and facilitates intra-modal mixing, such as mixing text with text, to act as a regularizer.
arXiv Detail & Related papers (2023-12-19T17:01:58Z) - Image Processing and Machine Learning for Hyperspectral Unmixing: An Overview and the HySUPP Python Package [80.11512905623417]
Unmixing estimates the fractional abundances of the endmembers within the pixel.
This paper provides an overview of advanced and conventional unmixing approaches.
We compare the performance of the unmixing techniques on three simulated and two real datasets.
arXiv Detail & Related papers (2023-08-18T08:10:41Z) - Harnessing Hard Mixed Samples with Decoupled Regularizer [69.98746081734441]
Mixup is an efficient data augmentation approach that improves the generalization of neural networks by smoothing the decision boundary with mixed data.
In this paper, we propose an efficient mixup objective function with a decoupled regularizer, named Decoupled Mixup (DM).
DM can adaptively utilize hard mixed samples to mine discriminative features without losing the original smoothness of mixup; a minimal mixup sketch is given after this list.
arXiv Detail & Related papers (2022-03-21T07:12:18Z) - Unsupervised Source Separation via Self-Supervised Training [0.913755431537592]
We introduce two novel unsupervised (blind) source separation methods, which involve self-supervised training from single-channel two-source speech mixtures.
Our first method employs permutation invariant training (PIT) to separate artificially generated mixtures back into the original mixtures; a minimal PIT loss sketch is given after this list.
We improve upon this first method by creating mixtures of source estimates and employing PIT to separate these new mixtures in a cyclic fashion.
We show that MixPIT outperforms a common baseline (MixIT) on our small dataset (SC09Mix), and they have comparable performance on a standard dataset (LibriMix).
arXiv Detail & Related papers (2022-02-08T14:02:50Z) - An Empirical Study of the Effects of Sample-Mixing Methods for Efficient
Training of Generative Adversarial Networks [0.0]
It is well known that training generative adversarial networks (GANs) requires a huge number of iterations before the generator provides good-quality samples.
We investigate the effect of sample-mixing methods, namely Mixup, CutMix, and SRMix, in alleviating this problem.
arXiv Detail & Related papers (2021-04-08T06:40:23Z) - Jo-SRC: A Contrastive Approach for Combating Noisy Labels [58.867237220886885]
We propose a noise-robust approach named Jo-SRC (Joint Sample Selection and Model Regularization based on Consistency).
Specifically, we train the network in a contrastive learning manner. Predictions from two different views of each sample are used to estimate its "likelihood" of being clean or out-of-distribution; a minimal two-view consistency sketch is given after this list.
arXiv Detail & Related papers (2021-03-24T07:26:07Z)
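Referenced from the Decoupled Mixup entry above, here is a minimal, generic mixup sketch. It shows only vanilla mixup; DM's decoupled regularizer is not reproduced, and the Beta parameter and shapes are illustrative assumptions.
```python
# Generic mixup sketch (assumed shapes and hyperparameters, not the DM code).
import numpy as np

def mixup_batch(x, y_onehot, alpha=0.2, rng=None):
    """Convexly combine a batch with a shuffled copy of itself.

    x: (batch, ...) inputs; y_onehot: (batch, classes) one-hot labels.
    Returns the mixed inputs and the correspondingly mixed soft labels.
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)          # mixing ratio drawn per batch
    perm = rng.permutation(len(x))        # random partner for every sample
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mix, y_mix
```
Training then minimizes the usual cross-entropy against the soft labels y_mix; Decoupled Mixup additionally splits this objective so that hard mixed samples contribute a separate discriminative term.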
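Referenced from the unsupervised source separation entry above, the next block is a minimal sketch of a two-source permutation-invariant training (PIT) loss. It is a generic illustration, not the MixPIT/MixIT implementation; the MSE criterion and two-source restriction are assumptions for brevity.
```python
# Generic two-source PIT loss sketch (assumed criterion, not the paper's code).
import numpy as np

def pit_mse(est, ref):
    """est, ref: (2, time) arrays of estimated and reference signals.
    Returns the MSE under whichever of the two source orderings fits best."""
    direct = np.mean((est - ref) ** 2)
    swapped = np.mean((est[::-1] - ref) ** 2)
    return min(direct, swapped)

rng = np.random.default_rng(0)
ref = rng.normal(size=(2, 16000))
est = ref[::-1] + 0.01 * rng.normal(size=(2, 16000))  # estimates in swapped order
print(pit_mse(est, ref))  # small value: the label-permutation ambiguity is resolved
```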
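Referenced from the Jo-SRC entry above, here is a generic sketch of two-view consistency scoring. It is not the paper's exact selection criterion; the JS-divergence scores below are illustrative assumptions.
```python
# Generic two-view consistency scoring sketch (assumed scores, not Jo-SRC's exact rule).
import numpy as np

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence between rows of two (batch, classes) distributions."""
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * (np.log(a + eps) - np.log(b + eps)), axis=-1)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def sample_scores(p_view1, p_view2, y_onehot):
    """p_view1, p_view2: softmax outputs from two augmented views of each sample;
    y_onehot: the given (possibly noisy) labels.
    Lower 'clean' score -> predictions agree with the label;
    higher 'disagreement' -> the two views conflict (possible out-of-distribution)."""
    clean = 0.5 * (js_divergence(p_view1, y_onehot) + js_divergence(p_view2, y_onehot))
    disagreement = js_divergence(p_view1, p_view2)
    return clean, disagreement
```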