Self-Supervised Contextual Bandits in Computer Vision
- URL: http://arxiv.org/abs/2003.08485v1
- Date: Wed, 18 Mar 2020 22:06:34 GMT
- Title: Self-Supervised Contextual Bandits in Computer Vision
- Authors: Aniket Anand Deshmukh, Abhimanu Kumar, Levi Boyles, Denis Charles,
Eren Manavoglu, Urun Dogan
- Abstract summary: Contextual bandits are a common problem faced by machine learning practitioners.
We propose a novel approach to tackle this issue by combining a contextual bandit objective with a self-supervision objective.
Our results on eight popular computer vision datasets show substantial gains in cumulative reward.
- Score: 4.165029665035158
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Contextual bandits are a common problem faced by machine learning
practitioners in domains ranging from hypothesis testing to product
recommendations. Many approaches exploit rich data representations for
contextual bandit problems, with varying degrees of success. Self-supervised
learning is a promising approach to finding rich data representations without
explicit labels. In a typical self-supervised learning scheme, the primary
task is defined by the problem objective (e.g., clustering, classification,
or embedding generation) and the secondary task is defined by the
self-supervision objective (e.g., rotation prediction, neighboring-word
prediction, or colorization). In usual self-supervision, implicit labels for
the secondary task are derived from the training data. In the contextual
bandit setting, however, this advantage is unavailable because data is scarce
in the initial phase of learning. We provide a novel approach that tackles
this issue by combining a contextual bandit objective with a self-supervision
objective. By augmenting contextual bandit learning with self-supervision, we
obtain higher cumulative reward. Our results on eight popular computer vision
datasets show substantial gains in cumulative reward. We also identify cases
where the proposed scheme does not perform optimally and give alternative
methods for better learning in those cases.
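As a concrete illustration of the combined objective, here is a minimal PyTorch sketch assuming a rotation-prediction pretext task and a shared encoder with separate reward and rotation heads. The architecture, the `lam` weighting coefficient, and all layer sizes are illustrative assumptions, not the authors' exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BanditWithSelfSupervision(nn.Module):
    """Shared encoder with two heads: reward estimation (the bandit
    objective) and rotation prediction (the self-supervision objective).
    Sizes assume 3x32x32 inputs and are illustrative only."""
    def __init__(self, feat_dim=256, n_arms=10, n_rotations=4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Flatten(), nn.Linear(3 * 32 * 32, feat_dim), nn.ReLU())
        self.reward_head = nn.Linear(feat_dim, n_arms)         # reward per arm
        self.rotation_head = nn.Linear(feat_dim, n_rotations)  # pretext classifier

    def forward(self, x):
        z = self.encoder(x)
        return self.reward_head(z), self.rotation_head(z)

def combined_loss(model, x, arm, reward, lam=0.5):
    """Bandit regression loss on the pulled arm plus a weighted
    rotation-prediction loss; lam trades off the two objectives."""
    pred_rewards, _ = model(x)
    picked = pred_rewards.gather(1, arm.unsqueeze(1)).squeeze(1)
    bandit_loss = F.mse_loss(picked, reward)
    # Self-supervision: rotate each image by a random multiple of 90 degrees
    # and ask the model to predict which rotation was applied.
    k = torch.randint(0, 4, (x.size(0),))
    x_rot = torch.stack([torch.rot90(img, int(r), dims=(1, 2))
                         for img, r in zip(x, k)])
    _, rot_logits = model(x_rot)
    ssl_loss = F.cross_entropy(rot_logits, k)
    return bandit_loss + lam * ssl_loss
```

The pretext labels `k` are generated from the data itself, which is what lets the self-supervision term provide training signal before many rewards have been observed.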
Related papers
- One-bit Supervision for Image Classification: Problem, Solution, and
Beyond [114.95815360508395]
This paper presents one-bit supervision, a novel setting of learning with fewer labels, for image classification.
We propose a multi-stage training paradigm and incorporate negative label suppression into an off-the-shelf semi-supervised learning algorithm.
In multiple benchmarks, the learning efficiency of the proposed approach surpasses that of full-bit, semi-supervised supervision.
arXiv Detail & Related papers (2023-11-26T07:39:00Z)
- From Weakly Supervised Learning to Active Learning [1.52292571922932]
This thesis is motivated by the question: can we derive a more generic framework than the one of supervised learning?
We model weak supervision as giving, rather than a unique target, a set of target candidates.
We argue that one should look for an "optimistic" function that matches most of the observations. This allows us to derive a principle for disambiguating partial labels.
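The optimistic principle can be sketched as an infimum loss: charge each example only the loss of its best-fitting candidate label. The log-softmax base loss and all names below are illustrative assumptions, not the thesis's exact formulation.

```python
import numpy as np

def infimum_loss(scores, candidate_sets):
    """Each example comes with a set of candidate labels; we keep only
    the loss of the cheapest (most 'optimistic') candidate."""
    losses = []
    for s, cands in zip(scores, candidate_sets):
        log_probs = s - np.log(np.exp(s).sum())           # log-softmax over classes
        losses.append(-max(log_probs[c] for c in cands))  # best-matching candidate
    return float(np.mean(losses))

# Two examples over 3 classes; the first has an ambiguous label set {0, 1}.
scores = np.array([[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]])
print(infimum_loss(scores, [{0, 1}, {2}]))
```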
arXiv Detail & Related papers (2022-09-23T14:55:43Z)
- New Intent Discovery with Pre-training and Contrastive Learning [21.25371293641141]
New intent discovery aims to uncover novel intent categories from user utterances to expand the set of supported intent classes.
Existing approaches typically rely on a large number of labeled utterances.
We propose a new contrastive loss to exploit self-supervisory signals in unlabeled data for clustering.
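The summary does not spell the loss out; below is a generic NT-Xent contrastive loss of the kind such methods build on, not the paper's exact formulation. Here `z1` and `z2` are assumed to be embeddings of two augmented views of the same batch of utterances, and `temperature` is an assumed hyperparameter.

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5):
    """Pull the two views of each example together and push apart all
    other examples in the batch (standard NT-Xent formulation)."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)  # (2B, d), unit norm
    sim = z @ z.t() / temperature                       # pairwise cosine similarities
    n = z.size(0)
    sim.fill_diagonal_(float('-inf'))                   # exclude self-similarity
    # The positive for sample i is its other view at index (i + B) mod 2B.
    targets = torch.arange(n, device=z.device).roll(n // 2)
    return F.cross_entropy(sim, targets)
```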
arXiv Detail & Related papers (2022-05-25T17:07:25Z)
- Using Self-Supervised Pretext Tasks for Active Learning [7.214674613451605]
We propose a novel active learning approach that utilizes self-supervised pretext tasks and a unique data sampler to select data that are both difficult and representative.
The pretext task learner is trained on the unlabeled set, and the unlabeled data are sorted and grouped into batches by their pretext task losses.
In each iteration, the main task model is used to sample the most uncertain data in a batch to be annotated.
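Read literally, that sampling scheme fits in a few lines; the sketch below is a simplification (it selects one point per difficulty batch in a single pass, whereas the method iterates), and all names are assumptions.

```python
import numpy as np

def select_for_annotation(pretext_losses, uncertainties, n_batches=10):
    """Sort unlabeled examples by pretext-task loss, split them into
    difficulty batches, then pick the example the main-task model is
    least sure about from each batch."""
    order = np.argsort(pretext_losses)          # easy -> hard by pretext loss
    batches = np.array_split(order, n_batches)  # difficulty-homogeneous groups
    return [int(idx[np.argmax(uncertainties[idx])]) for idx in batches]

# Toy example: 100 unlabeled examples with random losses/uncertainties.
rng = np.random.default_rng(0)
print(select_for_annotation(rng.random(100), rng.random(100)))
```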
arXiv Detail & Related papers (2022-01-19T07:58:06Z)
- MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning [65.52675802289775]
We show that an uncertainty-aware classifier can solve challenging reinforcement learning problems.
We propose a novel method for computing the normalized maximum likelihood (NML) distribution.
We show that the resulting algorithm has a number of intriguing connections to both count-based exploration methods and prior algorithms for learning reward functions.
arXiv Detail & Related papers (2021-07-15T08:19:57Z)
- Unsupervised Semantic Segmentation by Contrasting Object Mask Proposals [78.12377360145078]
We introduce a novel two-step framework that adopts a predetermined prior in a contrastive optimization objective to learn pixel embeddings.
This marks a large deviation from existing works that relied on proxy tasks or end-to-end clustering.
In particular, when fine-tuning the learned representations using just 1% of labeled examples on PASCAL, we outperform supervised ImageNet pre-training by 7.1% mIoU.
arXiv Detail & Related papers (2021-02-11T18:54:47Z)
- Diverse Complexity Measures for Dataset Curation in Self-driving [80.55417232642124]
We propose a new data selection method that exploits a diverse set of criteria that quantify the interestingness of traffic scenes.
Our experiments show that the proposed curation pipeline is able to select datasets that lead to better generalization and higher performance.
arXiv Detail & Related papers (2021-01-16T23:45:02Z)
- Contextual Bandit with Missing Rewards [27.066965426355257]
We consider a novel variant of the contextual bandit problem where the reward associated with each context-based decision may not always be observed.
This new problem is motivated by certain online settings including clinical trial and ad recommendation applications.
We propose to combine the standard contextual bandit approach with an unsupervised learning mechanism such as clustering.
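One plausible instantiation of combining the bandit with clustering (an assumption on my part, not the paper's stated algorithm) is to impute each unobserved reward with the mean observed reward of its context's cluster, so every round can still update the bandit:

```python
import numpy as np
from sklearn.cluster import KMeans

def impute_missing_rewards(contexts, rewards, observed, n_clusters=5, seed=0):
    """Cluster the contexts, then fill in each unobserved reward with the
    mean observed reward of its cluster (global mean as a fallback)."""
    labels = KMeans(n_clusters=n_clusters, random_state=seed,
                    n_init=10).fit_predict(contexts)
    imputed = rewards.copy()
    for c in range(n_clusters):
        mask = labels == c
        seen = mask & observed
        fill = rewards[seen].mean() if seen.any() else rewards[observed].mean()
        imputed[mask & ~observed] = fill
    return imputed
```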
arXiv Detail & Related papers (2020-07-13T13:29:51Z)
- How Useful is Self-Supervised Pretraining for Visual Tasks? [133.1984299177874]
We evaluate various self-supervised algorithms across a comprehensive array of synthetic datasets and downstream tasks.
Our experiments offer insights into how the utility of self-supervision changes as the number of available labels grows.
arXiv Detail & Related papers (2020-03-31T16:03:22Z)
- A survey of bias in Machine Learning through the prism of Statistical Parity for the Adult Data Set [5.277804553312449]
We show the importance of understanding how a bias can be introduced into automatic decisions.
We first present a mathematical framework for the fair learning problem, specifically in the binary classification setting.
We then propose to quantify the presence of bias by using the standard Disparate Impact index on the real and well-known Adult income data set.
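The Disparate Impact index itself is standard; a minimal computation on binary decisions with a binary protected attribute looks like this (toy data and encodings are assumptions):

```python
import numpy as np

def disparate_impact(y_pred, protected):
    """Ratio of positive-decision rates between the protected group
    (protected == 1) and the rest. Values near 1 indicate statistical
    parity; the common '80% rule' flags values below 0.8."""
    return y_pred[protected == 1].mean() / y_pred[protected == 0].mean()

# Toy check: 80% vs 100% positive rate gives DI = 0.8.
y = np.array([1, 1, 1, 1, 0, 1, 1, 1, 1, 1])
s = np.array([1, 1, 1, 1, 1, 0, 0, 0, 0, 0])
print(disparate_impact(y, s))  # 0.8
```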
arXiv Detail & Related papers (2020-03-31T14:48:36Z)
- Automatically Discovering and Learning New Visual Categories with Ranking Statistics [145.89790963544314]
We tackle the problem of discovering novel classes in an image collection given labelled examples of other classes.
We learn a general-purpose clustering model and use the latter to identify the new classes in the unlabelled data.
We evaluate our approach on standard classification benchmarks and outperform current methods for novel category discovery by a significant margin.
arXiv Detail & Related papers (2020-02-13T18:53:32Z)