Learning Multiclass Classifier Under Noisy Bandit Feedback
- URL: http://arxiv.org/abs/2006.03545v2
- Date: Wed, 3 Mar 2021 16:56:12 GMT
- Title: Learning Multiclass Classifier Under Noisy Bandit Feedback
- Authors: Mudit Agarwal and Naresh Manwani
- Abstract summary: We propose a novel approach to deal with noisy bandit feedback based on the unbiased estimator technique.
We show our approach's effectiveness using extensive experiments on several benchmark datasets.
- Score: 6.624726878647541
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper addresses the problem of multiclass classification with corrupted
or noisy bandit feedback. In this setting, the learner may not receive true
feedback. Instead, it receives feedback that has been flipped with some
non-zero probability. We propose a novel approach to deal with noisy bandit
feedback based on the unbiased estimator technique. We further offer a method
that can efficiently estimate the noise rates, thus providing an end-to-end
framework. The proposed algorithm enjoys a mistake bound of the order of
$O(\sqrt{T})$ in the high noise case and of the order of
$O(T^{2/3})$ in the worst case. We show our approach's
effectiveness using extensive experiments on several benchmark datasets.
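The abstract names the unbiased-estimator technique without spelling it out. Below is a minimal sketch of the standard construction for binary feedback flipped with known class-conditional rates; the function name and the binary simplification are illustrative assumptions, as the paper itself handles the multiclass bandit setting.

```python
def unbiased_loss(loss, y_tilde, rho_plus, rho_minus):
    """Unbiased surrogate loss under class-conditional label flipping.

    loss      : dict {+1: loss if feedback is +1, -1: loss if feedback is -1}
    y_tilde   : the observed (possibly flipped) feedback, +1 or -1
    rho_plus  : P(flip | true feedback = +1)
    rho_minus : P(flip | true feedback = -1)

    Taking the expectation over the noise recovers the loss on the
    clean feedback, which is what makes the estimator unbiased.
    """
    denom = 1.0 - rho_plus - rho_minus  # requires rho_plus + rho_minus < 1
    rho = {1: rho_plus, -1: rho_minus}
    return ((1.0 - rho[-y_tilde]) * loss[y_tilde]
            - rho[y_tilde] * loss[-y_tilde]) / denom
```

Unbiasedness can be checked directly: weighting the surrogate by the flip probabilities reproduces the clean loss exactly, with no sampling needed.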
Related papers
- Trusted Multi-view Learning with Label Noise [17.458306450909316]
Multi-view learning methods often focus on improving decision accuracy while neglecting the decision uncertainty.
We propose a trusted multi-view noise refining method to solve this problem.
We empirically compare TMNR with state-of-the-art trusted multi-view learning and label noise learning baselines on 5 publicly available datasets.
arXiv Detail & Related papers (2024-04-18T06:47:30Z) - Nearly Optimal Algorithms for Contextual Dueling Bandits from Adversarial Feedback [58.66941279460248]
Learning from human feedback plays an important role in aligning generative models such as large language models (LLMs).
We study a model within this problem domain--contextual dueling bandits with adversarial feedback, where the true preference label can be flipped by an adversary.
We propose a robust contextual dueling bandit algorithm based on uncertainty-weighted maximum likelihood estimation.
arXiv Detail & Related papers (2024-04-16T17:59:55Z) - Noisy Pair Corrector for Dense Retrieval [59.312376423104055]
We propose a novel approach called Noisy Pair Corrector (NPC).
NPC consists of a detection module and a correction module.
We conduct experiments on the text-retrieval benchmarks Natural Questions and TriviaQA, and the code-search benchmarks StaQC and SO-DS.
arXiv Detail & Related papers (2023-11-07T08:27:14Z) - Label Noise: Correcting the Forward-Correction [0.0]
Training neural network classifiers on datasets with label noise poses a risk of overfitting them to the noisy labels.
We propose an approach to tackling overfitting caused by label noise.
Specifically, we propose imposing a lower bound on the training loss to mitigate overfitting.
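The summary does not say how the lower bound is imposed. One known instance of the idea is the "flooding" trick, sketched below as an assumption about the mechanism; the paper's exact formulation may differ.

```python
def flooded_loss(loss, flood_level):
    """Keep training loss from falling below flood_level.

    Equivalent to |loss - b| + b: once the loss drops under the flood
    level b, the gradient reverses sign and pushes it back up, which
    discourages memorizing noisy labels while leaving high-loss
    examples unaffected.
    """
    return abs(loss - flood_level) + flood_level
```

Above the flood level the loss is unchanged; below it, the loss is reflected upward, so training can never drive it to zero on noisy labels.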
arXiv Detail & Related papers (2023-07-24T19:41:19Z) - RoLNiP: Robust Learning Using Noisy Pairwise Comparisons [6.624726878647541]
This paper presents a robust approach for learning from noisy pairwise comparisons.
We experimentally show that the proposed approach RoLNiP outperforms the robust state-of-the-art methods for learning with noisy pairwise comparisons.
arXiv Detail & Related papers (2023-03-04T06:28:08Z) - UNICON: Combating Label Noise Through Uniform Selection and Contrastive
Learning [89.56465237941013]
We propose UNICON, a simple yet effective sample selection method which is robust to high label noise.
We obtain an 11.4% improvement over the current state-of-the-art on CIFAR100 dataset with a 90% noise rate.
arXiv Detail & Related papers (2022-03-28T07:36:36Z) - Breaking the Moments Condition Barrier: No-Regret Algorithm for Bandits
with Super Heavy-Tailed Payoffs [27.636407641546914]
We propose a novel robust statistical estimator, mean of medians, which estimates a random variable by computing the empirical mean of a sequence of empirical medians.
We show that the regret bound is near-optimal even with very heavy-tailed noise.
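The mean-of-medians estimator is described in one sentence; a minimal sketch matching that description follows, with the group size left as a caller-chosen parameter (the paper's choice of grouping is not given here).

```python
import numpy as np

def mean_of_medians(samples, group_size):
    """Robust mean estimate for heavy-tailed samples.

    Split the samples into consecutive groups of group_size, take the
    empirical median of each group (robust to outliers), then average
    the medians. Trailing samples that do not fill a group are dropped.
    """
    n = (len(samples) // group_size) * group_size
    groups = np.asarray(samples[:n], dtype=float).reshape(-1, group_size)
    return float(np.mean(np.median(groups, axis=1)))
```

The per-group median suppresses extreme payoffs that would dominate a plain empirical mean, while averaging the medians reduces variance across groups.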
arXiv Detail & Related papers (2021-10-26T17:30:44Z) - Learning Noise Transition Matrix from Only Noisy Labels via Total
Variation Regularization [88.91872713134342]
We propose a theoretically grounded method that can estimate the noise transition matrix and learn a classifier simultaneously.
We show the effectiveness of the proposed method through experiments on benchmark and real-world datasets.
arXiv Detail & Related papers (2021-02-04T05:09:18Z) - Partial Bandit and Semi-Bandit: Making the Most Out of Scarce Users'
Feedback [62.997667081978825]
We present a novel approach for considering user feedback and evaluate it using three distinct strategies.
Despite the limited amount of feedback returned by users (as low as 20% of the total), our approach obtains results similar to those of state-of-the-art approaches.
arXiv Detail & Related papers (2020-09-16T07:32:51Z) - Multi-label Contrastive Predictive Coding [125.03510235962095]
Variational mutual information (MI) estimators are widely used in unsupervised representation learning methods such as contrastive predictive coding (CPC).
We introduce a novel estimator based on a multi-label classification problem, where the critic needs to jointly identify multiple positive samples at the same time.
We show that with the same number of negative samples, multi-label CPC is able to exceed the $\log m$ bound while still being a valid lower bound on mutual information.
arXiv Detail & Related papers (2020-07-20T02:46:21Z)