Noise in Classification
- URL: http://arxiv.org/abs/2010.05080v2
- Date: Fri, 13 Nov 2020 15:42:05 GMT
- Title: Noise in Classification
- Authors: Maria-Florina Balcan, Nika Haghtalab
- Abstract summary: This chapter considers the computational and statistical aspects of learning linear thresholds in the presence of noise.
We discuss approaches for dealing with these negative results by exploiting natural assumptions on the data-generating process.
- Score: 32.458986097202626
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This chapter considers the computational and statistical aspects of
learning linear thresholds in the presence of noise. When there is no noise,
several algorithms exist that efficiently learn near-optimal linear thresholds
using a small amount of data. However, even a small amount of adversarial noise
makes this problem notoriously hard in the worst case. We discuss approaches for
dealing with these negative results by exploiting natural assumptions on the
data-generating process.
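To make the benign case concrete, here is a minimal sketch (illustrative, not from the chapter) of learning a linear threshold under random classification noise using a convex surrogate; the synthetic data, noise rate, and optimizer settings are assumptions. Under adversarial noise, as the chapter explains, surrogate-based approaches like this can fail in the worst case.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic halfspace data: true labels are sign(<w*, x>).
n, d = 2000, 5
w_star = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = np.sign(X @ w_star)

# Random classification noise: flip each label independently with rate eta.
eta = 0.1
y_noisy = np.where(rng.random(n) < eta, -y, y)

# Learn a linear threshold by gradient descent on the logistic surrogate.
w = np.zeros(d)
for _ in range(500):
    margins = y_noisy * (X @ w)
    sig = 1.0 / (1.0 + np.exp(np.clip(margins, -500, 500)))  # sigma(-margin)
    w += 0.5 * (X * (y_noisy * sig)[:, None]).mean(axis=0)

print(f"accuracy on the noiseless labels: {np.mean(np.sign(X @ w) == y):.3f}")
```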
Related papers
- Improving Noise Robustness through Abstractions and its Impact on Machine Learning [2.6563873893593826]
Noise is a fundamental problem in learning theory, with major effects on the application of Machine Learning (ML) methods.
In this paper, we propose a method to deal with noise: mitigating its effect through the use of data abstractions.
The goal is to reduce the effect of noise on the model's performance through the loss of information produced by the abstraction.
arXiv Detail & Related papers (2024-06-12T17:14:44Z)
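As one concrete (hypothetical) instance of such an abstraction, the sketch below discretizes features into equal-width bins; the bin count and data are illustrative choices, not the paper's construction. Perturbations smaller than a bin width are absorbed by the abstraction, at the cost of resolution.

```python
import numpy as np

def bin_indices(X, lo, hi, n_bins=8):
    """Map each feature value to an equal-width bin index (the abstraction)."""
    width = (hi - lo) / n_bins
    return np.clip(((X - lo) / width).astype(int), 0, n_bins - 1)

rng = np.random.default_rng(1)
X_clean = rng.normal(size=(1000, 3))
X_noisy = X_clean + rng.normal(scale=0.05, size=X_clean.shape)  # small additive noise

# Shared bin edges so the two views are comparable.
lo = np.minimum(X_clean.min(axis=0), X_noisy.min(axis=0))
hi = np.maximum(X_clean.max(axis=0), X_noisy.max(axis=0))

# Most perturbations stay inside a bin, so the abstraction removes them.
same = np.mean(bin_indices(X_noisy, lo, hi) == bin_indices(X_clean, lo, hi))
print(f"abstracted values unchanged by the noise: {same:.1%}")
```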
- Denoising-Aware Contrastive Learning for Noisy Time Series [35.97130925600067]
Time series self-supervised learning (SSL) aims to exploit unlabeled data for pre-training to mitigate the reliance on labels.
We propose denoising-aware contrastive learning (DECL), which mitigates noise in the representation and automatically selects a suitable denoising method for every sample.
arXiv Detail & Related papers (2024-06-07T04:27:32Z)
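A rough sketch of the flavor of per-sample denoiser selection, with hand-rolled candidate filters and a roughness-plus-fidelity score standing in for DECL's learned contrastive criterion (the selection rule here is an assumption, not the paper's method):

```python
import numpy as np

def moving_average(x, k=5):
    return np.convolve(x, np.ones(k) / k, mode="same")

def median_filter(x, k=5):
    pad = k // 2
    xp = np.pad(x, pad, mode="edge")
    return np.array([np.median(xp[i:i + k]) for i in range(len(x))])

def select_denoiser(x, candidates):
    """Pick, per sample, the candidate whose output is smoothest while
    staying close to the input (a stand-in for a learned selection rule)."""
    def score(y):
        roughness = np.mean(np.diff(y) ** 2)
        fidelity = np.mean((y - x) ** 2)
        return roughness + fidelity
    return min(candidates, key=lambda f: score(f(x)))

rng = np.random.default_rng(2)
t = np.linspace(0, 4 * np.pi, 200)
gaussian = np.sin(t) + rng.normal(scale=0.3, size=t.size)     # Gaussian noise
impulsive = np.sin(t) + (rng.random(t.size) < 0.05) * 3.0     # impulsive noise

for series in (gaussian, impulsive):
    print(select_denoiser(series, [moving_average, median_filter]).__name__)
```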
- Understanding the Effect of Noise in LLM Training Data with Algorithmic Chains of Thought [0.0]
We study how noise in the chain of thought impacts task performance in a highly controlled setting.
We define two types of noise: static noise, a local form of noise which is applied after the CoT trace is computed, and dynamic noise, a global form of noise which propagates errors in the trace as it is computed.
We find fine-tuned models are extremely robust to high levels of static noise but struggle significantly more with lower levels of dynamic noise.
arXiv Detail & Related papers (2024-02-06T13:59:56Z)
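The distinction can be made concrete with a toy running-sum "trace" (an illustrative stand-in for a CoT trace; the noise model below is hypothetical): static noise perturbs the written steps after the computation, while dynamic noise corrupts the running state, so its errors compound.

```python
import numpy as np

rng = np.random.default_rng(3)

def chain_sum(xs, noise=None):
    """Compute a running-sum 'trace' over xs, optionally corrupted."""
    trace, total = [], 0
    for x in xs:
        total += x
        if noise == "dynamic":            # corrupt the state itself:
            total += rng.integers(-1, 2)  # the error propagates downstream
        trace.append(total)
    if noise == "static":                 # corrupt the written trace only,
        trace = [t + rng.integers(-1, 2) for t in trace]  # after computing it
    return trace

xs = list(range(1, 11))
print("clean:  ", chain_sum(xs))
print("static: ", chain_sum(xs, "static"))    # local, independent errors
print("dynamic:", chain_sum(xs, "dynamic"))   # errors compound along the trace
```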
- Multiclass Learning from Noisy Labels for Non-decomposable Performance Measures [15.358504449550013]
We design algorithms to learn from noisy labels for two broad classes of non-decomposable performance measures.
In both cases, we develop noise-corrected versions of the algorithms under the widely studied class-conditional noise models.
Our experiments demonstrate the effectiveness of our algorithms in handling label noise.
arXiv Detail & Related papers (2024-02-01T23:03:53Z)
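For decomposable losses, the textbook recipe under class-conditional noise is backward loss correction: with a known or estimated transition matrix T, the corrected loss T^{-1} * loss is unbiased for the clean loss. The sketch below shows only this generic recipe, not the paper's extension to non-decomposable measures; the example values are hypothetical.

```python
import numpy as np

def backward_corrected_loss(probs, noisy_labels, T):
    """Backward correction: unbiased clean-loss estimate from noisy labels.

    T[i, j] = P(noisy label = j | true label = i), assumed known/estimated.
    """
    per_class = -np.log(probs + 1e-12)          # cross-entropy per class, shape (n, k)
    corrected = per_class @ np.linalg.inv(T).T  # corrected[i, j] = sum_c T_inv[j, c] * per_class[i, c]
    return corrected[np.arange(len(noisy_labels)), noisy_labels].mean()

# Hypothetical 2-class example with 10%/20% flip rates.
T = np.array([[0.9, 0.1],
              [0.2, 0.8]])
probs = np.array([[0.7, 0.3],
                  [0.4, 0.6]])
print(backward_corrected_loss(probs, np.array([0, 1]), T))
```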
- Latent Class-Conditional Noise Model [54.56899309997246]
We introduce a Latent Class-Conditional Noise model (LCCN) to parameterize the noise transition under a Bayesian framework.
We then deduce a dynamic label regression method for LCCN, whose Gibbs sampler allows us to efficiently infer the latent true labels.
Our approach safeguards the stable update of the noise transition, avoiding the arbitrary tuning from a mini-batch of samples used in previous work.
arXiv Detail & Related papers (2023-02-19T15:24:37Z)
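The core Bayes step behind such a Gibbs sampler can be sketched as follows; the priors over the transition matrix and its resampling, which LCCN also handles, are omitted, and the shapes below are assumptions.

```python
import numpy as np

def sample_true_labels(probs, noisy_labels, T, rng):
    """One Gibbs-style step: draw latent true labels from their posterior.

    P(z = k | x, noisy y) is proportional to p_model(k | x) * T[k, y], where
    T[k, j] = P(noisy = j | true = k). probs has shape (n, k); this is only
    the core Bayes step, not the full LCCN sampler.
    """
    post = probs * T[:, noisy_labels].T          # shape (n, k), unnormalized
    post /= post.sum(axis=1, keepdims=True)
    cum = post.cumsum(axis=1)
    u = rng.random((len(noisy_labels), 1))
    return (u < cum).argmax(axis=1)              # inverse-CDF sampling per row
```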
- Optimizing the Noise in Self-Supervised Learning: from Importance Sampling to Noise-Contrastive Estimation [80.07065346699005]
It is widely assumed that the optimal noise distribution should be made equal to the data distribution, as in Generative Adversarial Networks (GANs).
We turn to Noise-Contrastive Estimation, which grounds this self-supervised task as an estimation problem of an energy-based model of the data.
We soberly conclude that the optimal noise may be hard to sample from, and the gain in efficiency can be modest compared to choosing the noise distribution equal to the data's.
arXiv Detail & Related papers (2023-01-23T19:57:58Z)
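This entry (and the one two entries below) revolves around the binary NCE objective: fit a model by logistically classifying data against noise samples, using the log-density ratio as the logit. A minimal one-parameter sketch, with toy Gaussians and a grid search in place of gradient descent; note the noise distribution is deliberately not the data distribution, echoing the papers' point.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)

# Data from N(2, 1); estimate its mean with NCE, using N(0, 2) as noise.
x_data = rng.normal(2.0, 1.0, size=1000)
x_noise = rng.normal(0.0, 2.0, size=1000)

def nce_loss(mu):
    # Model: N(mu, 1); logit G(x) = log p_model(x) - log p_noise(x).
    def G(x):
        return norm.logpdf(x, mu, 1.0) - norm.logpdf(x, 0.0, 2.0)
    # Logistic loss for the "data vs noise" classification problem.
    return (np.log1p(np.exp(-G(x_data))).mean()
            + np.log1p(np.exp(G(x_noise))).mean())

grid = np.linspace(0, 4, 81)
mu_hat = grid[np.argmin([nce_loss(m) for m in grid])]
print(f"NCE estimate of the mean: {mu_hat:.2f}")  # expected to be close to 2.0
```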
- Identifying Hard Noise in Long-Tailed Sample Distribution [76.16113794808001]
We introduce Noisy Long-Tailed Classification (NLT).
Most de-noising methods fail to identify the hard noise.
We design an iterative noisy learning framework called Hard-to-Easy (H2E).
arXiv Detail & Related papers (2022-07-27T09:03:03Z)
- The Optimal Noise in Noise-Contrastive Learning Is Not What You Think [80.07065346699005]
We show that deviating from the assumption that the noise distribution should equal the data distribution can actually lead to better statistical estimators.
In particular, the optimal noise distribution differs from the data distribution and can even come from a different family.
arXiv Detail & Related papers (2022-03-02T13:59:20Z)
- Learning based signal detection for MIMO systems with unknown noise statistics [84.02122699723536]
This paper aims to devise a generalized maximum likelihood (ML) estimator to robustly detect signals with unknown noise statistics.
In practice, there is little or even no statistical knowledge of the system noise, which in many cases is non-Gaussian, impulsive, and not analyzable.
Our framework is driven by an unsupervised learning approach, where only the noise samples are required.
arXiv Detail & Related papers (2021-01-21T04:48:15Z)
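One plausible reading of this pipeline (an assumption, not the paper's exact estimator): fit a density to noise-only samples, then detect by maximizing the learned likelihood of the residual over the constellation. The sketch uses a Gaussian mixture for the impulsive noise and exhaustive search over BPSK symbols.

```python
import numpy as np
from itertools import product
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(5)
n_tx, n_rx = 2, 4
H = rng.normal(size=(n_rx, n_tx))  # known channel matrix (toy)

# Impulsive (non-Gaussian) noise: mostly small, occasionally large.
def draw_noise(n):
    scale = np.where(rng.random((n, n_rx)) < 0.1, 3.0, 0.3)
    return rng.normal(scale=scale)

# Unsupervised step: fit a density to noise-only samples.
gmm = GaussianMixture(n_components=2, random_state=0).fit(draw_noise(2000))

# Detection: generalized ML over the BPSK constellation {-1, +1}^n_tx.
def detect(y):
    best, best_ll = None, -np.inf
    for s in product([-1.0, 1.0], repeat=n_tx):
        x = np.array(s)
        ll = gmm.score_samples((y - H @ x)[None, :])[0]
        if ll > best_ll:
            best, best_ll = x, ll
    return best

x_true = np.array([1.0, -1.0])
y = H @ x_true + draw_noise(1)[0]
print(detect(y))  # usually recovers x_true
```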
- Contextual Linear Bandits under Noisy Features: Towards Bayesian Oracles [61.247089049339664]
We study contextual linear bandit problems under feature uncertainty.
The optimal hypothesis can be far from the underlying realizable function, depending on the noise characteristics.
We propose an algorithm that aims at the Bayesian oracle based on the observed information.
arXiv Detail & Related papers (2017-03-03T21:39:56Z)
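A toy illustration of why a Bayesian oracle helps under feature noise (the Gaussian prior, the noise model, and the per-arm noise levels are all assumptions for illustration): with heteroscedastic noise, the posterior mean shrinks noisier observations harder, which can change which arm looks best compared to naively plugging in the raw noisy features.

```python
import numpy as np

rng = np.random.default_rng(6)
d, sigma_x = 5, 1.0
theta = rng.normal(size=d)  # reward parameter (known here, for illustration only)

# Ten arms; each arm's features are observed with its own noise level.
arms_true = rng.normal(scale=sigma_x, size=(10, d))
noise_sd = rng.uniform(0.1, 2.0, size=10)
arms_obs = arms_true + rng.normal(size=(10, d)) * noise_sd[:, None]

# Bayesian oracle: E[theta . x | x_obs] = theta . E[x | x_obs]; under a
# Gaussian prior, the posterior mean shrinks noisier observations harder.
shrink = sigma_x**2 / (sigma_x**2 + noise_sd**2)
oracle_scores = (arms_obs * shrink[:, None]) @ theta
naive_scores = arms_obs @ theta

print("oracle picks arm", oracle_scores.argmax(),
      "| naive plug-in picks arm", naive_scores.argmax())
```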
This list is automatically generated from the titles and abstracts of the papers on this site.