Learning from Noisy Similar and Dissimilar Data
- URL: http://arxiv.org/abs/2002.00995v1
- Date: Mon, 3 Feb 2020 19:59:16 GMT
- Title: Learning from Noisy Similar and Dissimilar Data
- Authors: Soham Dan, Han Bao, Masashi Sugiyama
- Abstract summary: We show how to learn a classifier from noisy S and D labeled data.
We also show important connections between learning from such pairwise supervision data and learning from ordinary class-labeled data.
- Score: 84.76686918337134
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the widespread use of machine learning for classification, it becomes
increasingly important to be able to use weaker kinds of supervision for tasks
in which it is hard to obtain standard labeled data. One such kind of
supervision is provided pairwise, in the form of Similar (S) pairs (if two
examples belong to the same class) and Dissimilar (D) pairs (if two examples
belong to different classes). This kind of supervision is realistic in
privacy-sensitive domains. Although this problem has been looked at recently,
it is unclear how to learn from such supervision under label noise, which is
very common when the supervision is crowd-sourced. In this paper, we close this
gap and demonstrate how to learn a classifier from noisy S and D labeled data.
We perform a detailed investigation of this problem under two realistic noise
models and propose two algorithms to learn from noisy S-D data. We also show
important connections between learning from such pairwise supervision data and
learning from ordinary class-labeled data. Finally, we perform experiments on
synthetic and real world datasets and show our noise-informed algorithms
outperform noise-blind baselines in learning from noisy pairwise data.
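To make the setting concrete, below is a minimal sketch of learning from noisy S/D pairs. It assumes a symmetric noise model in which each pairwise label is flipped with a known probability RHO, and it uses the standard unbiased loss correction (Natarajan et al., 2013) applied to pairwise margins; this is an illustration of the general recipe, not necessarily the estimator proposed in the paper, and all names in the code are made up for the example.

```python
# Hypothetical sketch: a linear classifier learned from noisy Similar (+1) /
# Dissimilar (-1) pairs under a symmetric noise model (each pairwise label
# flips with probability RHO < 0.5). The corrected loss is the classic
# unbiased estimator applied to the pairwise margin m = f(x) * f(x');
# it is not necessarily the paper's algorithm.
import numpy as np

rng = np.random.default_rng(0)
RHO = 0.2  # assumed known flip probability

def sigmoid(m):
    return 0.5 * (1.0 + np.tanh(0.5 * m))  # numerically stable logistic

# Toy data: two Gaussian classes; 1000 pairs with noisy S/D labels.
n = 400
y = rng.choice([-1, 1], size=n)
X = y[:, None] * 2.0 + rng.normal(size=(n, 2))
i, j = rng.integers(0, n, 1000), rng.integers(0, n, 1000)
s = np.where(y[i] == y[j], 1.0, -1.0)        # clean: +1 similar, -1 dissimilar
s = np.where(rng.random(1000) < RHO, -s, s)  # observed noisy pairwise labels

# Gradient descent on the noise-corrected logistic loss of the margin
# m = (x1.w)(x2.w), which is positive when both points land on the same
# side of the hyperplane.
w = rng.normal(size=2) * 0.1
for _ in range(300):
    a, b = X[i] @ w, X[j] @ w
    m = a * b
    # d/dm of [(1-RHO)*l(s*m) - RHO*l(-s*m)] / (1-2*RHO), l = logistic loss
    dldm = -s * ((1 - RHO) * sigmoid(-s * m) + RHO * sigmoid(s * m)) / (1 - 2 * RHO)
    grad = ((dldm * b)[:, None] * X[i] + (dldm * a)[:, None] * X[j]).mean(axis=0)
    w -= 0.05 * grad

# Pairwise supervision identifies the classes only up to a global swap.
preds = np.sign(X @ w)
print(max(np.mean(preds == y), np.mean(preds == -y)))
```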
Related papers
- ROG$_{PL}$: Robust Open-Set Graph Learning via Region-Based Prototype Learning [52.60434474638983]
We propose a unified framework named ROG$_PL$ to achieve robust open-set learning on complex noisy graph data.
The framework consists of two modules, i.e., denoising via label propagation and open-set prototype learning via regions.
To the best of our knowledge, the proposed ROG$_PL$ is the first robust open-set node classification method for graph data with complex noise.
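The denoising module is only named in the abstract; as a rough stand-in for the idea, the sketch below runs textbook label propagation (Zhu and Ghahramani, 2002) on a toy graph and flags nodes whose observed label disagrees with the propagated one. This is a generic illustration, not ROG$_{PL}$'s actual procedure.

```python
# Generic label-propagation denoising on a toy graph (illustrative only).
# adj is a symmetric adjacency matrix; noisy labels are given per node.
import numpy as np

def propagate(adj, noisy_onehot, alpha=0.9, iters=50):
    deg = adj.sum(axis=1)
    s = adj / np.sqrt(np.outer(deg, deg))          # D^-1/2 A D^-1/2
    f = noisy_onehot.astype(float).copy()
    for _ in range(iters):
        f = alpha * (s @ f) + (1 - alpha) * noisy_onehot
    return f

adj = np.array([[0, 1, 1, 0],
                [1, 0, 1, 0],
                [1, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
labels = np.array([0, 1, 0, 0])                    # node 1 disagrees with its neighborhood
scores = propagate(adj, np.eye(2)[labels])
suspect = scores.argmax(axis=1) != labels          # candidates for relabeling
print(suspect)
```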
arXiv Detail & Related papers (2024-02-28T17:25:06Z)
- Multiclass Learning from Noisy Labels for Non-decomposable Performance Measures [15.358504449550013]
We design algorithms to learn from noisy labels for two broad classes of non-decomposable performance measures.
In both cases, we develop noise-corrected versions of the algorithms under the widely studied class-conditional noise models.
Our experiments demonstrate the effectiveness of our algorithms in handling label noise.
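For context on the noise model this entry refers to: class-conditional noise assumes the observed label depends on the clean one only through a row-stochastic transition matrix T. The sketch below shows the widely used backward loss correction under such a model (Patrini et al., 2017); it illustrates the noise model only, not the paper's algorithms for non-decomposable measures, which are more involved.

```python
# Class-conditional noise: P(noisy = j | clean = i) = T[i, j], with T known
# or estimated. Backward correction re-weights per-class losses by T^-1 so
# that the expectation over label noise equals the clean-label loss.
# Generic illustration; T below is an assumed example matrix.
import numpy as np

T = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])
T_inv = np.linalg.inv(T)

def backward_corrected_ce(log_probs, noisy_label):
    per_class = -log_probs                 # cross-entropy for each candidate clean class
    return T_inv[noisy_label] @ per_class  # E_noise[corrected] = clean-label loss

log_probs = np.log(np.array([0.7, 0.2, 0.1]))  # model's predicted class probabilities
print(backward_corrected_ce(log_probs, noisy_label=0))
```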
arXiv Detail & Related papers (2024-02-01T23:03:53Z)
- Double Descent and Overfitting under Noisy Inputs and Distribution Shift for Linear Denoisers [3.481985817302898]
A concern about studying supervised denoising is that one might not always have noiseless training data from the test distribution.
Motivated by this, we study supervised denoising and noisy-input regression under distribution shift.
arXiv Detail & Related papers (2023-05-26T22:41:40Z)
- Learning with Neighbor Consistency for Noisy Labels [69.83857578836769]
We present a method for learning from noisy labels that leverages similarities between training examples in feature space.
We evaluate our method on datasets exhibiting both synthetic (CIFAR-10, CIFAR-100) and realistic (mini-WebVision, Clothing1M, mini-ImageNet-Red) noise.
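As a rough sketch of the idea, assuming per-example features and class-probability predictions are available as arrays, the term below pulls each example's predicted distribution toward the mean prediction of its nearest feature-space neighbors; the paper's exact formulation may differ.

```python
# Generic neighbor-consistency term (illustrative, not the paper's exact
# loss): each prediction is encouraged to match the mean prediction of its
# k nearest neighbors under cosine similarity in feature space.
import numpy as np

def neighbor_consistency(features, probs, k=5, eps=1e-12):
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = f @ f.T                                   # cosine similarities
    np.fill_diagonal(sim, -np.inf)                  # exclude self-matches
    total = 0.0
    for idx in range(len(probs)):
        nbrs = np.argpartition(-sim[idx], k)[:k]    # k nearest neighbors
        target = probs[nbrs].mean(axis=0)           # neighbors' mean prediction
        total += -(target * np.log(probs[idx] + eps)).sum()  # CE to target
    return total / len(probs)
```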
arXiv Detail & Related papers (2022-02-04T15:46:27Z)
- Learning with Noisy Labels Revisited: A Study Using Real-World Human Annotations [54.400167806154535]
Existing research on learning with noisy labels mainly focuses on synthetic label noise.
This work presents two new benchmark datasets (CIFAR-10N, CIFAR-100N).
We show that real-world noisy labels follow an instance-dependent pattern rather than the classically adopted class-dependent ones.
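To make the distinction concrete, the hypothetical simulation below contrasts the two noise patterns: class-conditional noise flips labels at a rate depending only on the class, while instance-dependent noise flips borderline examples more often, which is closer to how human annotators err.

```python
# Illustration with made-up data: class-conditional vs. instance-dependent
# label noise on a simple threshold problem.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))
y = (X[:, 0] > 0).astype(int)

# Class-conditional: every class-1 example flips with the same probability.
flip_cc = (y == 1) & (rng.random(1000) < 0.3)

# Instance-dependent: flip probability grows as |x_0| shrinks, so
# ambiguous examples near the decision boundary are mislabeled more often.
p_flip = 0.5 * np.exp(-np.abs(X[:, 0]))
flip_id = rng.random(1000) < p_flip

y_cc = np.where(flip_cc, 1 - y, y)
y_id = np.where(flip_id, 1 - y, y)
```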
arXiv Detail & Related papers (2021-10-22T22:42:11Z)
- Learning From Long-Tailed Data With Noisy Labels [0.0]
Class imbalance and noisy labels are the norm in many large-scale classification datasets.
We present a simple two-stage approach based on recent advances in self-supervised learning.
We find that self-supervised learning approaches can cope effectively with severe class imbalance.
arXiv Detail & Related papers (2021-08-25T07:45:40Z)
- Class2Simi: A Noise Reduction Perspective on Learning with Noisy Labels [98.13491369929798]
We propose a framework called Class2Simi, which transforms data points with noisy class labels to data pairs with noisy similarity labels.
Class2Simi is computationally efficient: the transformation is performed on the fly within mini-batches, and it only changes the loss on top of the model prediction into a pairwise form.
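A minimal sketch of the transformation for one mini-batch, with illustrative names (the paper pairs this with a dedicated pairwise loss on the model's predictions):

```python
# Class2Simi-style transformation: noisy class labels in a mini-batch
# become noisy similarity labels by comparing every pair of labels.
import numpy as np

def class_to_simi(labels):
    # Pairwise similarity matrix: 1 if the (noisy) class labels agree.
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(float)

batch_labels = [2, 0, 2, 1]          # noisy class labels in a mini-batch
simi = class_to_simi(batch_labels)   # noisy similarity labels, on the fly
print(simi)
```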
arXiv Detail & Related papers (2020-06-14T07:55:32Z)
- Learning with Out-of-Distribution Data for Audio Classification [60.48251022280506]
We show that detecting and relabelling certain OOD instances, rather than discarding them, can have a positive effect on learning.
The proposed method is shown to improve the performance of convolutional neural networks by a significant margin.
arXiv Detail & Related papers (2020-02-11T21:08:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.