Noise-Resistant Deep Metric Learning with Probabilistic Instance
Filtering
- URL: http://arxiv.org/abs/2108.01431v1
- Date: Tue, 3 Aug 2021 12:15:25 GMT
- Title: Noise-Resistant Deep Metric Learning with Probabilistic Instance
Filtering
- Authors: Chang Liu, Han Yu, Boyang Li, Zhiqi Shen, Zhanning Gao, Peiran Ren,
Xuansong Xie, Lizhen Cui, Chunyan Miao
- Abstract summary: Noisy labels are commonly found in real-world data, which cause performance degradation of deep neural networks.
We propose the Probabilistic Ranking-based Instance Selection with Memory (PRISM) approach for DML.
PRISM calculates the probability of a label being clean, and filters out potentially noisy samples.
- Score: 59.286567680389766
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Noisy labels are commonly found in real-world data, which cause performance
degradation of deep neural networks. Cleaning data manually is labour-intensive
and time-consuming. Previous research mostly focuses on enhancing
classification models against noisy labels, while the robustness of deep metric
learning (DML) against noisy labels remains less well-explored. In this paper,
we bridge this important gap by proposing the Probabilistic Ranking-based
Instance Selection with Memory (PRISM) approach for DML. PRISM calculates the
probability of a label being clean and filters out potentially noisy samples.
Specifically, we propose three methods to calculate this probability: 1)
Average Similarity Method (AvgSim), which calculates the average similarity
between potentially noisy data and clean data; 2) Proxy Similarity Method
(ProxySim), which replaces the centers maintained by AvgSim with the proxies
trained by a proxy-based method; and 3) von Mises-Fisher Distribution Similarity
(vMF-Sim), which estimates a von Mises-Fisher distribution for each data class.
With such a design, the proposed approach can deal with challenging DML
situations in which the majority of the samples are noisy. Extensive
experiments on both synthetic and real-world noisy datasets show that the
proposed approach achieves up to 8.37% higher Precision@1 than the
best-performing state-of-the-art baseline approaches, within reasonable
training time.
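The abstract's AvgSim and vMF-Sim scores can be made concrete with a small sketch. The Python snippet below is a minimal, hypothetical rendering of the two ideas: AvgSim turns each sample's average cosine similarity to every class's clean features into a per-class softmax and reads off the probability of its own label, while vMF-Sim fits a von Mises-Fisher distribution per class and scores per-class log-likelihoods instead. The function names, the softmax normalization, and the Banerjee et al. concentration approximation are illustrative assumptions, not the paper's exact formulation.

```python
# Hypothetical minimal sketch of AvgSim- and vMF-Sim-style clean-label scoring;
# names and normalization are assumptions, not the paper's exact formulation.
import numpy as np
from scipy.special import ive  # exponentially scaled modified Bessel function

def l2_normalize(x, eps=1e-12):
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def avgsim_clean_prob(feat, label, memory_feats, memory_labels):
    """P(label clean): softmax over average cosine similarity to each class."""
    feat, memory_feats = l2_normalize(feat), l2_normalize(memory_feats)
    classes = np.unique(memory_labels)
    sims = np.array([(memory_feats[memory_labels == c] @ feat).mean()
                     for c in classes])
    probs = np.exp(sims - sims.max())   # stable softmax over classes
    probs /= probs.sum()
    return float(probs[np.searchsorted(classes, label)])

def fit_vmf(unit_feats):
    """Mean direction and concentration via the Banerjee et al. approximation."""
    d = unit_feats.shape[1]
    mean = unit_feats.mean(axis=0)
    r = np.linalg.norm(mean)
    kappa = r * (d - r ** 2) / (1.0 - r ** 2)
    return mean / r, kappa

def vmf_log_pdf(x, mu, kappa, d):
    """vMF log-density; uses log I_v(k) = log(ive(v, k)) + k for stability."""
    v = d / 2.0 - 1.0
    log_norm = v * np.log(kappa) - (d / 2.0) * np.log(2.0 * np.pi)
    log_norm -= np.log(ive(v, kappa)) + kappa
    return log_norm + kappa * float(mu @ x)

def vmfsim_clean_prob(feat, label, memory_feats, memory_labels):
    """P(label clean): softmax over per-class vMF log-likelihoods."""
    feat, memory_feats = l2_normalize(feat), l2_normalize(memory_feats)
    d = memory_feats.shape[1]
    classes = np.unique(memory_labels)
    logps = np.array([vmf_log_pdf(feat, *fit_vmf(memory_feats[memory_labels == c]), d)
                      for c in classes])
    probs = np.exp(logps - logps.max())
    probs /= probs.sum()
    return float(probs[np.searchsorted(classes, label)])
```

Filtering then keeps a sample when its clean probability exceeds a chosen threshold (or a top fraction of the minibatch); ProxySim would follow the same AvgSim pattern with learned proxies in place of the stored class features.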
Related papers
- Extracting Clean and Balanced Subset for Noisy Long-tailed Classification [66.47809135771698]
We develop a novel pseudo labeling method using class prototypes from the perspective of distribution matching.
By setting a manually-specified probability measure, we can reduce the side effects of noisy and long-tailed data simultaneously.
Our method can extract this class-balanced subset with clean labels, which brings effective performance gains for long-tailed classification with label noise.
arXiv Detail & Related papers (2024-04-10T07:34:37Z)
- Federated Learning with Instance-Dependent Noisy Label [6.093214616626228]
FedBeat aims to build a global statistically consistent classifier using the instance-dependent noise (IDN) transition matrix (IDNTM).
Experiments conducted on CIFAR-10 and SVHN verify that the proposed method significantly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-12-16T05:08:02Z)
- MAPS: A Noise-Robust Progressive Learning Approach for Source-Free Domain Adaptive Keypoint Detection [76.97324120775475]
Cross-domain keypoint detection methods always require accessing the source data during adaptation.
This paper considers source-free domain adaptive keypoint detection, where only the well-trained source model is provided to the target domain.
arXiv Detail & Related papers (2023-02-09T12:06:08Z)
- Knockoffs-SPR: Clean Sample Selection in Learning with Noisy Labels [56.81761908354718]
We propose a novel theoretically guaranteed clean sample selection framework for learning with noisy labels.
Knockoffs-SPR can be regarded as a sample selection module for a standard supervised training pipeline.
We further combine it with a semi-supervised algorithm to exploit the support of noisy data as unlabeled data.
arXiv Detail & Related papers (2023-01-02T07:13:28Z)
- Instance-dependent Label Distribution Estimation for Learning with Label Noise [20.479674500893303]
Noise transition matrix (NTM) estimation is a promising approach for learning with label noise.
We propose an Instance-dependent Label Distribution Estimation (ILDE) method to learn from noisy labels for image classification.
Our results indicate that the proposed ILDE method outperforms all competing methods, whether the noise is synthetic or real.
arXiv Detail & Related papers (2022-12-16T10:13:25Z)
- Neighborhood Collective Estimation for Noisy Label Identification and Correction [92.20697827784426]
Learning with noisy labels (LNL) aims at designing strategies to improve model performance and generalization by mitigating the effects of model overfitting to noisy labels.
Recent advances employ the predicted label distributions of individual samples to perform noise verification and noisy label correction, easily giving rise to confirmation bias.
We propose Neighborhood Collective Estimation, in which the predictive reliability of a candidate sample is re-estimated by contrasting it against its feature-space nearest neighbors.
arXiv Detail & Related papers (2022-08-05T14:47:22Z)
- Noise-resistant Deep Metric Learning with Ranking-based Instance Selection [59.286567680389766]
We propose a noise-resistant training technique for DML, which we name Probabilistic Ranking-based Instance Selection with Memory (PRISM).
PRISM identifies noisy data in a minibatch using average similarity against image features extracted from several previous versions of the neural network.
To alleviate the high computational cost brought by the memory bank, we introduce an acceleration method that replaces individual data points with the class centers (a minimal sketch of this idea follows the list).
arXiv Detail & Related papers (2021-03-30T03:22:17Z)
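The last entry above points at PRISM's acceleration trick: scoring against class centers instead of against every feature in the memory bank. The following is a hypothetical minimal sketch of that idea with one momentum-updated center per class; the class name, update rule, and softmax scoring are assumptions for illustration, not the paper's exact procedure.

```python
# Illustrative sketch of the class-center acceleration: clean-label scores are
# computed against momentum-updated class centers instead of a full memory bank.
# The update rule and scoring below are assumptions, not the paper's procedure.
import numpy as np

class ClassCenterBank:
    def __init__(self, num_classes, dim, momentum=0.9):
        self.centers = np.zeros((num_classes, dim))
        self.momentum = momentum

    def update(self, feats, labels):
        """Exponential moving average of each class's mean batch feature."""
        for c in np.unique(labels):
            batch_mean = feats[labels == c].mean(axis=0)
            self.centers[c] = (self.momentum * self.centers[c]
                               + (1.0 - self.momentum) * batch_mean)

    def clean_prob(self, feat, label):
        """P(label clean): softmax over cosine similarity to each class center."""
        feat = feat / np.linalg.norm(feat)
        norms = np.linalg.norm(self.centers, axis=1, keepdims=True) + 1e-12
        sims = (self.centers / norms) @ feat
        probs = np.exp(sims - sims.max())
        probs /= probs.sum()
        return float(probs[label])
```

The per-sample scoring cost thus drops from one similarity per stored feature to one per class.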