Self-refining of Pseudo Labels for Music Source Separation with Noisy Labeled Data
- URL: http://arxiv.org/abs/2307.12576v1
- Date: Mon, 24 Jul 2023 07:47:21 GMT
- Title: Self-refining of Pseudo Labels for Music Source Separation with Noisy Labeled Data
- Authors: Junghyun Koo, Yunkee Chae, Chang-Bin Jeon, Kyogu Lee
- Abstract summary: Music source separation (MSS) faces challenges due to the limited availability of correctly-labeled individual instrument tracks.
This paper introduces an automated technique for refining the labels in a partially mislabeled dataset.
Our proposed self-refining technique, employed with a noisy-labeled dataset, results in only a 1% accuracy degradation in multi-label instrument recognition.
- Score: 15.275949700129797
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Music source separation (MSS) faces challenges due to the limited
availability of correctly-labeled individual instrument tracks. With the push
to acquire larger datasets to improve MSS performance, the inevitability of
encountering mislabeled individual instrument tracks becomes a significant
challenge to address. This paper introduces an automated technique for refining
the labels in a partially mislabeled dataset. Our proposed self-refining
technique, employed with a noisy-labeled dataset, results in only a 1% accuracy
degradation in multi-label instrument recognition compared to a classifier
trained on a clean-labeled dataset. The study demonstrates the importance of
refining noisy-labeled data in MSS model training and shows that utilizing the
refined dataset yields results comparable to those obtained from a clean-labeled
dataset. Notably, when only a noisy dataset is accessible, MSS models trained on a
self-refined dataset even outperform those trained on a dataset refined with a
classifier trained on clean labels.
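The abstract outlines, but does not detail, the refinement loop; a minimal sketch of the general self-refining idea is given below: train a multi-label instrument classifier on the noisy labels, overwrite only the labels it predicts with high confidence, and repeat. The embedding features, confidence threshold, round count, and scikit-learn classifier are illustrative assumptions, not the authors' exact recipe.

```python
# A minimal sketch of iterative pseudo-label self-refinement for a partially
# mislabeled multi-label dataset (illustrative assumptions, not the paper's recipe).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

def self_refine_labels(features, noisy_labels, n_rounds=3, confidence=0.9):
    """Iteratively retrain a multi-label classifier and overwrite confident labels.

    features     : (n_tracks, n_features) per-track audio embeddings
    noisy_labels : (n_tracks, n_instruments) binary matrix, possibly mislabeled
    """
    labels = noisy_labels.copy()
    for _ in range(n_rounds):
        clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
        clf.fit(features, labels)                      # train on current labels
        probs = clf.predict_proba(features)            # (n_tracks, n_instruments)
        # Overwrite only labels the classifier is confident about; keep the rest.
        labels = np.where(probs >= confidence, 1,
                          np.where(probs <= 1 - confidence, 0, labels))
    return labels

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 16))                     # toy "embeddings"
    clean = (X[:, :4] > 0).astype(int)                 # 4 toy "instruments"
    noisy = clean.copy()
    flip = rng.random(noisy.shape) < 0.1               # corrupt ~10% of labels
    noisy[flip] = 1 - noisy[flip]
    refined = self_refine_labels(X, noisy)
    print("agreement with clean labels:", (refined == clean).mean())
```

In the paper's setting, the refined label matrix would then be used to relabel or filter the individual instrument tracks before training the MSS model.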
Related papers
- Automatic Dataset Construction (ADC): Sample Collection, Data Curation, and Beyond [38.89457061559469]
We propose an innovative methodology that automates dataset creation with negligible cost and high efficiency.
We provide open-source software that incorporates existing methods for label error detection and robust learning under noisy and biased data.
We design three benchmark datasets focused on label noise detection, label noise learning, and class-imbalanced learning.
arXiv Detail & Related papers (2024-08-21T04:45:12Z)
- Incremental Self-training for Semi-supervised Learning [56.57057576885672]
IST (Incremental Self-Training) is simple yet effective and fits existing self-training-based semi-supervised learning methods.
We verify the proposed IST on five datasets and two types of backbone, effectively improving the recognition accuracy and learning speed.
arXiv Detail & Related papers (2024-04-14T05:02:00Z)
- Learning in the Wild: Towards Leveraging Unlabeled Data for Effectively Tuning Pre-trained Code Models [38.7352992942213]
We propose a novel approach named HINT to improve pre-trained code models with large-scale unlabeled datasets.
HINT includes two main modules: HybrId pseudo-labeled data selection and Noise-tolerant Training.
The experimental results show that HINT can better leverage those unlabeled data in a task-specific way.
arXiv Detail & Related papers (2024-01-02T06:39:00Z)
- An Empirical Study of Automated Mislabel Detection in Real World Vision Datasets [3.123276402480922]
We develop strategies to effectively detect mislabeled images in real-world datasets.
With careful design of the approach, we find that mislabel removal leads to per-class performance improvements of up to 8%.
arXiv Detail & Related papers (2023-12-02T19:33:42Z)
- Soft Curriculum for Learning Conditional GANs with Noisy-Labeled and Uncurated Unlabeled Data [70.25049762295193]
We introduce a novel conditional image generation framework that accepts noisy-labeled and uncurated data during training.
We propose soft curriculum learning, which assigns instance-wise weights for adversarial training while assigning new labels to unlabeled data.
Our experiments show that our approach outperforms existing semi-supervised and label-noise robust methods in terms of both quantitative and qualitative performance.
arXiv Detail & Related papers (2023-07-17T08:31:59Z)
- Learning from Training Dynamics: Identifying Mislabeled Data Beyond Manually Designed Features [43.41573458276422]
We introduce a novel learning-based solution, leveraging a noise detector instantiated as an LSTM network.
The proposed method trains the noise detector in a supervised manner using the dataset with synthesized label noises.
Results show that the proposed method precisely detects mislabeled samples on various datasets without further adaptation.
arXiv Detail & Related papers (2022-12-19T09:39:30Z)
- S3: Supervised Self-supervised Learning under Label Noise [53.02249460567745]
In this paper we address the problem of classification in the presence of label noise.
At the heart of our method is a sample selection mechanism that relies on the consistency between the annotated label of a sample and the distribution of the labels in its neighborhood in the feature space (a minimal sketch of this selection rule follows the related-papers list).
Our method significantly surpasses previous methods on both CIFAR-10 and CIFAR-100 with artificial noise and real-world noisy datasets such as WebVision and ANIMAL-10N.
arXiv Detail & Related papers (2021-11-22T15:49:20Z)
- Boosting Semi-Supervised Face Recognition with Noise Robustness [54.342992887966616]
This paper presents an effective solution to semi-supervised face recognition that is robust to the label noise arising from auto-labelling.
We develop a semi-supervised face recognition solution, named Noise Robust Learning-Labelling (NRoLL), which is based on the robust training ability empowered by GN.
arXiv Detail & Related papers (2021-05-10T14:43:11Z)
- A Novel Perspective for Positive-Unlabeled Learning via Noisy Labels [49.990938653249415]
This research presents a methodology that assigns initial pseudo-labels to unlabeled data, treats them as noisy labels, and trains a deep neural network on the resulting noisy-labeled data.
Experimental results demonstrate that the proposed method significantly outperforms the state-of-the-art methods on several benchmark datasets.
arXiv Detail & Related papers (2021-03-08T11:46:02Z)
- Self-Tuning for Data-Efficient Deep Learning [75.34320911480008]
Self-Tuning is a novel approach to enable data-efficient deep learning.
It unifies the exploration of labeled and unlabeled data and the transfer of a pre-trained model.
It outperforms its semi-supervised learning (SSL) and transfer learning (TL) counterparts on five tasks by sharp margins.
arXiv Detail & Related papers (2021-02-25T14:56:19Z)
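As a companion to the S3 entry above, here is a minimal sketch of neighborhood-consistency sample selection: a sample is kept only when its annotated label agrees with most of the labels of its nearest neighbors in feature space. The embedding source, neighborhood size k, and agreement threshold are illustrative assumptions, not the settings used in that paper.

```python
# A minimal sketch of neighborhood-consistency sample selection for noisy labels.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def select_consistent_samples(embeddings, labels, k=10, agreement=0.6):
    """Return indices of samples whose label matches >= `agreement` of their k neighbors."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(embeddings)
    _, idx = nn.kneighbors(embeddings)              # idx[:, 0] is the sample itself
    neighbor_labels = labels[idx[:, 1:]]            # (n_samples, k) neighbor labels
    agree = (neighbor_labels == labels[:, None]).mean(axis=1)
    return np.flatnonzero(agree >= agreement)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    emb = np.concatenate([rng.normal(0, 1, (100, 8)), rng.normal(4, 1, (100, 8))])
    y_clean = np.array([0] * 100 + [1] * 100)
    y_noisy = y_clean.copy()
    flip = rng.choice(200, size=30, replace=False)  # mislabel 15% of the samples
    y_noisy[flip] = 1 - y_noisy[flip]
    keep = select_consistent_samples(emb, y_noisy)
    print("kept", len(keep), "samples;",
          "fraction with correct labels:", (y_noisy[keep] == y_clean[keep]).mean())
```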