Related papers: Bridging Weakly-Supervised Learning and VLM Distillation: Noisy Partial Label Learning for Efficient Downstream Adaptation

Bridging Weakly-Supervised Learning and VLM Distillation: Noisy Partial Label Learning for Efficient Downstream Adaptation

URL: http://arxiv.org/abs/2506.03229v2
Date: Mon, 10 Nov 2025 17:19:19 GMT
Title: Bridging Weakly-Supervised Learning and VLM Distillation: Noisy Partial Label Learning for Efficient Downstream Adaptation
Authors: Qian-Wei Wang, Yuqiu Xie, Letian Zhang, Zimo Liu, Shu-Tao Xia,
Abstract summary: In noisy partial label learning (NPLL), each training sample is associated with a set of candidate labels annotated by multiple noisy annotators.<n>This paper focuses on learning from partial labels annotated by pre-trained vision-language models.<n>It proposes an innovative collaborative consistency regularization (Co-Reg) method.
Score: 51.67328507400985
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: In the context of noisy partial label learning (NPLL), each training sample is associated with a set of candidate labels annotated by multiple noisy annotators. With the emergence of high-performance pre-trained vision-language models (VLMs) such as CLIP, LLaVA and GPT-4V, the direction of using these models to replace time-consuming manual annotation workflows and achieve ``manual-annotation-free" training for downstream tasks has become a highly promising research avenue. This paper focuses on learning from noisy partial labels annotated by pre-trained VLMs and proposes an innovative collaborative consistency regularization (Co-Reg) method. Unlike the symmetric noise primarily addressed in traditional noisy label learning, the noise generated by pre-trained models is instance-dependent, embodying the underlying patterns of the pre-trained models themselves, which significantly increases the learning difficulty for the model. To address this, we simultaneously train two neural networks that implement collaborative purification of training labels through a ``Co-Pseudo-Labeling" mechanism, while enforcing consistency regularization constraints in both the label space and feature representation space. Specifically, we construct multiple anti-overfitting mechanisms that efficiently mine latent information from noisy partially labeled samples including alternating optimization of contrastive feature representations and pseudo-labels, as well as maintaining prototypical class vectors in the shared feature space.

Related papers

BRIDLE: Generalized Self-supervised Learning with Quantization [15.121857164574704]
Self-supervised learning has been a powerful approach for learning meaningful representations from unlabeled data across various domains.<n>Inspired by BERT's success in capturing deep bidirectional contexts in natural language processing, similar frameworks have been adapted to other modalities such as audio.<n>We introduce BRIDLE, a self-supervised pretraining framework that incorporates residual quantization into the bidirectional training process.
arXiv Detail & Related papers (2025-02-04T08:54:06Z)
Co-training for Low Resource Scientific Natural Language Inference [65.37685198688538]
We propose a novel co-training method that assigns weights based on the training dynamics of the classifiers to the distantly supervised labels. By assigning importance weights instead of filtering out examples based on an arbitrary threshold on the predicted confidence, we maximize the usage of automatically labeled data. The proposed method obtains an improvement of 1.5% in Macro F1 over the distant supervision baseline, and substantial improvements over several other strong SSL baselines.
arXiv Detail & Related papers (2024-06-20T18:35:47Z)
Pre-Trained Vision-Language Models as Partial Annotators [40.89255396643592]
Pre-trained vision-language models learn massive data to model unified representations of images and natural languages. In this paper, we investigate a novel "pre-trained annotating - weakly-supervised learning" paradigm for pre-trained model application and experiment on image classification tasks.
arXiv Detail & Related papers (2024-05-23T17:17:27Z)
Adaptive Integration of Partial Label Learning and Negative Learning for Enhanced Noisy Label Learning [23.847160480176697]
We propose a simple yet powerful idea called textbfNPN, which revolutionizes textbfNoisy label learning. We generate reliable complementary labels using all non-candidate labels for NL to enhance model robustness through indirect supervision. Experiments conducted on both synthetically corrupted and real-world noisy datasets demonstrate the superiority of NPN compared to other state-of-the-art (SOTA) methods.
arXiv Detail & Related papers (2023-12-15T03:06:19Z)
Asymmetric Co-teaching with Multi-view Consensus for Noisy Label Learning [15.690502285538411]
We introduce our noisy-label learning approach, called Asymmetric Co-teaching (AsyCo) AsyCo produces more consistent divergent results of the co-teaching models. Experiments on synthetic and real-world noisy-label datasets show that AsyCo improves over current SOTA methods.
arXiv Detail & Related papers (2023-01-01T04:10:03Z)
Transductive CLIP with Class-Conditional Contrastive Learning [68.51078382124331]
We propose Transductive CLIP, a novel framework for learning a classification network with noisy labels from scratch. A class-conditional contrastive learning mechanism is proposed to mitigate the reliance on pseudo labels. ensemble labels is adopted as a pseudo label updating strategy to stabilize the training of deep neural networks with noisy labels.
arXiv Detail & Related papers (2022-06-13T14:04:57Z)
Context-based Virtual Adversarial Training for Text Classification with Noisy Labels [1.9508698179748525]
We propose context-based virtual adversarial training (ConVAT) to prevent a text classifier from overfitting to noisy labels. Unlike the previous works, the proposed method performs the adversarial training at the context level rather than the inputs. We conduct extensive experiments on four text classification datasets with two types of label noises.
arXiv Detail & Related papers (2022-05-29T14:19:49Z)
Contrastive Test-Time Adaptation [83.73506803142693]
We propose a novel way to leverage self-supervised contrastive learning to facilitate target feature learning. We produce pseudo labels online and refine them via soft voting among their nearest neighbors in the target feature space. Our method, AdaContrast, achieves state-of-the-art performance on major benchmarks.
arXiv Detail & Related papers (2022-04-21T19:17:22Z)
Learning with Neighbor Consistency for Noisy Labels [69.83857578836769]
We present a method for learning from noisy labels that leverages similarities between training examples in feature space. We evaluate our method on datasets evaluating both synthetic (CIFAR-10, CIFAR-100) and realistic (mini-WebVision, Clothing1M, mini-ImageNet-Red) noise.
arXiv Detail & Related papers (2022-02-04T15:46:27Z)
Gaussian Mixture Variational Autoencoder with Contrastive Learning for Multi-Label Classification [27.043136219527767]
We propose a novel contrastive learning boosted multi-label prediction model. By using contrastive learning in the supervised setting, we can exploit label information effectively. We show that the learnt embeddings provide insights into the interpretation of label-label interactions.
arXiv Detail & Related papers (2021-12-02T04:23:34Z)
Prototypical Classifier for Robust Class-Imbalanced Learning [64.96088324684683]
We propose textitPrototypical, which does not require fitting additional parameters given the embedding network. Prototypical produces balanced and comparable predictions for all classes even though the training set is class-imbalanced. We test our method on CIFAR-10LT, CIFAR-100LT and Webvision datasets, observing that Prototypical obtains substaintial improvements compared with state of the arts.
arXiv Detail & Related papers (2021-10-22T01:55:01Z)
Dense Contrastive Visual-Linguistic Pretraining [53.61233531733243]
Several multimodal representation learning approaches have been proposed that jointly represent image and text. These approaches achieve superior performance by capturing high-level semantic information from large-scale multimodal pretraining. We propose unbiased Dense Contrastive Visual-Linguistic Pretraining to replace the region regression and classification with cross-modality region contrastive learning.
arXiv Detail & Related papers (2021-09-24T07:20:13Z)
Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training [66.80558875393565]
We study the problem of training named entity recognition (NER) models using only distantly-labeled data. We propose a noise-robust learning scheme comprised of a new loss function and a noisy label removal step. Our method achieves superior performance, outperforming existing distantly-supervised NER models by significant margins.
arXiv Detail & Related papers (2021-09-10T17:19:56Z)
Noisy Concurrent Training for Efficient Learning under Label Noise [13.041607703862724]
Deep neural networks (DNNs) fail to learn effectively under label noise and have been shown to memorize random labels which affect their performance. We consider learning in isolation, using one-hot encoded labels as the sole source of supervision, and a lack of regularization to discourage memorization as the major shortcomings of the standard training procedure. We propose Noisy Concurrent Training (NCT) which leverages collaborative learning to use the consensus between two models as an additional source of supervision.
arXiv Detail & Related papers (2020-09-17T14:22:17Z)
Noisy Self-Knowledge Distillation for Text Summarization [83.49809205891496]
We apply self-knowledge distillation to text summarization which we argue can alleviate problems with maximum-likelihood training. Our student summarization model is trained with guidance from a teacher which generates smoothed labels to help regularize training. We demonstrate experimentally on three benchmarks that our framework boosts the performance of both pretrained and non-pretrained summarizers.
arXiv Detail & Related papers (2020-09-15T12:53:09Z)
Early-Learning Regularization Prevents Memorization of Noisy Labels [29.04549895470588]
We propose a novel framework to perform classification via deep learning in the presence of noisy annotations. Deep neural networks have been observed to first fit the training data with clean labels during an "early learning" phase. We design a regularization term that steers the model towards these targets, implicitly preventing memorization of the false labels.
arXiv Detail & Related papers (2020-06-30T23:46:33Z)

This list is automatically generated from the titles and abstracts of the papers in this site.