Partial Label Clustering
- URL: http://arxiv.org/abs/2505.03207v1
- Date: Tue, 06 May 2025 05:43:55 GMT
- Title: Partial Label Clustering
- Authors: Yutong Xie, Fuchao Yang, Yuheng Jia
- Abstract summary: Partial label learning (PLL) is a significant weakly supervised learning framework. This paper investigates the partial label clustering problem, which takes advantage of the limited available partial labels to improve the clustering performance.
- Score: 26.94926680877357
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Partial label learning (PLL) is a significant weakly supervised learning framework, where each training example corresponds to a set of candidate labels and only one label is the ground-truth label. For the first time, this paper investigates the partial label clustering problem, which takes advantage of the limited available partial labels to improve the clustering performance. Specifically, we first construct a weight matrix of examples based on their relationships in the feature space and disambiguate the candidate labels to estimate the ground-truth label based on the weight matrix. Then, we construct a set of must-link and cannot-link constraints based on the disambiguation results. Moreover, we propagate the initial must-link and cannot-link constraints based on an adversarial prior promoted dual-graph learning approach. Finally, we integrate weight matrix construction, label disambiguation, and pairwise constraint propagation into a joint model to achieve mutual enhancement. We also theoretically prove that a better disambiguated label matrix can help improve clustering performance. Comprehensive experiments demonstrate that our method achieves superior performance compared with state-of-the-art constrained clustering methods, and outperforms PLL and semi-supervised PLL methods when only limited samples are annotated. The code is publicly available at https://github.com/xyt-ml/PLC.
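The first two stages named in the abstract (weight matrix construction, label disambiguation) and the derivation of must-link and cannot-link constraints can be illustrated with a small sketch. This is not the authors' joint model: it omits the dual-graph learning and adversarial prior entirely, and it swaps in a plain kNN Gaussian affinity and iterative label propagation as stand-ins. All function names, the parameters `k`, `alpha`, `n_iter`, and the toy data are assumptions made for illustration only.

```python
import numpy as np

def knn_weight_matrix(X, k=2):
    """Symmetric kNN affinity with Gaussian weights (an illustrative choice)."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # squared distances
    np.fill_diagonal(d2, np.inf)                          # exclude self-links
    sigma2 = max(np.median(d2[np.isfinite(d2)]), 1e-12)   # bandwidth heuristic
    idx = np.argsort(d2, axis=1)[:, :k]                   # k nearest neighbours
    rows = np.repeat(np.arange(n), k)
    cols = idx.ravel()
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-d2[rows, cols] / sigma2)
    return np.maximum(W, W.T)

def disambiguate(W, candidates, n_iter=20, alpha=0.5):
    """Propagate label confidences over W.

    candidates is an (n, c) 0/1 matrix; an all-zero row marks an
    unannotated example. Confidences of annotated examples are kept
    inside their candidate sets.
    """
    labeled = candidates.sum(1) > 0
    F0 = np.zeros_like(candidates, dtype=float)
    F0[labeled] = candidates[labeled] / candidates[labeled].sum(1, keepdims=True)
    P = W / np.maximum(W.sum(1, keepdims=True), 1e-12)    # row-normalised graph
    F = F0.copy()
    for _ in range(n_iter):
        F = alpha * (P @ F) + (1 - alpha) * F0
        F[labeled] *= candidates[labeled]                 # mask non-candidates
        F[labeled] /= np.maximum(F[labeled].sum(1, keepdims=True), 1e-12)
    return F, labeled

def pairwise_constraints(F, labeled):
    """Must-link if two annotated examples share the predicted label, else cannot-link."""
    pred = F.argmax(1)
    idx = np.flatnonzero(labeled)
    must, cannot = [], []
    for a in range(len(idx)):
        for b in range(a + 1, len(idx)):
            i, j = int(idx[a]), int(idx[b])
            (must if pred[i] == pred[j] else cannot).append((i, j))
    return must, cannot

# Toy data (hypothetical): two well-separated groups of three points each.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
# Rows: candidate label sets over 2 classes; [0, 0] means no annotation.
candidates = np.array([[1, 0], [1, 1], [0, 0],
                       [0, 1], [1, 1], [0, 0]])

W = knn_weight_matrix(X, k=2)
F, labeled = disambiguate(W, candidates)
must, cannot = pairwise_constraints(F, labeled)
print("must-link:", must)
print("cannot-link:", cannot)
```

In this toy setting the ambiguous examples (rows with two candidates) are resolved by their unambiguous neighbours, and the resulting constraints could then be passed to any constrained clustering routine.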
Related papers
- Mixed Blessing: Class-Wise Embedding guided Instance-Dependent Partial Label Learning [53.64180787439527]
In partial label learning (PLL), every sample is associated with a candidate label set comprising the ground-truth label and several noisy labels. For the first time, we create class-wise embeddings for each sample, which allow us to explore the relationship of instance-dependent noisy labels. To reduce the high label ambiguity, we introduce the concept of class prototypes containing global feature information.
arXiv Detail & Related papers (2024-12-06T13:25:39Z)
- Dual-Decoupling Learning and Metric-Adaptive Thresholding for Semi-Supervised Multi-Label Learning [81.83013974171364]
Semi-supervised multi-label learning (SSMLL) is a powerful framework for leveraging unlabeled data to reduce the expensive cost of collecting precise multi-label annotations. Unlike semi-supervised learning, one cannot select the most probable label as the pseudo-label in SSMLL due to multiple semantics contained in an instance. We propose a dual-perspective method to generate high-quality pseudo-labels.
arXiv Detail & Related papers (2024-07-26T09:33:53Z)
- Pseudo-labelling meets Label Smoothing for Noisy Partial Label Learning [8.387189407144403]
Partial label learning (PLL) is a weakly-supervised learning paradigm where each training instance is paired with a set of candidate labels (partial label).
Noisy PLL (NPLL) relaxes this constraint by allowing some partial labels to not contain the true label, enhancing the practicality of the problem.
We present a minimalistic framework that initially assigns pseudo-labels to images by exploiting the noisy partial labels through a weighted nearest neighbour algorithm.
arXiv Detail & Related papers (2024-02-07T13:32:47Z)
- Learning Label Hierarchy with Supervised Contrastive Learning [8.488965459026678]
Supervised contrastive learning (SCL) frameworks treat each class as independent and thus consider all classes to be equally important.
This paper introduces a family of Label-Aware SCL methods (LASCL) that incorporates hierarchical information to SCL by leveraging similarities between classes.
Experiments on three datasets show that the proposed LASCL works well on text classification tasks that distinguish a single label among multiple labels.
arXiv Detail & Related papers (2024-01-31T23:21:40Z)
- Appeal: Allow Mislabeled Samples the Chance to be Rectified in Partial Label Learning [55.4510979153023]
In partial label learning (PLL), each instance is associated with a set of candidate labels among which only one is ground-truth.
To help these mislabeled samples "appeal," we propose the first appeal-based framework.
arXiv Detail & Related papers (2023-12-18T09:09:52Z)
- Substituting Data Annotation with Balanced Updates and Collective Loss in Multi-label Text Classification [19.592985329023733]
Multi-label text classification (MLTC) is the task of assigning multiple labels to a given text.
We study the MLTC problem in annotation-free and scarce-annotation settings in which the magnitude of available supervision signals is linear in the number of labels.
Our method follows three steps: (1) mapping input text into a set of preliminary label likelihoods by natural language inference using a pre-trained language model, (2) calculating a signed label dependency graph from label descriptions, and (3) updating the preliminary label likelihoods with message passing along the label dependency graph.
arXiv Detail & Related papers (2023-09-24T04:12:52Z)
- Complementary Classifier Induced Partial Label Learning [54.61668156386079]
In partial label learning (PLL), each training sample is associated with a set of candidate labels, among which only one is valid.
In disambiguation, the existing works usually do not fully investigate the effectiveness of the non-candidate label set.
In this paper, we use the non-candidate labels to induce a complementary classifier, which naturally forms an adversarial relationship against the traditional classifier.
arXiv Detail & Related papers (2023-05-17T02:13:23Z)
- Complementary to Multiple Labels: A Correlation-Aware Correction Approach [65.59584909436259]
We show theoretically how the estimated transition matrix in multi-class complementary-label learning (CLL) could be distorted in multi-labeled cases.
We propose a two-step method to estimate the transition matrix from candidate labels.
arXiv Detail & Related papers (2023-02-25T04:48:48Z)
- Rethinking Clustering-Based Pseudo-Labeling for Unsupervised Meta-Learning [146.11600461034746]
CACTUs, a method for unsupervised meta-learning, is a clustering-based approach with pseudo-labeling.
This approach is model-agnostic and can be combined with supervised algorithms to learn from unlabeled data.
We prove that the core reason for this is the lack of a clustering-friendly property in the embedding space.
arXiv Detail & Related papers (2022-09-27T19:04:36Z)
- SoLar: Sinkhorn Label Refinery for Imbalanced Partial-Label Learning [31.535219018410707]
Partial-label learning (PLL) is a peculiar weakly-supervised learning task where the training samples are generally associated with a set of candidate labels instead of a single ground truth.
We propose SoLar, a novel framework that refines the disambiguated labels towards matching the marginal class prior distribution.
SoLar exhibits substantially superior results on standardized benchmarks compared to the previous state-of-the-art methods.
arXiv Detail & Related papers (2022-09-21T14:00:16Z)
- Multi-label Classification with Partial Annotations using Class-aware Selective Loss [14.3159150577502]
Large-scale multi-label classification datasets are commonly partially annotated.
We analyze the partial labeling problem, then propose a solution based on two key ideas.
With our novel approach, we achieve state-of-the-art results on the OpenImages dataset.
arXiv Detail & Related papers (2021-10-21T08:10:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.