Cluster-aware Semi-supervised Learning: Relational Knowledge
Distillation Provably Learns Clustering
- URL: http://arxiv.org/abs/2307.11030v2
- Date: Mon, 23 Oct 2023 23:04:43 GMT
- Title: Cluster-aware Semi-supervised Learning: Relational Knowledge
Distillation Provably Learns Clustering
- Authors: Yijun Dong, Kevin Miller, Qi Lei, Rachel Ward
- Abstract summary: We take an initial step toward a theoretical understanding of relational knowledge distillation (RKD).
For semi-supervised learning, we demonstrate the label efficiency of RKD through a general framework of cluster-aware learning.
We show that, despite their common effect of learning accurate clusterings, RKD facilitates a "global" perspective through spectral clustering, whereas consistency regularization takes a "local" one.
- Score: 15.678104431835772
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the empirical success and practical significance of (relational)
knowledge distillation that matches (the relations of) features between teacher
and student models, the corresponding theoretical interpretations remain
limited for various knowledge distillation paradigms. In this work, we take an
initial step toward a theoretical understanding of relational knowledge
distillation (RKD), with a focus on semi-supervised classification problems. We
start by casting RKD as spectral clustering on a population-induced graph
unveiled by a teacher model. Via a notion of clustering error that quantifies
the discrepancy between the predicted and ground truth clusterings, we
illustrate that RKD over the population provably leads to low clustering error.
Moreover, we provide a sample complexity bound for RKD with limited unlabeled
samples. For semi-supervised learning, we further demonstrate the label
efficiency of RKD through a general framework of cluster-aware semi-supervised
learning that assumes low clustering errors. Finally, by unifying data
augmentation consistency regularization into this cluster-aware framework, we
show that despite the common effect of learning accurate clusterings, RKD
facilitates a "global" perspective through spectral clustering, whereas
consistency regularization focuses on a "local" perspective via expansion.
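As a concrete illustration of the two objects the abstract connects, the sketch below shows (i) a distance-wise relational distillation loss that matches pairwise feature relations between teacher and student, and (ii) spectral clustering on a similarity graph built from teacher features. This is a minimal sketch under assumed choices (the Gaussian-kernel affinity, the bandwidth `sigma`, and names such as `rkd_distance_loss` are illustrative), not the authors' implementation.
```python
import torch
from sklearn.cluster import KMeans


def rkd_distance_loss(student_feats: torch.Tensor, teacher_feats: torch.Tensor) -> torch.Tensor:
    """Distance-wise relational KD: match the (scale-normalized) pairwise
    distances of student features to those of teacher features."""
    with torch.no_grad():
        t_dist = torch.cdist(teacher_feats, teacher_feats)      # teacher relations, shape (n, n)
        t_dist = t_dist / (t_dist[t_dist > 0].mean() + 1e-8)    # scale normalization
    s_dist = torch.cdist(student_feats, student_feats)
    s_dist = s_dist / (s_dist[s_dist > 0].mean() + 1e-8)
    return torch.nn.functional.smooth_l1_loss(s_dist, t_dist)


def teacher_spectral_clusters(teacher_feats: torch.Tensor, n_clusters: int, sigma: float = 1.0):
    """Spectral clustering on the similarity graph induced by teacher features."""
    d2 = torch.cdist(teacher_feats, teacher_feats) ** 2
    W = torch.exp(-d2 / (2 * sigma ** 2))                       # Gaussian-kernel affinity matrix
    deg = W.sum(dim=1).clamp_min(1e-8)
    D_inv_sqrt = torch.diag(deg.rsqrt())
    L_sym = torch.eye(W.size(0)) - D_inv_sqrt @ W @ D_inv_sqrt  # normalized graph Laplacian
    _, eigvecs = torch.linalg.eigh(L_sym)                       # eigenvalues in ascending order
    emb = torch.nn.functional.normalize(eigvecs[:, :n_clusters], dim=1)
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(emb.detach().cpu().numpy())


# toy usage: random features standing in for model outputs on unlabeled data
student = torch.randn(64, 128, requires_grad=True)
teacher = torch.randn(64, 128)
loss = rkd_distance_loss(student, teacher)
labels = teacher_spectral_clusters(teacher, n_clusters=4)
```
In this reading, the labels recovered from the teacher-induced graph play the role of the (approximate) clustering whose error the paper's cluster-aware semi-supervised framework assumes to be small.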
Related papers
- Self-Supervised Graph Embedding Clustering [70.36328717683297]
The K-means one-step dimensionality-reduction clustering method has made some progress in addressing the curse of dimensionality in clustering tasks.
We propose a unified framework that integrates manifold learning with K-means, resulting in a self-supervised graph embedding framework.
arXiv Detail & Related papers (2024-09-24T08:59:51Z)
- Fairness in Visual Clustering: A Novel Transformer Clustering Approach [32.806921406869996]
We first evaluate demographic bias in deep clustering models from the perspective of cluster purity.
A novel loss function is introduced to encourage purity consistency across all clusters, maintaining the fairness aspect.
We present a novel attention mechanism, Cross-attention, to measure correlations between multiple clusters.
arXiv Detail & Related papers (2023-04-14T21:59:32Z)
- Cluster-aware Contrastive Learning for Unsupervised Out-of-distribution Detection [0.0]
Unsupervised out-of-distribution (OOD) detection aims to separate samples that fall outside the distribution of the training data, without using label information.
We propose a Cluster-aware Contrastive Learning (CCL) framework for unsupervised OOD detection, which considers both instance-level and semantic-level information.
arXiv Detail & Related papers (2023-02-06T07:21:03Z)
- Modeling Multiple Views via Implicitly Preserving Global Consistency and Local Complementarity [61.05259660910437]
We propose a global consistency and complementarity network (CoCoNet) to learn representations from multiple views.
On the global stage, we reckon that the crucial knowledge is implicitly shared among views, and enhancing the encoder to capture such knowledge can improve the discriminability of the learned representations.
Lastly, on the local stage, we propose a complementarity-factor that joins cross-view discriminative knowledge and guides the encoders to learn not only view-wise discriminability but also cross-view complementary information.
arXiv Detail & Related papers (2022-09-16T09:24:00Z)
- Hybrid Dynamic Contrast and Probability Distillation for Unsupervised Person Re-Id [109.1730454118532]
Unsupervised person re-identification (Re-Id) has attracted increasing attention due to its practical applications in real-world video surveillance systems.
We present a hybrid dynamic cluster contrast and probability distillation algorithm.
It formulates the unsupervised Re-Id problem as a unified local-to-global dynamic contrastive learning and self-supervised probability distillation framework.
arXiv Detail & Related papers (2021-09-29T02:56:45Z)
- Complementary Calibration: Boosting General Continual Learning with Collaborative Distillation and Self-Supervision [47.374412281270594]
General Continual Learning (GCL) aims at learning from non-independent and identically distributed (non-i.i.d.) stream data.
We reveal that relation and feature deviations are crucial problems underlying catastrophic forgetting.
We propose a Complementary Calibration (CoCa) framework that mines the complementary model's outputs and features.
arXiv Detail & Related papers (2021-09-03T06:35:27Z)
- You Never Cluster Alone [150.94921340034688]
We extend the mainstream contrastive learning paradigm to a cluster-level scheme, where all the data assigned to the same cluster contribute to a unified representation.
We define a set of categorical variables as clustering assignment confidence, which links the instance-level learning track with the cluster-level one.
By reparametrizing the assignment variables, the resulting model (TCC) is trained end-to-end, requiring no alternating steps.
arXiv Detail & Related papers (2021-06-03T14:59:59Z)
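The cluster-level contrastive idea summarized in the entry above can be illustrated with a generic sketch: soft assignments pool instance features into one representation per cluster for each of two augmented views, and matching cluster representations are contrasted across views. The name `cluster_level_contrast` and the specific pooling/InfoNCE form are assumptions for illustration, not the paper's exact TCC formulation.
```python
import torch
import torch.nn.functional as F


def cluster_level_contrast(z1, z2, logits1, logits2, temperature: float = 0.5):
    """Contrast cluster representations built from soft assignments of two views.

    z1, z2:         (n, d) instance features for two augmentations of the same batch
    logits1/logits2:(n, k) cluster-assignment logits (the "assignment confidence")
    """
    p1, p2 = logits1.softmax(dim=1), logits2.softmax(dim=1)  # soft cluster assignments
    # assignment-weighted pooling: one representation per cluster and per view
    c1 = F.normalize(p1.t() @ z1, dim=1)                     # (k, d)
    c2 = F.normalize(p2.t() @ z2, dim=1)                     # (k, d)
    sim = c1 @ c2.t() / temperature                          # (k, k) cross-view cluster similarities
    targets = torch.arange(sim.size(0))                      # cluster i in view 1 matches cluster i in view 2
    return 0.5 * (F.cross_entropy(sim, targets) + F.cross_entropy(sim.t(), targets))
```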
- Deep Clustering by Semantic Contrastive Learning [67.28140787010447]
We introduce a novel variant called Semantic Contrastive Learning (SCL).
It explores the characteristics of both conventional contrastive learning and deep clustering.
It can amplify the strengths of contrastive learning and deep clustering in a unified approach.
arXiv Detail & Related papers (2021-03-03T20:20:48Z)
- Robust Unsupervised Learning via L-Statistic Minimization [38.49191945141759]
We present a general approach to this problem focusing on unsupervised learning.
The key assumption is that the perturbing distribution is characterized by larger losses relative to a given class of admissible models.
We prove uniform convergence bounds with respect to the proposed criterion for several popular models in unsupervised learning.
arXiv Detail & Related papers (2020-12-14T10:36:06Z)
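As a generic illustration of an L-statistic criterion of the kind mentioned in the entry above, the sketch below minimizes a trimmed mean of per-sample losses (a trimmed mean is an L-statistic), dropping the largest losses on the assumption that they come from the perturbing distribution. The function name and the `keep_fraction` parameter are illustrative assumptions, not the paper's exact criterion.
```python
import torch


def trimmed_loss(per_sample_losses: torch.Tensor, keep_fraction: float = 0.9) -> torch.Tensor:
    """Average only the smallest `keep_fraction` of per-sample losses:
    samples with the largest losses, presumed to be perturbed, are dropped."""
    n_keep = max(1, int(keep_fraction * per_sample_losses.numel()))
    kept, _ = torch.topk(per_sample_losses, n_keep, largest=False)  # k smallest losses
    return kept.mean()
```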
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.