Why Prototypes Collapse: Diagnosing and Preventing Partial Collapse in Prototypical Self-Supervised Learning
- URL: http://arxiv.org/abs/2510.20108v1
- Date: Thu, 23 Oct 2025 01:25:10 GMT
- Title: Why Prototypes Collapse: Diagnosing and Preventing Partial Collapse in Prototypical Self-Supervised Learning
- Authors: Gabriel Y. Arteaga, Marius Aasan, Rwiddhi Chakraborty, Martine Hjelkrem-Tan, Thalles Silva, Michael Kampffmeyer, Adín Ramírez Rivera
- Abstract summary: Prototypical self-supervised learning methods consistently suffer from partial prototype collapse. This undermines their central purpose -- providing diverse and informative targets to guide encoders toward rich representations. We introduce a fully decoupled training strategy that learns prototypes and encoders under separate objectives.
- Score: 15.258418184220803
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Prototypical self-supervised learning methods consistently suffer from partial prototype collapse, where multiple prototypes converge to nearly identical representations. This undermines their central purpose -- providing diverse and informative targets to guide encoders toward rich representations -- and has led practitioners to over-parameterize prototype sets or add ad-hoc regularizers, which mitigate symptoms rather than address the root cause. We empirically trace the collapse to the joint optimization of encoders and prototypes, which encourages a type of shortcut learning: early in training prototypes drift toward redundant representations that minimize loss without necessarily enhancing representation diversity. To break the joint optimization, we introduce a fully decoupled training strategy that learns prototypes and encoders under separate objectives. Concretely, we model prototypes as a Gaussian mixture updated with an online EM-style procedure, independent of the encoder's loss. This simple yet principled decoupling eliminates prototype collapse without explicit regularization and yields consistently diverse prototypes and stronger downstream performance.
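The abstract's central proposal is to fit prototypes as a Gaussian mixture with an online EM-style procedure that never sees the encoder's loss. A minimal numpy sketch of one such update, assuming isotropic components, unit-norm embeddings, and an exponential-moving-average M-step; the function name and the `lr`/`temp` hyperparameters are illustrative, not the paper's exact procedure:

```python
import numpy as np

def online_em_step(z, mu, pi, lr=0.05, temp=0.1):
    """One online EM-style update of Gaussian-mixture prototypes.

    z:  (B, D) batch of L2-normalized embeddings, detached from the encoder.
    mu: (K, D) prototype means (unit-norm rows).
    pi: (K,)   mixing weights.
    """
    # E-step: soft responsibilities under an isotropic-Gaussian mixture.
    logits = z @ mu.T / temp + np.log(pi)            # (B, K)
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    r = np.exp(logits)
    r /= r.sum(axis=1, keepdims=True)

    # M-step (online): move each prototype toward its
    # responsibility-weighted batch mean.
    Nk = r.sum(axis=0) + 1e-8                        # (K,) effective counts
    batch_mu = (r.T @ z) / Nk[:, None]               # (K, D)
    mu = (1 - lr) * mu + lr * batch_mu
    mu /= np.linalg.norm(mu, axis=1, keepdims=True)  # stay on the unit sphere
    pi = (1 - lr) * pi + lr * (Nk / Nk.sum())
    return mu, pi
```

Because the prototypes are refit only by this routine, no gradient from the encoder's objective ever touches `mu` or `pi`, which is the decoupling the abstract argues removes the shortcut that drives prototypes toward redundant representations.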
Related papers
- Divide, Conquer and Unite: Hierarchical Style-Recalibrated Prototype Alignment for Federated Medical Image Segmentation [66.82598255715696]
Federated learning enables multiple medical institutions to train a global model without sharing data. Current approaches primarily focus on final-layer features, overlooking critical multi-level cues. We propose FedBCS to bridge feature representation gaps via domain-invariant contextual prototype alignment.
arXiv Detail & Related papers (2025-11-14T04:15:34Z)
- Proto-Former: Unified Facial Landmark Detection by Prototype Transformer [77.47431726595111]
Proto-Former is a unified, adaptive, end-to-end facial landmark detection framework. It enables joint training across multiple datasets within a unified architecture. Proto-Former achieves superior performance compared to existing state-of-the-art methods.
arXiv Detail & Related papers (2025-10-17T06:00:25Z)
- DNP-Guided Contrastive Reconstruction with a Reverse Distillation Transformer for Medical Anomaly Detection [1.0924595442390774]
Anomaly detection in medical images is challenging due to limited annotations and a domain gap compared to natural images. Existing reconstruction methods often rely on frozen pre-trained encoders, which limits adaptation to domain-specific features. We propose a unified framework combining a trainable encoder with prototype-guided reconstruction and a novel Diversity-Aware Alignment Loss.
arXiv Detail & Related papers (2025-08-27T05:12:09Z)
- Probabilistic Prototype Calibration of Vision-Language Models for Generalized Few-shot Semantic Segmentation [75.18058114915327]
Generalized Few-Shot Semantic Segmentation (GFSS) aims to extend a segmentation model to novel classes with only a few annotated examples. We propose FewCLIP, a probabilistic prototype calibration framework over multi-modal prototypes from the pretrained CLIP. We show FewCLIP significantly outperforms state-of-the-art approaches across both GFSS and class-incremental settings.
arXiv Detail & Related papers (2025-06-28T18:36:22Z)
- Pro-AD: Learning Comprehensive Prototypes with Prototype-based Constraint for Multi-class Unsupervised Anomaly Detection [8.358250148845572]
Prototype-based reconstruction methods for unsupervised anomaly detection utilize a limited set of learnable prototypes. We propose Pro-AD to address these issues and fully utilize the prototypes to boost the performance of anomaly detection. Our Pro-AD achieves state-of-the-art performance, highlighting its superior robustness and practical effectiveness for the multi-class unsupervised anomaly detection task.
arXiv Detail & Related papers (2025-06-16T05:04:12Z)
- Efficient Prototype Consistency Learning in Medical Image Segmentation via Joint Uncertainty and Data Augmentation [32.47805202531351]
Prototype learning has emerged in semi-supervised medical image segmentation. We propose an efficient prototype consistency learning method via joint uncertainty quantification and data augmentation. Our framework is superior to previous state-of-the-art approaches.
arXiv Detail & Related papers (2025-05-22T06:25:32Z)
- On Partial Prototype Collapse in the DINO Family of Self-Supervised Methods [15.524425102344784]
Learning to map data samples to compact representations can lead to the representation collapse problem. Regularizing the distribution of data points over the clusters is the prevalent strategy to avoid this issue. We show that a partial prototype collapse problem still persists in the DINO family of methods, leading to significant redundancies among the prototypes.
arXiv Detail & Related papers (2024-10-17T22:06:34Z)
- Refine, Discriminate and Align: Stealing Encoders via Sample-Wise Prototypes and Multi-Relational Extraction [57.16121098944589]
RDA is a pioneering approach designed to address two primary deficiencies of previous attempts at stealing pre-trained encoders.
It is accomplished via a sample-wise prototype, which consolidates the target encoder's representations for a given sample's various perspectives.
For greater efficacy, we develop a multi-relational extraction loss that trains the surrogate encoder to Discriminate mismatched embedding-prototype pairs.
arXiv Detail & Related papers (2023-12-01T15:03:29Z)
- Few-Shot Segmentation via Rich Prototype Generation and Recurrent Prediction Enhancement [12.614578133091168]
We propose a rich prototype generation module (RPGM) and a recurrent prediction enhancement module (RPEM) to reinforce the prototype learning paradigm.
RPGM combines superpixel and K-means clustering to generate rich prototype features with complementary scale relationships.
RPEM utilizes the recurrent mechanism to design a round-way propagation decoder.
arXiv Detail & Related papers (2022-10-03T08:46:52Z)
- Dual Prototypical Contrastive Learning for Few-shot Semantic Segmentation [55.339405417090084]
We propose a dual prototypical contrastive learning approach tailored to the few-shot semantic segmentation (FSS) task.
The main idea is to make the prototypes more discriminative by increasing inter-class distance while reducing intra-class distance in the prototype feature space.
We demonstrate that the proposed dual contrastive learning approach outperforms state-of-the-art FSS methods on PASCAL-5i and COCO-20i datasets.
arXiv Detail & Related papers (2021-11-09T08:14:50Z)
- Prototypical Contrastive Learning of Unsupervised Representations [171.3046900127166]
Prototypical Contrastive Learning (PCL) is an unsupervised representation learning method.
PCL implicitly encodes semantic structures of the data into the learned embedding space.
PCL outperforms state-of-the-art instance-wise contrastive learning methods on multiple benchmarks.
arXiv Detail & Related papers (2020-05-11T09:53:36Z)
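The PCL entry above contrasts each embedding with cluster prototypes rather than with other instances. A minimal numpy sketch of such a prototype-level contrastive term in the spirit of PCL's ProtoNCE; the function name, the fixed temperature, and hard cluster assignments are simplifying assumptions, not PCL's exact formulation:

```python
import numpy as np

def proto_nce_loss(z, prototypes, assignments, temp=0.2):
    """Prototype-level contrastive loss (ProtoNCE-style sketch).

    z:           (B, D) L2-normalized embeddings.
    prototypes:  (K, D) L2-normalized cluster centroids.
    assignments: (B,)   index of each sample's assigned cluster.
    """
    logits = z @ prototypes.T / temp                 # (B, K) similarity to every prototype
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Pull each embedding toward its own prototype, push it from the rest.
    return -log_probs[np.arange(len(z)), assignments].mean()
```

Minimizing this term pulls embeddings toward their assigned centroids and apart from the others, which is how prototype-based objectives encode the semantic cluster structure into the embedding space; it is also the setting in which the redundant-prototype failure mode diagnosed by the main paper arises.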
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.