Engineering the Neural Collapse Geometry of Supervised-Contrastive Loss
- URL: http://arxiv.org/abs/2310.00893v1
- Date: Mon, 2 Oct 2023 04:23:17 GMT
- Title: Engineering the Neural Collapse Geometry of Supervised-Contrastive Loss
- Authors: Jaidev Gill, Vala Vakilian, Christos Thrampoulidis
- Abstract summary: Supervised-contrastive loss (SCL) is an alternative to cross-entropy (CE) for classification tasks.
We propose methods to engineer the geometry of learnt feature embeddings by modifying the contrastive loss.
- Score: 28.529476019629097
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Supervised-contrastive loss (SCL) is an alternative to cross-entropy (CE) for
classification tasks that makes use of similarities in the embedding space to
allow for richer representations. In this work, we propose methods to engineer
the geometry of these learnt feature embeddings by modifying the contrastive
loss. In pursuit of adjusting the geometry, we explore the impact of prototypes,
fixed embeddings included during training to alter the final feature geometry.
Specifically, through empirical findings, we demonstrate that the inclusion of
prototypes in every batch induces the geometry of the learnt embeddings to
align with that of the prototypes. We gain further insights by considering a
limiting scenario where the number of prototypes far exceeds the original
batch size. Through this, we establish a connection to cross-entropy (CE) loss
with a fixed classifier and normalized embeddings. We validate our findings by
conducting a series of experiments with deep neural networks on benchmark
vision datasets.
Related papers
- SINDER: Repairing the Singular Defects of DINOv2 [61.98878352956125]
Vision Transformer models trained on large-scale datasets often exhibit artifacts in the patch tokens they extract.
We propose a novel fine-tuning smooth regularization that rectifies structural deficiencies using only a small dataset.
arXiv Detail & Related papers (2024-07-23T20:34:23Z) - 3D Geometric Shape Assembly via Efficient Point Cloud Matching [59.241448711254485]
We introduce Proxy Match Transform (PMT), an approximate high-order feature transform layer that enables reliable matching between mating surfaces of parts.
Building upon PMT, we introduce a new framework, dubbed Proxy Match TransformeR (PMTR), for the geometric assembly task.
We evaluate the proposed PMTR on the large-scale 3D geometric shape assembly benchmark dataset of Breaking Bad.
arXiv Detail & Related papers (2024-07-15T08:50:02Z) - PDiscoFormer: Relaxing Part Discovery Constraints with Vision Transformers [7.4774909520731425]
We show that pre-trained transformer-based vision models, such as self-supervised DINOv2 ViT, enable the relaxation of constraints.
In particular, we find that a total variation (TV) prior, which allows for multiple connected components of any size, substantially outperforms previous work.
arXiv Detail & Related papers (2024-07-05T14:24:37Z) - Split-and-Fit: Learning B-Reps via Structure-Aware Voronoi Partitioning [50.684254969269546]
We introduce a novel method for acquiring boundary representations (B-Reps) of 3D CAD models.
We apply a spatial partitioning to derive a single primitive within each partition.
We show that our network, coined NVD-Net for neural Voronoi diagrams, can effectively learn Voronoi partitions for CAD models from training data.
arXiv Detail & Related papers (2024-06-07T21:07:49Z) - Coded Residual Transform for Generalizable Deep Metric Learning [34.100840501900706]
We introduce a new method called coded residual transform (CRT) for deep metric learning to significantly improve its generalization capability.
CRT represents and encodes the feature map from a set of complementary perspectives based on projections onto diversified prototypes.
Our experimental results and ablation studies demonstrate that the proposed CRT method outperforms state-of-the-art deep metric learning methods by large margins.
arXiv Detail & Related papers (2022-10-09T06:17:31Z) - 3D Textured Shape Recovery with Learned Geometric Priors [58.27543892680264]
This technical report presents our approach to 3D textured shape recovery, which addresses limitations of existing methods by incorporating learned geometric priors.
We generate a SMPL model from learned pose prediction and fuse it into the partial input to add prior knowledge of human bodies.
We also propose a novel completeness-aware bounding box adaptation for handling different levels of scales.
arXiv Detail & Related papers (2022-09-07T16:03:35Z) - Semi-Supervised Manifold Learning with Complexity Decoupled Chart Autoencoders [45.29194877564103]
This work introduces a chart autoencoder with an asymmetric encoding-decoding process that can incorporate additional semi-supervised information such as class labels.
We discuss the approximation power of such networks and derive a bound that essentially depends on the intrinsic dimension of the data manifold rather than the dimension of ambient space.
arXiv Detail & Related papers (2022-08-22T19:58:03Z) - Curved Geometric Networks for Visual Anomaly Recognition [39.91252195360767]
Learning a latent embedding to understand the underlying nature of data distribution is often formulated in Euclidean spaces with zero curvature.
In this work, we investigate the benefits of curved spaces for analyzing anomalies or out-of-distribution objects in data.
arXiv Detail & Related papers (2022-08-02T01:15:39Z) - Self-Supervised Training with Autoencoders for Visual Anomaly Detection [61.62861063776813]
We focus on a specific use case in anomaly detection where the distribution of normal samples is supported by a lower-dimensional manifold.
We adapt a self-supervised learning regime that exploits discriminative information during training but focuses on the submanifold of normal examples.
We achieve a new state-of-the-art result on the MVTec AD dataset -- a challenging benchmark for visual anomaly detection in the manufacturing domain.
arXiv Detail & Related papers (2022-06-23T14:16:30Z) - Surface Vision Transformers: Attention-Based Modelling applied to Cortical Analysis [8.20832544370228]
We introduce a domain-agnostic architecture to study any surface data projected onto a spherical manifold.
A vision transformer model encodes the sequence of patches via successive multi-head self-attention layers.
Experiments show that the Surface Vision Transformer (SiT) generally outperforms surface CNNs, while performing comparably on registered and unregistered data.
arXiv Detail & Related papers (2022-03-30T15:56:11Z) - Test-time Adaptation with Slot-Centric Models [63.981055778098444]
Slot-TTA is a semi-supervised scene decomposition model that is adapted per scene at test time through gradient descent on reconstruction or cross-view synthesis objectives.
We show substantial out-of-distribution performance improvements against state-of-the-art supervised feed-forward detectors, and alternative test-time adaptation methods.
arXiv Detail & Related papers (2022-03-21T17:59:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.