Self-supervised Latent Space Optimization with Nebula Variational Coding
- URL: http://arxiv.org/abs/2506.01414v1
- Date: Mon, 02 Jun 2025 08:13:32 GMT
- Title: Self-supervised Latent Space Optimization with Nebula Variational Coding
- Authors: Yida Wang, David Joseph Tan, Nassir Navab, Federico Tombari
- Abstract summary: This paper proposes a variational inference model which leads to a clustered embedding. We introduce additional variables in the latent space, called nebula anchors, that guide the latent variables to form clusters during training. Since each latent feature can be labeled with the closest anchor, we also propose to apply metric learning in a self-supervised way to make the separation between clusters more explicit.
- Score: 87.20343320266215
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning approaches process data in a layer-by-layer way with intermediate (or latent) features. We aim to design a general solution that optimizes the latent manifolds to improve the performance on classification, segmentation, completion and/or reconstruction through probabilistic models. This paper proposes a variational inference model which leads to a clustered embedding. We introduce additional variables in the latent space, called nebula anchors, that guide the latent variables to form clusters during training. To prevent the anchors from clustering among themselves, we employ the variational constraint that enforces the latent features within an anchor to form a Gaussian distribution, resulting in a generative model we refer to as Nebula Variational Coding (NVC). Since each latent feature can be labeled with the closest anchor, we also propose to apply metric learning in a self-supervised way to make the separation between clusters more explicit. As a consequence, the latent variables of our variational coder form clusters which adapt to the semantics of the training data, e.g., the categorical labels of each sample. We demonstrate experimentally that it can be used within different architectures designed to solve different problems including text sequences, images, 3D point clouds and volumetric data, validating the advantage of our proposed method.
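The abstract does not include code, but the mechanism it describes (learnable anchors that attract latent codes, plus a self-supervised metric term against the closest anchor) can be sketched roughly as follows. This is a minimal PyTorch illustration under assumed hyperparameters (anchor count, margin, loss weighting), not the paper's released implementation:

```python
import torch
import torch.nn.functional as F

class NebulaAnchors(torch.nn.Module):
    """Learnable anchors that pull latent codes into clusters (illustrative sketch)."""
    def __init__(self, num_anchors=10, latent_dim=32, margin=1.0):
        super().__init__()
        self.anchors = torch.nn.Parameter(torch.randn(num_anchors, latent_dim))
        self.margin = margin

    def forward(self, z):
        # Distance from every latent code to every anchor: (batch, num_anchors)
        d = torch.cdist(z, self.anchors)
        nearest = d.argmin(dim=1)                    # self-supervised "label"
        d_pos = d.gather(1, nearest[:, None])        # distance to closest anchor
        pull = d_pos.mean()                          # pull z toward that anchor
        # Triplet-style metric term: closest anchor should beat the runner-up
        d_rest = d.scatter(1, nearest[:, None], float("inf"))
        push = F.relu(self.margin + d_pos - d_rest.min(dim=1, keepdim=True).values).mean()
        return pull + push, nearest

# Usage inside a VAE training step (mu, logvar from the encoder):
# kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())  # Gaussian constraint
# anchor_loss, pseudo_labels = nebula(z)
# loss = recon_loss + kl + anchor_loss
```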
Related papers
- Exploring Beyond Logits: Hierarchical Dynamic Labeling Based on Embeddings for Semi-Supervised Classification [49.09505771145326]
We propose a Hierarchical Dynamic Labeling (HDL) algorithm that does not depend on model predictions and utilizes image embeddings to generate sample labels.
Our approach has the potential to change the paradigm of pseudo-label generation in semi-supervised learning.
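As a rough illustration of generating labels from embeddings rather than model logits (the HDL algorithm itself is hierarchical and more involved), one could assign pseudo-labels by clustering the embedding space:

```python
import numpy as np
from sklearn.cluster import KMeans

def embedding_pseudo_labels(embeddings: np.ndarray, n_classes: int) -> np.ndarray:
    """Assign pseudo-labels from embedding geometry instead of classifier logits."""
    km = KMeans(n_clusters=n_classes, n_init=10).fit(embeddings)
    return km.labels_

# labels = embedding_pseudo_labels(encoder_outputs, n_classes=10)
```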
arXiv Detail & Related papers (2024-04-26T06:00:27Z)
- Distributional Reduction: Unifying Dimensionality Reduction and Clustering with Gromov-Wasserstein [56.62376364594194]
Unsupervised learning aims to capture the underlying structure of potentially large and high-dimensional datasets.
In this work, we revisit these approaches under the lens of optimal transport and exhibit relationships with the Gromov-Wasserstein problem.
This unveils a new general framework, called distributional reduction, that recovers DR and clustering as special cases and allows addressing them jointly within a single optimization problem.
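A minimal sketch of the Gromov-Wasserstein coupling underlying this framework, using the POT library; the uniform weights and sizes are assumptions, and the full distributional-reduction objective is more general:

```python
import numpy as np
import ot  # Python Optimal Transport (POT)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 50))   # high-dimensional input points
Z = rng.normal(size=(10, 2))     # low-dimensional "reduced" support

C1 = ot.dist(X, X)               # intra-space distance matrices
C2 = ot.dist(Z, Z)
p = ot.unif(100)                 # uniform weights (assumption)
q = ot.unif(10)

# GW coupling T softly assigns input points to reduced points, doing
# dimensionality reduction and clustering within one transport plan.
T = ot.gromov.gromov_wasserstein(C1, C2, p, q, loss_fun="square_loss")
labels = T.argmax(axis=1)        # hard cluster read-out from the plan
```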
arXiv Detail & Related papers (2024-02-03T19:00:19Z)
- VTAE: Variational Transformer Autoencoder with Manifolds Learning [144.0546653941249]
Deep generative models have demonstrated successful applications in learning non-linear data distributions through a number of latent variables.
The nonlinearity of the generator implies that the latent space provides a distorted projection of the data space, which results in poor representation learning.
We show that geodesics and their accurate computation can substantially improve the performance of deep generative models.
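To make the geodesic point concrete: the straight line between two latent codes is generally not the shortest path on the data manifold, and a discrete geodesic can be approximated by minimizing the energy of the decoded curve. A hedged sketch (the decoder, step count, and optimizer settings are placeholders; VTAE's Riemannian machinery is more sophisticated):

```python
import torch

def discrete_geodesic(decoder, z0, z1, steps=10, iters=200, lr=1e-2):
    """Approximate a geodesic between latent codes z0 and z1 by minimizing
    the energy of the decoded curve (sum of squared segment lengths)."""
    # Interior points, initialized on the straight line between endpoints.
    ts = torch.linspace(0, 1, steps + 2)[1:-1, None]
    zs = ((1 - ts) * z0 + ts * z1).clone().requires_grad_(True)
    opt = torch.optim.Adam([zs], lr=lr)
    for _ in range(iters):
        opt.zero_grad()
        path = torch.cat([z0[None], zs, z1[None]], dim=0)
        x = decoder(path)                       # decode the whole curve
        energy = (x[1:] - x[:-1]).pow(2).sum()  # discrete curve energy
        energy.backward()
        opt.step()
    return torch.cat([z0[None], zs.detach(), z1[None]], dim=0)
```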
arXiv Detail & Related papers (2023-04-03T13:13:19Z)
- Variable Clustering via Distributionally Robust Nodewise Regression [7.289979396903827]
We study a multi-factor block model for variable clustering and connect it to regularized subspace clustering by formulating a distributionally robust version of nodewise regression.
We derive a convex relaxation, provide guidance on selecting the size of the robust region, and hence the regularization weighting parameter, based on the data, and propose an ADMM algorithm for implementation.
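Nodewise regression regresses each variable on all the others; the pattern of nonzero coefficients then suggests block (cluster) structure. A plain lasso sketch of that nodewise step, purely illustrative and not the paper's distributionally robust formulation or its ADMM solver:

```python
import numpy as np
from sklearn.linear_model import Lasso

def nodewise_coefficients(X: np.ndarray, alpha: float = 0.1) -> np.ndarray:
    """Regress each column of X on the remaining columns with a lasso penalty.
    Returns a (p, p) matrix B where B[j, k] is the coefficient of variable k
    when predicting variable j; the zero pattern hints at cluster blocks."""
    n, p = X.shape
    B = np.zeros((p, p))
    for j in range(p):
        others = np.delete(np.arange(p), j)
        fit = Lasso(alpha=alpha).fit(X[:, others], X[:, j])
        B[j, others] = fit.coef_
    return B
```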
arXiv Detail & Related papers (2022-12-15T16:23:25Z)
- Simplex Clustering via sBeta with Applications to Online Adjustment of Black-Box Predictions [16.876111500144667]
We introduce a novel probabilistic clustering method, referred to as k-sBetas.
We provide a general maximum a posteriori (MAP) perspective of clustering distributions.
Our code and comparisons with the existing simplex-clustering approaches and our introduced softmax-prediction benchmarks are publicly available.
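Since the inputs here are softmax predictions living on the probability simplex, a clustering baseline simply treats each prediction vector as a point; k-sBetas replaces the implicit Euclidean assumption with sBeta densities. A generic stand-in sketch, not the sBeta model itself:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
logits = rng.normal(size=(1000, 10))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)  # simplex points

# Naive baseline: Euclidean k-means directly on the simplex vectors.
assignments = KMeans(n_clusters=10, n_init=10).fit_predict(probs)
```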
arXiv Detail & Related papers (2022-07-30T18:29:11Z)
- GSMFlow: Generation Shifts Mitigating Flow for Generalized Zero-Shot Learning [55.79997930181418]
Generalized Zero-Shot Learning aims to recognize images from both the seen and unseen classes by transferring semantic knowledge from seen to unseen classes.
It is a promising solution to take advantage of generative models to hallucinate realistic unseen samples based on the knowledge learned from the seen classes.
We propose a novel flow-based generative framework that consists of multiple conditional affine coupling layers for learning unseen data generation.
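A conditional affine coupling layer, the building block named here, splits its input, predicts a scale and shift for one half from the other half plus a condition (e.g., a class semantic vector), and remains invertible. A minimal PyTorch sketch under assumed dimensions:

```python
import torch
import torch.nn as nn

class ConditionalAffineCoupling(nn.Module):
    """One invertible coupling block: x2 is scaled/shifted using (x1, condition)."""
    def __init__(self, dim=64, cond_dim=16, hidden=128):
        super().__init__()
        self.half = dim // 2
        self.net = nn.Sequential(
            nn.Linear(self.half + cond_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (dim - self.half)),
        )

    def forward(self, x, cond):
        x1, x2 = x[:, :self.half], x[:, self.half:]
        s, t = self.net(torch.cat([x1, cond], dim=1)).chunk(2, dim=1)
        s = torch.tanh(s)                      # keep scales numerically stable
        y2 = x2 * torch.exp(s) + t
        log_det = s.sum(dim=1)                 # log|det J| for the flow objective
        return torch.cat([x1, y2], dim=1), log_det

    def inverse(self, y, cond):
        y1, y2 = y[:, :self.half], y[:, self.half:]
        s, t = self.net(torch.cat([y1, cond], dim=1)).chunk(2, dim=1)
        s = torch.tanh(s)
        x2 = (y2 - t) * torch.exp(-s)
        return torch.cat([y1, x2], dim=1)
```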
arXiv Detail & Related papers (2022-07-05T04:04:37Z)
- Multi-level Latent Space Structuring for Generative Control [53.240701050423155]
We propose to leverage the StyleGAN generative architecture to devise a new truncation technique.
We do so by learning to re-generate W-space, the extended intermediate latent space of StyleGAN, using a learnable mixture of Gaussians.
The resulting truncation scheme is more faithful to the original untruncated samples and allows a better trade-off between quality and diversity.
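The usual StyleGAN truncation trick moves w toward a single global mean; the idea summarized here replaces that with a learned mixture of Gaussians. An offline approximation using a fitted GMM (the paper learns the mixture end-to-end; the sklearn fitting and the psi value are assumptions):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def mixture_truncate(w: np.ndarray, gmm: GaussianMixture, psi: float = 0.7):
    """Truncate latent codes toward the mean of their most likely mixture
    component instead of toward a single global average."""
    comp = gmm.predict(w)                 # responsible component per sample
    centers = gmm.means_[comp]            # per-sample truncation target
    return centers + psi * (w - centers)  # psi=1 keeps w, psi=0 snaps to center

# gmm = GaussianMixture(n_components=8).fit(w_samples)  # fit on sampled W codes
```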
arXiv Detail & Related papers (2022-02-11T21:26:17Z)
- Multi-Facet Clustering Variational Autoencoders [9.150555507030083]
High-dimensional data, such as images, typically feature multiple interesting characteristics one could cluster over.
We introduce Multi-Facet Clustering Variational Autoencoders (MFCVAE).
MFCVAE learns multiple clusterings simultaneously, and is trained fully unsupervised and end-to-end.
arXiv Detail & Related papers (2021-06-09T17:36:38Z)
- Semi-Supervised Disentanglement of Class-Related and Class-Independent Factors in VAE [4.533408938245526]
We propose a framework capable of disentangling class-related and class-independent factors of variation in data.
Our framework employs an attention mechanism in its latent space in order to improve the process of extracting class-related factors from data.
Experiments show that our framework disentangles class-related and class-independent factors of variation and learns interpretable features.
arXiv Detail & Related papers (2021-02-01T15:05:24Z)
- Cluster-level Feature Alignment for Person Re-identification [16.01713931617725]
This paper probes another feature alignment modality, namely cluster-level feature alignment across the whole dataset.
We propose an anchor loss and investigate several variants of cluster-level feature alignment, consisting of iterative aggregation and alignment computed over the whole dataset.
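A minimal sketch in the spirit of the anchor loss described here: maintain one anchor per cluster, aggregated iteratively over the dataset, and pull each feature toward its cluster's anchor. The EMA update and squared-distance form are assumptions:

```python
import torch

class ClusterAnchorLoss(torch.nn.Module):
    """Pull features toward per-cluster anchors aggregated over the dataset."""
    def __init__(self, num_clusters, feat_dim, momentum=0.9):
        super().__init__()
        self.register_buffer("anchors", torch.zeros(num_clusters, feat_dim))
        self.momentum = momentum

    @torch.no_grad()
    def update(self, feats, cluster_ids):
        # Iterative aggregation: EMA of the mean feature per cluster.
        for c in cluster_ids.unique():
            mean = feats[cluster_ids == c].mean(dim=0)
            self.anchors[c] = self.momentum * self.anchors[c] + (1 - self.momentum) * mean

    def forward(self, feats, cluster_ids):
        # Alignment: squared distance to the feature's own cluster anchor.
        return (feats - self.anchors[cluster_ids]).pow(2).sum(dim=1).mean()
```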
arXiv Detail & Related papers (2020-08-15T23:47:47Z)
- Set Based Stochastic Subsampling [85.5331107565578]
We propose a set-based two-stage end-to-end neural subsampling model that is jointly optimized with an arbitrary downstream task network.
We show that it outperforms the relevant baselines under low subsampling rates on a variety of tasks including image classification, image reconstruction, function reconstruction and few-shot classification.
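Stochastic set subsampling is often implemented by perturbing learned per-element scores and keeping the top-k. A hedged sketch of that generic pattern (the paper's model is two-stage and task-conditioned, which this omits, and the hard top-k below needs a relaxation to be differentiable):

```python
import torch

def gumbel_topk_subsample(scores: torch.Tensor, k: int) -> torch.Tensor:
    """Sample k indices without replacement, tilted by `scores`
    (Gumbel-top-k trick). `scores` has shape (set_size,)."""
    gumbel = -torch.log(-torch.log(torch.rand_like(scores)))
    return torch.topk(scores + gumbel, k).indices

# x_subset = x[gumbel_topk_subsample(score_net(x).squeeze(-1), k=16)]
```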
arXiv Detail & Related papers (2020-06-25T07:36:47Z)
- Improve Variational Autoencoder for Text Generation with Discrete Latent Bottleneck [52.08901549360262]
Variational autoencoders (VAEs) are essential tools in end-to-end representation learning.
With a strong auto-regressive decoder, VAEs tend to ignore the latent variables.
We propose a principled approach to enforce an implicit latent feature matching in a more compact latent space.
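The summary does not spell out the mechanism, but a standard way to enforce a compact discrete latent bottleneck is vector quantization with a straight-through gradient, sketched here as a generic stand-in rather than the paper's exact approach:

```python
import torch

class DiscreteBottleneck(torch.nn.Module):
    """Snap encoder outputs to the nearest codebook entry (VQ-style)."""
    def __init__(self, codebook_size=64, dim=32, beta=0.25):
        super().__init__()
        self.codebook = torch.nn.Parameter(torch.randn(codebook_size, dim))
        self.beta = beta

    def forward(self, z):
        idx = torch.cdist(z, self.codebook).argmin(dim=1)
        q = self.codebook[idx]
        # Codebook + commitment losses; straight-through gradient to the encoder.
        loss = (q - z.detach()).pow(2).mean() + self.beta * (z - q.detach()).pow(2).mean()
        return z + (q - z).detach(), loss, idx
```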
arXiv Detail & Related papers (2020-04-22T14:41:37Z)
- BasisVAE: Translation-invariant feature-level clustering with Variational Autoencoders [9.51828574518325]
Variational Autoencoders (VAEs) provide a flexible and scalable framework for non-linear dimensionality reduction.
We show how a collapsed variational inference scheme leads to scalable and efficient inference for BasisVAE.
arXiv Detail & Related papers (2020-03-06T23:10:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.