Representation Learning via Consistent Assignment of Views over Random Partitions
- URL: http://arxiv.org/abs/2310.12692v2
- Date: Fri, 27 Oct 2023 07:28:19 GMT
- Title: Representation Learning via Consistent Assignment of Views over Random Partitions
- Authors: Thalles Silva and Adín Ramírez Rivera
- Abstract summary: Consistent Assignment of Views over Random Partitions (CARP) is a self-supervised clustering method for representation learning.
We evaluate the capabilities of CARP's representations on 17 datasets across many standard protocols, including linear evaluation, few-shot classification, k-NN, k-means, image retrieval, and copy detection.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present Consistent Assignment of Views over Random Partitions (CARP), a
self-supervised clustering method for representation learning of visual
features. CARP learns prototypes in an end-to-end online fashion using gradient
descent without additional non-differentiable modules to solve the cluster
assignment problem. CARP optimizes a new pretext task based on random
partitions of prototypes that regularizes the model and enforces consistency
between views' assignments. Additionally, our method improves training
stability and prevents collapsed solutions in joint-embedding training. Through
an extensive evaluation, we demonstrate that CARP's representations are
suitable for learning downstream tasks. We evaluate the capabilities of CARP's
representations on 17 datasets across many standard protocols, including
linear evaluation, few-shot classification, k-NN, k-means, image retrieval,
and copy detection. We compare CARP's performance to that of 11 existing
self-supervised methods. We extensively ablate our method and demonstrate that
our proposed random partition pretext task improves the quality of the learned
representations by devising multiple random classification tasks. In transfer
learning tasks, CARP achieves the best performance on average compared to many
SSL methods trained for longer.
Related papers
- Words Matter: Leveraging Individual Text Embeddings for Code Generation in CLIP Test-Time Adaptation
We show how to leverage class text information to mitigate distribution drifts encountered by vision-language models (VLMs) during test-time inference.
We propose to generate pseudo-labels for the test-time samples by exploiting generic class text embeddings as fixed centroids of a label assignment problem.
Experiments on multiple popular test-time adaptation benchmarks of diverse complexity empirically show the superiority of CLIP-OT.
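To make the label-assignment idea concrete, here is a hypothetical sketch that pseudo-labels test samples against fixed class-text-embedding centroids, using a few Sinkhorn normalization steps to keep the assignment balanced; the exact CLIP-OT formulation may differ.

```python
# Pseudo-labeling against fixed text-embedding centroids via a balanced
# (Sinkhorn-normalized) assignment. Names and defaults are assumptions.
import torch

def pseudo_labels(image_emb, text_emb, n_iters=3, eps=0.05):
    """image_emb: (B, D), text_emb: (C, D); both assumed L2-normalized."""
    scores = image_emb @ text_emb.T          # (B, C) cosine similarities
    Q = torch.exp(scores / eps)              # transport kernel
    for _ in range(n_iters):                 # alternate row/column scaling
        Q = Q / Q.sum(dim=1, keepdim=True)   # each sample sums to 1
        Q = Q / Q.sum(dim=0, keepdim=True)   # each class used equally
    return Q.argmax(dim=1)                   # hard pseudo-labels
```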
arXiv Detail & Related papers (2024-11-26T00:15:37Z)
- BaFTA: Backprop-Free Test-Time Adaptation For Zero-Shot Vision-Language Models
We propose a novel backpropagation-free algorithm, BaFTA, for test-time adaptation of vision-language models.
BaFTA directly estimates class centroids using online clustering within a projected embedding space.
We demonstrate that BaFTA consistently outperforms state-of-the-art test-time adaptation methods in both effectiveness and efficiency.
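A minimal sketch of backprop-free online centroid estimation in that spirit follows; the initialization from text embeddings and the running-mean update are our assumptions, not BaFTA's exact procedure.

```python
# Online class-centroid refinement without backpropagation: assign each
# incoming embedding to its nearest centroid and update that centroid
# with a running mean. Illustrative, not the authors' exact algorithm.
import torch
import torch.nn.functional as F

class OnlineCentroids:
    def __init__(self, text_emb):
        # Initialize class centroids from (projected) text embeddings.
        self.centroids = F.normalize(text_emb.clone(), dim=-1)  # (C, D)
        self.counts = torch.ones(text_emb.shape[0])

    def update(self, z):
        """Assign one embedding z of shape (D,) and refine its centroid."""
        z = F.normalize(z, dim=-1)
        c = (self.centroids @ z).argmax()         # nearest centroid
        self.counts[c] += 1
        lr = 1.0 / self.counts[c]                 # running-mean step size
        self.centroids[c] = F.normalize(
            (1 - lr) * self.centroids[c] + lr * z, dim=-1)
        return c                                  # predicted class index
```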
arXiv Detail & Related papers (2024-06-17T08:16:24Z)
- Intra-task Mutual Attention based Vision Transformer for Few-Shot Learning
Humans possess a remarkable ability to accurately classify new, unseen images after being exposed to only a few examples.
For artificial neural network models, determining the most relevant features for distinguishing between two images with limited samples presents a challenge.
We propose an intra-task mutual attention method for few-shot learning that splits the support and query samples into patches.
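As a rough illustration of patch-level cross-attention between support and query samples (the patch split and attention wiring below are assumptions, not the paper's architecture):

```python
# Split images into patch tokens, then let query patches attend to
# support patches so the most relevant local features are emphasized.
import torch
import torch.nn as nn

def patchify(x, p=4):
    """x: (B, C, H, W) -> (B, N, C*p*p) patch tokens."""
    B, C, H, W = x.shape
    return (x.unfold(2, p, p).unfold(3, p, p)   # (B, C, H/p, W/p, p, p)
             .permute(0, 2, 3, 1, 4, 5)
             .reshape(B, -1, C * p * p))

attn = nn.MultiheadAttention(embed_dim=48, num_heads=4, batch_first=True)
support = patchify(torch.randn(5, 3, 32, 32))   # 5 support images
query = patchify(torch.randn(1, 3, 32, 32))     # 1 query image

# Cross-attention: query patches attend over all support patches
# (the symmetric support-to-query direction would work the same way).
q2s, _ = attn(query, support.reshape(1, -1, 48), support.reshape(1, -1, 48))
```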
arXiv Detail & Related papers (2024-05-06T02:02:57Z)
- Match me if you can: Semi-Supervised Semantic Correspondence Learning with Unpaired Images
This paper builds on the hypothesis that learning semantic correspondences is inherently data-hungry.
We demonstrate that a simple machine annotator reliably enriches paired keypoints via machine supervision.
Our models surpass current state-of-the-art models on semantic correspondence learning benchmarks like SPair-71k, PF-PASCAL, and PF-WILLOW.
arXiv Detail & Related papers (2023-11-30T13:22:15Z)
- Convolutional autoencoder-based multimodal one-class classification
One-class classification refers to approaches that learn from data of a single class only.
We propose a deep learning one-class classification method suitable for multimodal data.
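As an illustration of the general recipe (though not the paper's multimodal model), a convolutional autoencoder can be trained on the single available class, and inputs with high reconstruction error are then flagged as out-of-class:

```python
# Autoencoder-based one-class classification: the reconstruction error
# of a model trained on one class serves as the anomaly score.
# Architecture and threshold rule are illustrative assumptions.
import torch
import torch.nn as nn

class ConvAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1))

    def forward(self, x):
        return self.dec(self.enc(x))

def is_in_class(model, x, threshold):
    # Per-sample mean squared reconstruction error as the anomaly score.
    err = ((model(x) - x) ** 2).mean(dim=(1, 2, 3))
    return err < threshold
```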
arXiv Detail & Related papers (2023-09-25T12:31:18Z)
- Tuning Pre-trained Model via Moment Probing
We propose a novel Moment Probing (MP) method to explore the potential of linear probing (LP).
MP applies a linear classification head to the mean of the final features.
Our MP significantly outperforms LP and is competitive with counterparts at a lower training cost.
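A linear head over the mean of a frozen backbone's final features, as described above, can be sketched as follows (the backbone and names are placeholder assumptions):

```python
# Probe a frozen backbone: mean-pool its final token features and train
# only a linear classification head on top.
import torch
import torch.nn as nn

class MeanProbe(nn.Module):
    def __init__(self, backbone, feat_dim, n_classes):
        super().__init__()
        self.backbone = backbone.eval()       # frozen feature extractor
        for p in self.backbone.parameters():
            p.requires_grad = False
        self.head = nn.Linear(feat_dim, n_classes)

    def forward(self, x):
        with torch.no_grad():
            feats = self.backbone(x)          # assumed (B, N, D) tokens
        return self.head(feats.mean(dim=1))   # classify the mean feature
```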
arXiv Detail & Related papers (2023-07-21T04:15:02Z)
- Provably Efficient Representation Learning with Tractable Planning in Low-Rank POMDP
We study representation learning in partially observable Markov Decision Processes (POMDPs).
We first present an algorithm for decodable POMDPs that combines maximum likelihood estimation (MLE) and optimism in the face of uncertainty (OFU).
We then show how to adapt this algorithm to also work in the broader class of $\gamma$-observable POMDPs.
arXiv Detail & Related papers (2023-06-21T16:04:03Z)
- Efficient Self-Supervision using Patch-based Contrastive Learning for Histopathology Image Segmentation
We propose a framework for self-supervised image segmentation using contrastive learning on image patches.
A fully convolutional neural network (FCNN) is trained in a self-supervised manner to discern features in the input images.
The proposed model consists only of a simple FCNN with 10.8k parameters and requires about 5 minutes to converge on high-resolution microscopy datasets.
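For reference, a patch-level contrastive objective of this kind reduces to an InfoNCE-style loss over patch embeddings; the sketch below assumes two augmented views per patch, which may differ from the paper's exact pairing scheme.

```python
# InfoNCE over image-patch embeddings: matching patches across two
# augmented views are positives, all other patches are negatives.
import torch
import torch.nn.functional as F

def patch_info_nce(z_a, z_b, temp=0.2):
    """z_a, z_b: (P, D) embeddings of two augmented views of P patches."""
    z_a, z_b = F.normalize(z_a, dim=-1), F.normalize(z_b, dim=-1)
    logits = z_a @ z_b.T / temp              # (P, P) similarity matrix
    targets = torch.arange(z_a.shape[0])     # patch i matches patch i
    return F.cross_entropy(logits, targets)
```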
arXiv Detail & Related papers (2022-08-23T07:24:47Z)
- CAD: Co-Adapting Discriminative Features for Improved Few-Shot Classification
Few-shot classification is a challenging problem that aims to learn a model that can adapt to unseen classes given a few labeled samples.
Recent approaches pre-train a feature extractor and then fine-tune it with episodic meta-learning.
We propose a strategy to cross-attend and re-weight discriminative features for few-shot classification.
arXiv Detail & Related papers (2022-03-25T06:14:51Z)
- Unsupervised Learning of Visual Features by Contrasting Cluster Assignments
We propose an online algorithm, SwAV, that takes advantage of contrastive methods without requiring pairwise comparisons to be computed.
Our method simultaneously clusters the data while enforcing consistency between cluster assignments.
Our method can be trained with large and small batches and can scale to unlimited amounts of data.
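SwAV's swapped-prediction idea can be condensed as follows; note that SwAV computes the targets (its "codes") with a Sinkhorn equipartition step, which this sketch replaces with a plain softmax for brevity.

```python
# Swapped prediction: each view's cluster assignment is predicted from
# the other view, avoiding pairwise feature comparisons entirely.
import torch
import torch.nn.functional as F

def swapped_prediction_loss(z1, z2, prototypes, temp=0.1):
    """z1, z2: (B, D) view embeddings; prototypes: (K, D); all normalized."""
    s1 = z1 @ prototypes.T    # (B, K) scores for view 1
    s2 = z2 @ prototypes.T
    q1 = F.softmax(s1.detach() / temp, dim=-1)   # code for view 1
    q2 = F.softmax(s2.detach() / temp, dim=-1)   # code for view 2
    # Swap: view 1 predicts view 2's code and vice versa.
    return 0.5 * (F.cross_entropy(s1 / temp, q2)
                  + F.cross_entropy(s2 / temp, q1))
```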
arXiv Detail & Related papers (2020-06-17T14:00:42Z)
- Prototypical Contrastive Learning of Unsupervised Representations
Prototypical Contrastive Learning (PCL) is an unsupervised representation learning method.
PCL implicitly encodes semantic structures of the data into the learned embedding space.
PCL outperforms state-of-the-art instance-wise contrastive learning methods on multiple benchmarks.
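A simplified ProtoNCE-style objective illustrates the idea: each embedding is classified against the prototype set, with its cluster assignment (e.g., from k-means, assumed here) as the target.

```python
# Prototype-level contrastive loss: pull each embedding toward its
# assigned prototype and away from the others, encoding cluster-level
# semantic structure in the embedding space.
import torch
import torch.nn.functional as F

def proto_nce(z, prototypes, assignments, temp=0.1):
    """z: (B, D) embeddings; prototypes: (K, D); assignments: (B,) indices."""
    z = F.normalize(z, dim=-1)
    prototypes = F.normalize(prototypes, dim=-1)
    logits = z @ prototypes.T / temp   # (B, K) similarity to each prototype
    return F.cross_entropy(logits, assignments)
```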
arXiv Detail & Related papers (2020-05-11T09:53:36Z)