CrOC: Cross-View Online Clustering for Dense Visual Representation
Learning
- URL: http://arxiv.org/abs/2303.13245v1
- Date: Thu, 23 Mar 2023 13:24:16 GMT
- Title: CrOC: Cross-View Online Clustering for Dense Visual Representation
Learning
- Authors: Thomas Stegmüller, Tim Lebailly, Behzad Bozorgtabar, Tinne
Tuytelaars, Jean-Philippe Thiran
- Abstract summary: We propose a Cross-view consistency objective with an Online Clustering mechanism (CrOC) to discover and segment the semantics of the views.
In the absence of hand-crafted priors, the resulting method is more generalizable and does not require a cumbersome pre-processing step.
We demonstrate excellent performance on linear and unsupervised segmentation transfer tasks on various datasets.
- Score: 39.12950211289954
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning dense visual representations without labels is an arduous task and
more so from scene-centric data. We tackle this challenging problem
by proposing a Cross-view consistency objective with an Online Clustering
mechanism (CrOC) to discover and segment the semantics of the views. In the
absence of hand-crafted priors, the resulting method is more generalizable and
does not require a cumbersome pre-processing step. More importantly, the
clustering algorithm conjointly operates on the features of both views, thereby
elegantly bypassing the issue of content not represented in both views and the
ambiguous matching of objects from one crop to the other. We demonstrate
excellent performance on linear and unsupervised segmentation transfer tasks on
various datasets and similarly for video object segmentation. Our code and
pre-trained models are publicly available at https://github.com/stegmuel/CrOC.
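To make the idea of clustering "conjointly" on the features of both views more concrete, here is a minimal sketch. It is not the authors' implementation: it swaps in plain spherical k-means for CrOC's online clustering mechanism, and the function name `joint_kmeans`, the feature shapes, and the random inputs are all hypothetical. It only illustrates that pooling the features of both views before clustering yields one shared set of centroids, so clusters found in both views can be matched without a separate correspondence step.

```python
import numpy as np

def joint_kmeans(feats_a, feats_b, k, iters=10, seed=0, eps=1e-12):
    """Cluster the features of two views together (plain spherical k-means).

    feats_a: (Na, D) features of view A; feats_b: (Nb, D) features of view B.
    Returns per-view cluster labels and the shared (k, D) centroids.
    Assumption: this is a stand-in for, not a copy of, the CrOC clustering step.
    """
    rng = np.random.default_rng(seed)
    feats = np.concatenate([feats_a, feats_b], axis=0)
    # L2-normalise so that a dot product equals cosine similarity.
    feats = feats / (np.linalg.norm(feats, axis=1, keepdims=True) + eps)
    # Initialise centroids from k distinct feature vectors (copied by fancy indexing).
    centroids = feats[rng.choice(len(feats), size=k, replace=False)]
    for _ in range(iters):
        # Assign every feature, from either view, to its most similar centroid.
        labels = np.argmax(feats @ centroids.T, axis=1)
        for c in range(k):
            members = feats[labels == c]
            if len(members) > 0:  # leave an empty cluster's centroid unchanged
                mean = members.mean(axis=0)
                centroids[c] = mean / (np.linalg.norm(mean) + eps)
    labels = np.argmax(feats @ centroids.T, axis=1)
    return labels[: len(feats_a)], labels[len(feats_a):], centroids

# Hypothetical inputs: 16 patch features for view A, 12 for view B, dimension 8.
rng = np.random.default_rng(0)
view_a = rng.normal(size=(16, 8))
view_b = rng.normal(size=(12, 8))
la, lb, cents = joint_kmeans(view_a, view_b, k=4)

# Clusters that appear in both views are the ones a cross-view
# consistency objective could compare; the rest are simply ignored.
shared = np.intersect1d(la, lb)
```

Because both views share one set of centroids, content visible in only one crop ends up in clusters that are absent from `shared`, which is one simple way to picture how joint clustering sidesteps ambiguous crop-to-crop matching.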
Related papers
- Deep Multi-View Subspace Clustering with Anchor Graph [11.291831842959926]
We propose a novel deep multi-view subspace clustering method with anchor graph (DMCAG).
DMCAG learns the embedded features for each view independently, which are used to obtain the subspace representations.
Our method achieves superior clustering performance over other state-of-the-art methods.
arXiv Detail & Related papers (2023-05-11T16:17:43Z)
- Unified Mask Embedding and Correspondence Learning for Self-Supervised Video Segmentation [76.40565872257709]
We develop a unified framework which simultaneously models cross-frame dense correspondence for locally discriminative feature learning.
It is able to directly learn to perform mask-guided sequential segmentation from unlabeled videos.
Our algorithm sets the state of the art on two standard benchmarks (i.e., DAVIS17 and YouTube-VOS).
arXiv Detail & Related papers (2023-03-17T16:23:36Z) - DeepCut: Unsupervised Segmentation using Graph Neural Networks
Clustering [6.447863458841379]
This study introduces a lightweight Graph Neural Network (GNN) to replace classical clustering methods.
Unlike existing methods, our GNN takes both the pair-wise affinities between local image features and the raw features as input.
We demonstrate how classical clustering objectives can be formulated as self-supervised loss functions for training an image segmentation GNN.
arXiv Detail & Related papers (2022-12-12T12:31:46Z) - GOCA: Guided Online Cluster Assignment for Self-Supervised Video
Representation Learning [49.69279760597111]
Clustering is a ubiquitous tool in unsupervised learning.
Most existing self-supervised representation learning methods cluster samples based on visually dominant features.
We propose a principled way to combine two views. Specifically, we propose a novel clustering strategy where we use the initial cluster assignment of each view as prior to guide the final cluster assignment of the other view.
arXiv Detail & Related papers (2022-07-20T19:26:55Z) - Unsupervised Visual Representation Learning by Online Constrained
K-Means [44.38989920488318]
Cluster discrimination is an effective pretext task for unsupervised representation learning.
We propose a novel clustering-based pretext task with online Constrained K-means (CoKe).
Our online assignment method has a theoretical guarantee to approach the global optimum.
arXiv Detail & Related papers (2021-05-24T20:38:32Z) - Temporally-Weighted Hierarchical Clustering for Unsupervised Action
Segmentation [96.67525775629444]
Action segmentation refers to inferring boundaries of semantically consistent visual concepts in videos.
We present a fully automatic and unsupervised approach for segmenting actions in a video that does not require any training.
Our proposal is an effective temporally-weighted hierarchical clustering algorithm that can group semantically consistent frames of the video.
arXiv Detail & Related papers (2021-03-20T23:30:01Z) - Unsupervised Semantic Segmentation by Contrasting Object Mask Proposals [78.12377360145078]
We introduce a novel two-step framework that adopts a predetermined prior in a contrastive optimization objective to learn pixel embeddings.
This marks a large deviation from existing works that relied on proxy tasks or end-to-end clustering.
In particular, when fine-tuning the learned representations using just 1% of labeled examples on PASCAL, we outperform supervised ImageNet pre-training by 7.1% mIoU.
arXiv Detail & Related papers (2021-02-11T18:54:47Z) - Learning to Cluster Faces via Confidence and Connectivity Estimation [136.5291151775236]
We propose a fully learnable clustering framework without requiring a large number of overlapped subgraphs.
Our method significantly improves clustering accuracy and thus performance of the recognition models trained on top, yet it is an order of magnitude more efficient than existing supervised methods.
arXiv Detail & Related papers (2020-04-01T13:39:37Z) - GATCluster: Self-Supervised Gaussian-Attention Network for Image
Clustering [9.722607434532883]
We propose a self-supervised clustering network for image clustering (GATCluster).
Rather than extracting intermediate features first and then performing traditional clustering, GATCluster directly outputs semantic cluster labels without further post-processing.
We develop a two-step learning algorithm that is memory-efficient for clustering large-size images.
arXiv Detail & Related papers (2020-02-27T00:57:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.