Deep Transformation-Invariant Clustering
- URL: http://arxiv.org/abs/2006.11132v2
- Date: Tue, 27 Oct 2020 18:08:13 GMT
- Title: Deep Transformation-Invariant Clustering
- Authors: Tom Monnier, Thibault Groueix, Mathieu Aubry
- Abstract summary: We present an approach that does not rely on abstract features but instead learns to predict image transformations.
This learning process naturally fits in the gradient-based training of K-means and Gaussian mixture model.
We demonstrate that our novel approach yields competitive and highly promising results on standard image clustering benchmarks.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in image clustering typically focus on learning better deep
representations. In contrast, we present an orthogonal approach that does not
rely on abstract features but instead learns to predict image transformations
and performs clustering directly in image space. This learning process
naturally fits in the gradient-based training of K-means and Gaussian mixture
model, without requiring any additional loss or hyper-parameters. It leads us
to two new deep transformation-invariant clustering frameworks, which jointly
learn prototypes and transformations. More specifically, we use deep learning
modules that enable us to resolve invariance to spatial, color and
morphological transformations. Our approach is conceptually simple and comes
with several advantages, including the possibility to easily adapt the desired
invariance to the task and a strong interpretability of both cluster centers
and assignments to clusters. We demonstrate that our novel approach yields
competitive and highly promising results on standard image clustering
benchmarks. Finally, we showcase its robustness and the advantages of its
improved interpretability by visualizing clustering results over real
photograph collections.
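The abstract describes clustering directly in image space: each image is compared to the best *transformed* version of each prototype, and prototypes and assignments are updated jointly. A minimal NumPy sketch of that idea on toy 1-D signals, using circular shifts as a stand-in for the paper's deep spatial, color, and morphological transformation modules (function names and the toy transformation family are illustrative, not the authors' implementation):

```python
import numpy as np

def shifted(signal, s):
    """Circularly shift a 1-D signal by s samples (our toy transformation family)."""
    return np.roll(signal, s)

def ti_kmeans(images, n_clusters, shifts, n_iter=10):
    """Toy transformation-invariant K-means.

    The distance between an image and a prototype is minimized over a small
    family of transformations (here: circular shifts), so images are clustered
    modulo the transformation. Prototypes are initialized from the first
    n_clusters images for determinism.
    """
    protos = images[:n_clusters].astype(float).copy()
    assign = np.zeros(len(images), dtype=int)
    for _ in range(n_iter):
        # Assignment step: distance to the best *transformed* prototype.
        dists = np.stack([
            np.min([((shifted(p, s) - images) ** 2).sum(axis=1) for s in shifts], axis=0)
            for p in protos
        ])                                   # shape (n_clusters, n_images)
        assign = dists.argmin(axis=0)
        # Update step: align each member to its prototype, then average.
        for k in range(n_clusters):
            members = images[assign == k]
            if len(members) == 0:
                continue
            aligned = []
            for x in members:
                best_s = min(shifts, key=lambda s: ((shifted(protos[k], s) - x) ** 2).sum())
                aligned.append(shifted(x, -best_s))   # undo the transformation
            protos[k] = np.mean(aligned, axis=0)
    return protos, assign
```

With two bump patterns and their shifted copies, the shifted copies land in the same cluster as their originals even though their raw pixel distance is large, which is the invariance the paper's frameworks learn end-to-end with gradient-based transformation modules.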
Related papers
- A Spitting Image: Modular Superpixel Tokenization in Vision Transformers [0.0]
Vision Transformer (ViT) architectures traditionally employ a grid-based approach to tokenization independent of the semantic content of an image.
We propose a modular superpixel tokenization strategy which decouples tokenization and feature extraction.
arXiv Detail & Related papers (2024-08-14T17:28:58Z)
- Superpixel Graph Contrastive Clustering with Semantic-Invariant Augmentations for Hyperspectral Images [64.72242126879503]
Hyperspectral images (HSI) clustering is an important but challenging task.
We first use 3-D and 2-D hybrid convolutional neural networks to extract the high-order spatial and spectral features of HSI.
We then design a superpixel graph contrastive clustering model to learn discriminative superpixel representations.
arXiv Detail & Related papers (2024-03-04T07:40:55Z)
- Grid Jigsaw Representation with CLIP: A New Perspective on Image Clustering [37.15595383168132]
We propose a jigsaw-based strategy for image clustering, Grid Jigsaw Representation (GJR), with a systematic exposition from pixel to feature that contrasts human and computer perception.
GJR modules are appended to a variety of deep convolutional networks and tested with significant improvements on a wide range of benchmark datasets.
Experimental results show the method's effectiveness on the clustering task on three metrics (ACC, NMI, and ARI), with very fast convergence.
arXiv Detail & Related papers (2023-10-27T03:07:05Z)
- Motion Estimation for Large Displacements and Deformations [7.99536002595393]
Variational optical flow techniques based on a coarse-to-fine scheme interpolate sparse matches and locally optimize an energy model conditioned on colour, gradient and smoothness.
This paper addresses this problem and presents HybridFlow, a variational motion estimation framework for large displacements and deformations.
arXiv Detail & Related papers (2022-06-24T18:53:22Z)
- Learning Contrastive Representation for Semantic Correspondence [150.29135856909477]
We propose a multi-level contrastive learning approach for semantic matching.
We show that image-level contrastive learning is a key component to encourage the convolutional features to find correspondence between similar objects.
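Image-level contrastive learning of this kind typically builds on an InfoNCE-style objective. A minimal NumPy sketch of that generic loss (illustrative of the family, not this paper's exact multi-level formulation):

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """Generic InfoNCE loss: the i-th positive is the match for the i-th
    anchor, and all other positives in the batch act as negatives.
    Lower loss means matched pairs are more similar than mismatched ones."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = (a @ p.T) / temperature               # cosine similarities / T
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_softmax)))
```

Minimizing this loss pulls each anchor toward its positive and pushes it away from the other samples in the batch, which is what encourages the features to find correspondences between similar objects.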
arXiv Detail & Related papers (2021-09-22T18:34:14Z)
- Clustering by Maximizing Mutual Information Across Views [62.21716612888669]
We propose a novel framework for image clustering that incorporates joint representation learning and clustering.
Our method significantly outperforms state-of-the-art single-stage clustering methods across a variety of image datasets.
arXiv Detail & Related papers (2021-07-24T15:36:49Z)
- Graph Contrastive Clustering [131.67881457114316]
We propose a novel graph contrastive learning framework, which is then applied to the clustering task, yielding the Graph Contrastive Clustering (GCC) method.
Specifically, on the one hand, the graph Laplacian based contrastive loss is proposed to learn more discriminative and clustering-friendly features.
On the other hand, a novel graph-based contrastive learning strategy is proposed to learn more compact clustering assignments.
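The "graph Laplacian based contrastive loss" is only named here, but the Laplacian smoothness term underlying such losses is standard. A hedged NumPy sketch of that generic term (not GCC's exact objective): tr(FᵀLF) is small exactly when nodes connected by high affinity carry similar features.

```python
import numpy as np

def laplacian_smoothness(features, affinity):
    """Graph-smoothness term tr(F^T L F) = 0.5 * sum_ij A_ij * ||f_i - f_j||^2.

    Small when strongly connected nodes have similar features; adding such a
    term to a contrastive objective pushes features to respect the graph,
    i.e. to be "clustering-friendly". Generic sketch, not GCC's loss.
    """
    degree = affinity.sum(axis=1)
    lap = np.diag(degree) - affinity       # unnormalized graph Laplacian
    return float(np.trace(features.T @ lap @ features))
```

For a two-node graph with an edge of weight 1, identical node features give a term of 0, while differing features are penalized in proportion to their squared difference.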
arXiv Detail & Related papers (2021-04-03T15:32:49Z)
- Interpretable Image Clustering via Diffeomorphism-Aware K-Means [20.747301413801843]
We develop a measure of similarity between images and centroids that encompasses a general class of deformations: diffeomorphisms.
We show that our approach competes with state-of-the-art methods on various datasets.
arXiv Detail & Related papers (2020-12-16T16:11:39Z)
- Scattering Transform Based Image Clustering using Projection onto Orthogonal Complement [2.0305676256390934]
We introduce Projected-Scattering Spectral Clustering (PSSC), a state-of-the-art, stable, and fast algorithm for image clustering.
PSSC includes a novel method to exploit the geometric structure of the scattering transform of small images.
Our experiments demonstrate that PSSC obtains the best results among all shallow clustering algorithms.
arXiv Detail & Related papers (2020-11-23T17:59:03Z)
- Invariant Deep Compressible Covariance Pooling for Aerial Scene Categorization [80.55951673479237]
We propose a novel invariant deep compressible covariance pooling (IDCCP) to solve nuisance variations in aerial scene categorization.
We conduct extensive experiments on the publicly released aerial scene image data sets and demonstrate the superiority of this method compared with state-of-the-art methods.
arXiv Detail & Related papers (2020-11-11T11:13:07Z)
- FeatMatch: Feature-Based Augmentation for Semi-Supervised Learning [64.32306537419498]
We propose a novel learned feature-based refinement and augmentation method that produces a varied set of complex transformations.
These transformations also use information from both within-class and across-class representations that we extract through clustering.
We demonstrate that our method is comparable to the current state of the art on smaller datasets while being able to scale up to larger ones.
arXiv Detail & Related papers (2020-07-16T17:55:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.