Text-Guided Alternative Image Clustering
- URL: http://arxiv.org/abs/2406.18589v1
- Date: Fri, 7 Jun 2024 08:37:57 GMT
- Title: Text-Guided Alternative Image Clustering
- Authors: Andreas Stephan, Lukas Miklautz, Collin Leiber, Pedro Henrique Luz de Araujo, Dominik Répás, Claudia Plant, Benjamin Roth
- Abstract summary: This work explores the potential of large vision-language models to facilitate alternative image clustering.
We propose Text-Guided Alternative Image Consensus Clustering (TGAICC), a novel approach that leverages user-specified interests via prompts.
TGAICC outperforms image- and text-based baselines on four alternative image clustering benchmark datasets.
- Score: 11.103514372355088
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Traditional image clustering techniques only find a single grouping within visual data. In particular, they do not provide a possibility to explicitly define multiple types of clustering. This work explores the potential of large vision-language models to facilitate alternative image clustering. We propose Text-Guided Alternative Image Consensus Clustering (TGAICC), a novel approach that leverages user-specified interests via prompts to guide the discovery of diverse clusterings. To achieve this, it generates a clustering for each prompt, groups them using hierarchical clustering, and then aggregates them using consensus clustering. TGAICC outperforms image- and text-based baselines on four alternative image clustering benchmark datasets. Furthermore, using count-based word statistics, we are able to obtain text-based explanations of the alternative clusterings. In conclusion, our research illustrates how contemporary large vision-language models can transform explanatory data analysis, enabling the generation of insightful, customizable, and diverse image clusterings.
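The three stages named in the abstract (one clustering per prompt, hierarchical grouping of those clusterings, consensus aggregation within each group) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. It assumes the per-prompt clusterings are already available as label lists, and it makes three choices the abstract does not specify: the Rand index as the similarity between clusterings, average linkage for the hierarchical step, and a co-association matrix for the consensus step. All function names are hypothetical.

```python
from itertools import combinations


def rand_index(a, b):
    """Fraction of item pairs on which two clusterings agree (same/different cluster)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def agglomerate(dist, k):
    """Average-linkage agglomerative clustering on a square distance matrix."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for x, y in combinations(range(len(clusters)), 2):
            d = sum(dist[i][j] for i in clusters[x] for j in clusters[y])
            d /= len(clusters[x]) * len(clusters[y])
            if best is None or d < best[0]:
                best = (d, x, y)
        _, x, y = best
        clusters[x] += clusters[y]  # merge the closest pair (x < y always holds)
        del clusters[y]
    labels = [0] * len(dist)
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels


def consensus(clusterings, k):
    """Co-association consensus: items that often share a cluster end up together."""
    n, m = len(clusterings[0]), len(clusterings)
    dist = [[1 - sum(c[i] == c[j] for c in clusterings) / m for j in range(n)]
            for i in range(n)]
    return agglomerate(dist, k)


def alternative_clusterings(prompt_clusterings, n_alternatives, k):
    """Group per-prompt clusterings, then return one consensus clustering per group."""
    m = len(prompt_clusterings)
    # Assumption: distance between two clusterings is 1 - Rand index.
    dist = [[1 - rand_index(prompt_clusterings[a], prompt_clusterings[b])
             for b in range(m)] for a in range(m)]
    groups = agglomerate(dist, n_alternatives)
    return [
        consensus([c for c, g in zip(prompt_clusterings, groups) if g == group], k)
        for group in range(n_alternatives)
    ]


# Toy data: six images, clustered once per (hypothetical) prompt.
prompt_clusterings = [
    [0, 0, 0, 1, 1, 1],  # a colour-related prompt
    [1, 1, 1, 0, 0, 0],  # same partition, labels permuted
    [0, 1, 0, 1, 0, 1],  # a shape-related prompt
    [0, 1, 0, 1, 0, 0],  # shape-related prompt with one noisy assignment
    [1, 0, 1, 0, 1, 0],  # shape partition again, labels permuted
]
alternatives = alternative_clusterings(prompt_clusterings, n_alternatives=2, k=2)
print(alternatives)
```

On this toy input the grouping step separates the two colour-related clusterings from the three shape-related ones, and the consensus step outvotes the single noisy assignment. Using one `k` for every alternative is a simplification of this sketch; real alternative clusterings may have different numbers of clusters.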
Related papers
- Organizing Unstructured Image Collections using Natural Language [37.16101036513514]
We introduce the task Semantic Multiple Clustering (SMC) that aims to automatically discover clustering criteria from large image collections.
Our framework, Text Driven Semantic Multiple Clustering (TeDeSC), uses text as a proxy to concurrently reason over large image collections.
We apply TeDeSC to various applications, such as discovering biases and analyzing social media image popularity.
arXiv Detail & Related papers (2024-10-07T17:21:46Z)
- Text-Guided Image Clustering [15.217924518131268]
We propose Text-Guided Image Clustering, i.e., generating text using image captioning and visual question-answering (VQA) models.
Across eight diverse image clustering datasets, our results show that the obtained text representations often outperform image features.
arXiv Detail & Related papers (2024-02-05T13:34:21Z)
- Generalized Category Discovery with Clustering Assignment Consistency [56.92546133591019]
Generalized category discovery (GCD) is a recently proposed open-world task.
We propose a co-training-based framework that encourages clustering consistency.
Our method achieves state-of-the-art performance on three generic benchmarks and three fine-grained visual recognition datasets.
arXiv Detail & Related papers (2023-10-30T00:32:47Z)
- Image Clustering Conditioned on Text Criteria [14.704110575570166]
We present a new method for performing image clustering based on user-specified text criteria.
We call our method Image Clustering Conditioned on Text Criteria (IC|TC).
IC|TC requires a minimal and practical degree of human intervention and grants the user significant control over the clustering results in return.
arXiv Detail & Related papers (2023-10-27T17:35:01Z)
- Reinforcement Graph Clustering with Unknown Cluster Number [91.4861135742095]
We propose a new deep graph clustering method termed Reinforcement Graph Clustering.
In our proposed method, cluster number determination and unsupervised representation learning are unified into a uniform framework.
In order to conduct feedback actions, the clustering-oriented reward function is proposed to enhance the cohesion of the same clusters and separate the different clusters.
arXiv Detail & Related papers (2023-08-13T18:12:28Z)
- CEIL: A General Classification-Enhanced Iterative Learning Framework for Text Clustering [16.08402937918212]
We propose a novel Classification-Enhanced Iterative Learning framework for short text clustering.
In each iteration, we first adopt a language model to retrieve the initial text representations.
After strict data filtering and aggregation processes, samples with clean category labels are retrieved, which serve as supervision information.
Finally, the updated language model with improved representation ability is used to enhance clustering in the next iteration.
arXiv Detail & Related papers (2023-04-20T14:04:31Z)
- ACTIVE:Augmentation-Free Graph Contrastive Learning for Partial Multi-View Clustering [52.491074276133325]
We propose an augmentation-free graph contrastive learning framework to solve the problem of partial multi-view clustering.
The proposed approach elevates instance-level contrastive learning and missing data inference to the cluster-level, effectively mitigating the impact of individual missing data on clustering.
arXiv Detail & Related papers (2022-03-01T02:32:25Z)
- Self-supervised Contrastive Attributed Graph Clustering [110.52694943592974]
We propose a novel attributed graph clustering network, namely Self-supervised Contrastive Attributed Graph Clustering (SCAGC).
In SCAGC, a self-supervised contrastive loss that leverages inaccurate clustering labels is designed for node representation learning.
For out-of-sample (OOS) nodes, SCAGC can directly calculate their clustering labels.
arXiv Detail & Related papers (2021-10-15T03:25:28Z)
- Compositional Clustering: Applications to Multi-Label Object Recognition and Speaker Identification [19.470445399577265]
We consider a novel clustering task in which clusters can have compositional relationships.
We propose three new algorithms that can partition examples into coherent groups and infer the compositional structure among them.
Our work has applications to open-world multi-label object recognition and speaker identification & diarization with simultaneous speech from multiple speakers.
arXiv Detail & Related papers (2021-09-09T10:42:14Z)
- You Never Cluster Alone [150.94921340034688]
We extend the mainstream contrastive learning paradigm to a cluster-level scheme, where all data belonging to the same cluster contribute to a unified representation.
We define a set of categorical variables as clustering assignment confidence, which links the instance-level learning track with the cluster-level one.
By reparametrizing the assignment variables, TCC is trained end-to-end, requiring no alternating steps.
arXiv Detail & Related papers (2021-06-03T14:59:59Z)
- Graph Contrastive Clustering [131.67881457114316]
We propose a novel graph contrastive learning framework and apply it to the clustering task, yielding the Graph Contrastive Clustering (GCC) method.
Specifically, on the one hand, the graph Laplacian based contrastive loss is proposed to learn more discriminative and clustering-friendly features.
On the other hand, a novel graph-based contrastive learning strategy is proposed to learn more compact clustering assignments.
arXiv Detail & Related papers (2021-04-03T15:32:49Z)