ESMC: MLLM-Based Embedding Selection for Explainable Multiple Clustering
- URL: http://arxiv.org/abs/2512.00725v1
- Date: Sun, 30 Nov 2025 04:36:51 GMT
- Title: ESMC: MLLM-Based Embedding Selection for Explainable Multiple Clustering
- Authors: Xinyue Wang, Yuheng Jia, Hui Liu, Junhui Hou,
- Abstract summary: Multi-modal large language models (MLLMs) can be leveraged to achieve user-driven clustering.<n>Our method first discovers that MLLMs' hidden states of text tokens are strongly related to the corresponding features.<n>We also employ a lightweight clustering head augmented with pseudo-label learning, significantly enhancing clustering accuracy.
- Score: 79.69917150582633
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Typical deep clustering methods, while achieving notable progress, can only provide one clustering result per dataset. This limitation arises from their assumption of a fixed underlying data distribution, which may fail to meet user needs and provide unsatisfactory clustering outcomes. Our work investigates how multi-modal large language models (MLLMs) can be leveraged to achieve user-driven clustering, emphasizing their adaptability to user-specified semantic requirements. However, directly using MLLM output for clustering has risks for producing unstructured and generic image descriptions instead of feature-specific and concrete ones. To address these issues, our method first discovers that MLLMs' hidden states of text tokens are strongly related to the corresponding features, and leverages these embeddings to perform clusterings from any user-defined criteria. We also employ a lightweight clustering head augmented with pseudo-label learning, significantly enhancing clustering accuracy. Extensive experiments demonstrate its competitive performance on diverse datasets and metrics.
Related papers
- ClusterFusion: Hybrid Clustering with Embedding Guidance and LLM Adaptation [52.794544682493814]
Large language models (LLMs) provide strong contextual reasoning, yet prior work mainly uses them as auxiliary modules to refine embeddings or adjust cluster boundaries.<n>We propose ClusterFusion, a hybrid framework that treats the LLM as the clustering core, guided by lightweight embedding methods.<n> Experiments on three public benchmarks and two new domain-specific datasets demonstrate that ClusterFusion achieves state-of-the-art performance on standard tasks.
arXiv Detail & Related papers (2025-12-04T00:49:43Z) - LLM-MemCluster: Empowering Large Language Models with Dynamic Memory for Text Clustering [52.41664454251679]
Large Language Models (LLMs) are reshaping unsupervised learning by offering an unprecedented ability to perform text clustering.<n>Existing methods often rely on complex pipelines with external modules, sacrificing a truly end-to-end approach.<n>We introduce LLM-MemCluster, a novel framework that reconceptualizes clustering as a fully LLM-native task.
arXiv Detail & Related papers (2025-11-19T13:22:08Z) - In-Context Clustering with Large Language Models [50.25868718329313]
ICC captures complex relationships among inputs through an attention mechanism.<n>We show that pretrained LLMs exhibit impressive zero-shot clustering capabilities on text-encoded numeric data.<n>Our work extends in-context learning to an unsupervised setting, showcasing the effectiveness and flexibility of LLMs for clustering.
arXiv Detail & Related papers (2025-10-09T17:07:55Z) - Agent-Centric Personalized Multiple Clustering with Multi-Modal LLMs [40.38930402847949]
We propose an agent-centric personalized clustering framework.<n>Agents traverse a relational graph to search for clusters based on user interests.<n>Results show that the proposed method achieves NMI scores of 0.9667 and 0.9481 on the Card Order and Card Suits benchmarks.
arXiv Detail & Related papers (2025-03-28T08:45:15Z) - Personalized Clustering via Targeted Representation Learning [12.685373069492448]
Clustering traditionally aims to reveal a natural grouping structure within unlabeled data.<n>We propose a personalized clustering method that explicitly performs targeted representation learning.
arXiv Detail & Related papers (2024-12-18T10:28:51Z) - Text Clustering as Classification with LLMs [9.128151647718251]
We propose a novel framework that reframes text clustering as a classification task by harnessing the in-context learning capabilities of Large Language Models.<n>By leveraging the advanced natural language understanding and generalization capabilities of LLMs, the proposed approach enables effective clustering with minimal human intervention.<n> Experimental results on diverse datasets demonstrate that our framework achieves comparable or superior performance to state-of-the-art embedding-based clustering techniques.
arXiv Detail & Related papers (2024-09-30T16:57:34Z) - Large Language Models Enable Few-Shot Clustering [88.06276828752553]
We show that large language models can amplify an expert's guidance to enable query-efficient, few-shot semi-supervised text clustering.
We find incorporating LLMs in the first two stages can routinely provide significant improvements in cluster quality.
arXiv Detail & Related papers (2023-07-02T09:17:11Z) - A One-shot Framework for Distributed Clustered Learning in Heterogeneous
Environments [54.172993875654015]
The paper proposes a family of communication efficient methods for distributed learning in heterogeneous environments.
One-shot approach, based on local computations at the users and a clustering based aggregation step at the server is shown to provide strong learning guarantees.
For strongly convex problems it is shown that, as long as the number of data points per user is above a threshold, the proposed approach achieves order-optimal mean-squared error rates in terms of the sample size.
arXiv Detail & Related papers (2022-09-22T09:04:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.