Convex Clustering
- URL: http://arxiv.org/abs/2507.09077v1
- Date: Fri, 11 Jul 2025 23:38:15 GMT
- Title: Convex Clustering
- Authors: Eric C. Chi, Aaron J. Molstad, Zheming Gao
- Abstract summary: Convex clustering has several uncommon features that distinguish it from prior art. Its unique global minimizer is stable with respect to all its inputs. We highlight important algorithms and give insight into how their computational costs scale with the problem size.
- Score: 4.577104493960515
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: This survey reviews a clustering method based on solving a convex optimization problem. Despite the plethora of existing clustering methods, convex clustering has several uncommon features that distinguish it from prior art. The optimization problem is free of spurious local minima, and its unique global minimizer is stable with respect to all its inputs, including the data, a tuning parameter, and weight hyperparameters. Its single tuning parameter controls the number of clusters and can be chosen using standard techniques from penalized regression. We give intuition into the behavior and theory for convex clustering as well as practical guidance. We highlight important algorithms and give insight into how their computational costs scale with the problem size. Finally, we highlight the breadth of its uses and flexibility to be combined and integrated with other inferential methods.
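For concreteness, convex clustering (also known as sum-of-norms clustering) gives each data point x_i its own centroid u_i and solves

minimize over U:  (1/2) Σ_i ||x_i − u_i||² + γ Σ_{i<j} w_ij ||u_i − u_j||,

where the tuning parameter γ controls how aggressively centroids fuse: at γ = 0 every point is its own cluster, and for large enough γ all centroids coalesce into one. The sketch below is a minimal illustration of this objective on toy data; it is not the survey's recommended algorithm (the paper covers specialized solvers such as ADMM- and AMA-type methods whose costs scale far better), and the uniform weights, the value γ = 0.05, and the use of a general-purpose smooth solver on a nonsmooth objective are all simplifying assumptions.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.cluster.hierarchy import fcluster, linkage

def sum_of_norms_objective(u_flat, X, gamma, W):
    """Convex clustering objective:
    0.5 * sum_i ||x_i - u_i||^2 + gamma * sum_{i<j} w_ij * ||u_i - u_j||."""
    n, p = X.shape
    U = u_flat.reshape(n, p)
    fidelity = 0.5 * np.sum((X - U) ** 2)
    fusion = sum(W[i, j] * np.linalg.norm(U[i] - U[j])
                 for i in range(n) for j in range(i + 1, n))
    return fidelity + gamma * fusion

# Toy data: two well-separated groups in the plane.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (5, 2)),
               rng.normal(3.0, 0.1, (5, 2))])
W = np.ones((len(X), len(X)))  # uniform weights (a simplifying assumption)

# Warm-start at U = X. A quasi-Newton solver only approximates the
# nonsmooth fusion penalty, so "fused" centroids agree up to a tolerance.
res = minimize(sum_of_norms_objective, X.ravel(),
               args=(X, 0.05, W), method="L-BFGS-B")
U = res.x.reshape(X.shape)

# Points whose fitted centroids (nearly) coincide form one cluster.
labels = fcluster(linkage(U), t=0.5, criterion="distance")
print(labels)
```

Sweeping γ from 0 upward traces out an entire clustering path, which is how the single tuning parameter controls the number of clusters.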
Related papers
- A Graph-Partitioning Based Continuous Optimization Approach to Semi-supervised Clustering Problems [24.208152437317768]
We view the semi-supervised clustering task as a partitioning problem on a graph associated with the given dataset. We propose a block coordinate descent algorithm to efficiently solve this model. The resulting clusters provably satisfy the must-link constraints under mild assumptions.
arXiv Detail & Related papers (2025-03-06T14:02:28Z)
- Strong bounds for large-scale Minimum Sum-of-Squares Clustering [0.9831489366502302]
Minimum Sum-of-Squares Clustering (MSSC) is one of the most widely used clustering methods. MSSC aims to minimize the total squared Euclidean distance between data points and their corresponding cluster centroids (a generic sketch of this objective appears after the list below). We introduce a novel method to validate MSSC solutions through optimality gaps.
arXiv Detail & Related papers (2025-02-12T13:40:00Z)
- Self-Supervised Graph Embedding Clustering [70.36328717683297]
The K-means one-step dimensionality-reduction clustering method has made some progress in addressing the curse of dimensionality in clustering tasks.
We propose a unified framework that integrates manifold learning with K-means, resulting in a self-supervised graph embedding framework.
arXiv Detail & Related papers (2024-09-24T08:59:51Z)
- Distributional Reduction: Unifying Dimensionality Reduction and Clustering with Gromov-Wasserstein [56.62376364594194]
Unsupervised learning aims to capture the underlying structure of potentially large and high-dimensional datasets. In this work, we revisit these approaches under the lens of optimal transport and exhibit relationships with the Gromov-Wasserstein problem. This unveils a new general framework, called distributional reduction, that recovers DR and clustering as special cases and allows addressing them jointly within a single optimization problem.
arXiv Detail & Related papers (2024-02-03T19:00:19Z)
- Distributed Solution of the Inverse Rig Problem in Blendshape Facial Animation [0.0]
Rig inversion is central in facial animation as it allows for a realistic and appealing performance of avatars.
A possible approach towards a faster solution is clustering, which exploits the spatial nature of the face.
In this paper, we go a step further, involving cluster coupling to get more confident estimates of the overlapping components.
arXiv Detail & Related papers (2023-03-11T10:34:07Z)
- Neural Capacitated Clustering [6.155158115218501]
We propose a new method for the Capacitated Clustering Problem (CCP) that learns a neural network to predict the assignment probabilities of points to cluster centers.
In our experiments on artificial data and two real-world datasets, our approach outperforms several state-of-the-art mathematical and heuristic solvers from the literature.
arXiv Detail & Related papers (2023-02-10T09:33:44Z)
- Gradient Based Clustering [72.15857783681658]
We propose a general approach for distance-based clustering, using the gradient of the cost function that measures clustering quality.
The approach is an iterative two-step procedure, alternating between cluster assignment and cluster center updates (a generic sketch of this scheme appears after the list below), and is applicable to a wide range of functions.
arXiv Detail & Related papers (2022-02-01T19:31:15Z)
- Personalized Federated Learning via Convex Clustering [72.15857783681658]
We propose a family of algorithms for personalized federated learning with locally convex user costs.
The proposed framework is based on a generalization of convex clustering in which the differences between different users' models are penalized.
arXiv Detail & Related papers (2022-02-01T19:25:31Z)
- Sparse Quadratic Optimisation over the Stiefel Manifold with Application to Permutation Synchronisation [71.27989298860481]
We address the non-convex optimisation problem of finding a matrix on the Stiefel manifold that maximises a quadratic objective function.
We propose a simple yet effective sparsity-promoting algorithm for finding the dominant eigenspace matrix.
arXiv Detail & Related papers (2021-09-30T19:17:35Z)
- Local versions of sum-of-norms clustering [77.34726150561087]
We show that our method can separate arbitrarily close balls in the ball model.
We prove a quantitative bound on the error incurred in the clustering of disjoint connected sets.
arXiv Detail & Related papers (2021-09-20T14:45:29Z)
- Solving weakly supervised regression problem using low-rank manifold regularization [77.34726150561087]
We solve a weakly supervised regression problem.
Under "weakly" we understand that for some training points the labels are known, for some unknown, and for others uncertain due to the presence of random noise or other reasons such as lack of resources.
In the numerical section, we applied the suggested method to artificial and real datasets using Monte-Carlo modeling.
arXiv Detail & Related papers (2021-04-13T23:21:01Z)
- Memory Clustering using Persistent Homology for Multimodality- and Discontinuity-Sensitive Learning of Optimal Control Warm-starts [24.576214898129823]
Shooting methods are an efficient approach to solving nonlinear optimal control problems.
Recent work has focused on providing an initial guess from a learned model trained on samples generated during an offline exploration of the problem space.
In this work, we apply tools from algebraic topology to extract information on the underlying structure of the solution space.
arXiv Detail & Related papers (2020-10-02T14:24:59Z)
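The MSSC entry above refers to the standard k-means-type objective: minimize Σ_i ||x_i − c_{z_i}||², the total squared Euclidean distance from each point to its assigned centroid. Below is a minimal, generic sketch of evaluating that objective; it illustrates the textbook criterion only, not that paper's bounding method, and the toy data and labels are illustrative assumptions.

```python
import numpy as np

def mssc_objective(X, centers, labels):
    """Total squared Euclidean distance from each point to its
    assigned cluster centroid: sum_i ||x_i - c_{labels[i]}||^2."""
    return float(np.sum((X - centers[labels]) ** 2))

# Toy usage: two hand-assigned clusters with their means as centroids.
rng = np.random.default_rng(1)
X = rng.normal(size=(6, 2))
labels = np.array([0, 0, 0, 1, 1, 1])
centers = np.vstack([X[labels == j].mean(axis=0) for j in range(2)])
print(mssc_objective(X, centers, labels))
```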
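The Gradient Based Clustering entry describes an alternating two-step procedure: assign each point to its nearest center, then update centers by a gradient step on the clustering cost. The sketch below shows that generic scheme for the squared-Euclidean cost; the step size, iteration count, and stopping rule are illustrative assumptions, not the paper's choices.

```python
import numpy as np

def gradient_clustering(X, k, lr=0.5, iters=100, seed=0):
    """Alternate (1) nearest-center assignment and (2) a normalized
    gradient step on the cost sum_i ||x_i - c_{z_i}||^2."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        # Step 1: assign each point to its nearest center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        z = d.argmin(axis=1)
        # Step 2: the gradient of the cost w.r.t. center c_j is
        # sum_{i: z_i = j} (c_j - x_i); step against it, normalized
        # by cluster size so the step moves c_j toward the cluster mean.
        for j in range(k):
            members = X[z == j]
            if len(members):
                centers[j] -= lr * (centers[j] - members.mean(axis=0))
    return centers, z
```

With lr = 1 this normalized step reduces exactly to the Lloyd/k-means mean update; smaller step sizes interpolate toward it, which is what makes the scheme applicable to a wider range of cost functions.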