An Efficient Smoothing Proximal Gradient Algorithm for Convex Clustering
- URL: http://arxiv.org/abs/2006.12592v1
- Date: Mon, 22 Jun 2020 20:02:59 GMT
- Title: An Efficient Smoothing Proximal Gradient Algorithm for Convex Clustering
- Authors: Xin Zhou, Chunlei Du, and Xiaodong Cai
- Abstract summary: The recently introduced convex clustering approach formulates clustering as a convex optimization problem.
State-of-the-art convex clustering algorithms require substantial computation and memory.
In this paper, we develop a very efficient smoothing proximal gradient algorithm (Sproga) for convex clustering.
- Score: 2.5182813818441945
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cluster analysis organizes data into sensible groupings and is one of
the fundamental modes of understanding and learning. The widely used K-means and
hierarchical clustering methods can be dramatically suboptimal due to local
minima. The recently introduced convex clustering approach formulates clustering as
a convex optimization problem and ensures a globally optimal solution. However,
the state-of-the-art convex clustering algorithms, based on the alternating
direction method of multipliers (ADMM) or the alternating minimization
algorithm (AMA), require substantial computation and memory, which limits their
applicability. In this paper, we develop a very efficient smoothing proximal
gradient algorithm (Sproga) for convex clustering. Our Sproga is faster than
ADMM- or AMA-based convex clustering algorithms by one to two orders of
magnitude. The memory space required by Sproga is smaller than that required by
ADMM and AMA by at least one order of magnitude. Computer simulations and real
data analysis show that Sproga outperforms several well-known clustering
algorithms, including K-means and hierarchical clustering. The efficiency and
superior performance of our algorithm will help convex clustering find wide
application.
Related papers
- An Efficient Algorithm for Clustered Multi-Task Compressive Sensing [60.70532293880842]
Clustered multi-task compressive sensing is a hierarchical model that solves multiple compressive sensing tasks.
The existing inference algorithm for this model is computationally expensive and does not scale well in high dimensions.
We propose a new algorithm that substantially accelerates model inference by avoiding the need to explicitly compute the model's covariance matrices.
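The summary does not say how the covariance computation is avoided; as a generic illustration of the matrix-free idea (not the paper's algorithm), the posterior mean of a simple Bayesian linear model can be obtained with conjugate gradients using only matrix-vector products, so the covariance is never formed. The model and all names below are assumptions.
```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def posterior_mean_matrix_free(Phi, y, alpha=1.0, sigma2=1.0):
    """Generic illustration: compute the posterior mean
    mu = C^{-1} Phi^T y / sigma2 with C = Phi^T Phi / sigma2 + alpha * I,
    via conjugate gradients, never forming C (or its inverse) explicitly."""
    d = Phi.shape[1]
    C = LinearOperator((d, d),
                       matvec=lambda v: Phi.T @ (Phi @ v) / sigma2 + alpha * v)
    mu, info = cg(C, Phi.T @ y / sigma2)
    assert info == 0, "CG did not converge"
    return mu
```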
arXiv Detail & Related papers (2023-09-30T15:57:14Z)
- Hybridization of K-means with improved firefly algorithm for automatic clustering in high dimension [0.0]
We have implemented the Silhouette and Elbow methods with PCA to find an optimal number of clusters.
In the Firefly algorithm, the entire population is automatically subdivided into sub-populations, which decreases the convergence speed and can trap the search in local minima.
Our study proposes an enhanced Firefly algorithm, i.e., K-means hybridized with an ODFA model, for automatic clustering.
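Of the pieces this entry names, the Silhouette/Elbow selection with PCA is the easiest to illustrate; the sketch below is plain scikit-learn, not the paper's firefly/ODFA hybrid, and the candidate range, seed, and component count are assumptions.
```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score

def choose_k(X, k_range=range(2, 11), n_components=2):
    """Silhouette + Elbow model selection after PCA (generic illustration)."""
    Z = PCA(n_components=n_components).fit_transform(X)
    inertias, sils = [], []
    for k in k_range:
        km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(Z)
        inertias.append(km.inertia_)          # Elbow: look for the bend
        sils.append(silhouette_score(Z, km.labels_))
    best_k = list(k_range)[int(np.argmax(sils))]
    return best_k, inertias, sils
```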
arXiv Detail & Related papers (2023-02-09T18:43:10Z)
- Regularization and Optimization in Model-Based Clustering [4.096453902709292]
k-means algorithm variants essentially fit a mixture of identical spherical Gaussians to data that vastly deviates from such a distribution.
We develop more effective optimization algorithms for general GMMs, and we combine these algorithms with regularization strategies that avoid overfitting.
These results shed new light on the current status quo between GMM and k-means methods and suggest the more frequent use of general GMMs for data exploration.
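A small scikit-learn sketch of the contrast drawn here: k-means versus a general full-covariance GMM, with reg_covar standing in for the kind of regularization that avoids overfitting. The synthetic data and all parameters are illustrative assumptions.
```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Two elongated (anisotropic) Gaussian clusters; k-means' implicit
# identical-spherical-Gaussian assumption is violated here.
A = np.array([[3.0, 0.0], [1.5, 0.3]])
X = np.vstack([rng.standard_normal((200, 2)) @ A,
               rng.standard_normal((200, 2)) @ A + [0.0, 4.0]])

km_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
# General GMM with full covariances; reg_covar guards against
# degenerate (overfit) components.
gmm = GaussianMixture(n_components=2, covariance_type="full",
                      reg_covar=1e-3, random_state=0).fit(X)
gmm_labels = gmm.predict(X)
```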
arXiv Detail & Related papers (2023-02-05T18:22:29Z)
- Convex Clustering through MM: An Efficient Algorithm to Perform Hierarchical Clustering [1.0589208420411012]
We propose convex clustering through majorization-minimization (CCMM) -- an iterative algorithm that uses cluster fusions and a highly efficient updating scheme.
With a current desktop computer, CCMM efficiently solves convex clustering problems featuring over one million objects in seven-dimensional space.
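CCMM's fusion and updating machinery is not in the abstract, but the textbook majorization-minimization step for convex clustering is easy to sketch: each ||u_i - u_j|| is majorized by a quadratic at the current iterate, so every iteration reduces to one linear solve. The weights W and all parameters below are assumptions.
```python
import numpy as np

def mm_convex_clustering(X, W, gamma=1.0, n_iter=100, eps=1e-8):
    """Textbook MM step (not the full CCMM scheme): from the majorization
    ||u_i - u_j|| <= ||u_i - u_j||^2 / (2 d_ij) + d_ij / 2 at the current
    distances d_ij, each iteration solves (I + gamma * L) U = X, where L is
    the graph Laplacian with surrogate weights w_ij / d_ij."""
    n = X.shape[0]
    U = X.copy()
    for _ in range(n_iter):
        D = np.linalg.norm(U[:, None, :] - U[None, :, :], axis=-1)
        M = W / np.maximum(D, eps)        # surrogate weights
        np.fill_diagonal(M, 0.0)
        L = np.diag(M.sum(axis=1)) - M    # weighted graph Laplacian
        U = np.linalg.solve(np.eye(n) + gamma * L, X)
    return U
```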
arXiv Detail & Related papers (2022-11-03T15:07:51Z)
- Late Fusion Multi-view Clustering via Global and Local Alignment Maximization [61.89218392703043]
Multi-view clustering (MVC) optimally integrates complementary information from different views to improve clustering performance.
Most existing approaches directly fuse multiple pre-specified similarities to learn an optimal similarity matrix for clustering.
We propose late fusion MVC via alignment to address these issues.
arXiv Detail & Related papers (2022-08-02T01:49:31Z)
- An Exact Algorithm for Semi-supervised Minimum Sum-of-Squares Clustering [0.5801044612920815]
We present a new branch-and-bound algorithm for semi-supervised MSSC.
Background knowledge is incorporated as pairwise must-link and cannot-link constraints.
For the first time, the proposed global optimization algorithm efficiently solves real-world instances with up to 800 data points.
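The branch-and-bound itself does not fit a snippet, but the standard bookkeeping for such background knowledge does: must-link pairs collapse into super-points via union-find, and cannot-link pairs are checked for feasibility. This is generic constrained-clustering preprocessing, not the paper's exact algorithm.
```python
def collapse_must_links(n, must_link, cannot_link):
    """Union-find over must-link pairs; returns super-point ids or raises
    if a cannot-link pair was forced together (infeasible constraints)."""
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in must_link:
        parent[find(i)] = find(j)
    for i, j in cannot_link:
        if find(i) == find(j):
            raise ValueError(f"infeasible: {i} and {j} are forced together")
    return [find(i) for i in range(n)]
```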
arXiv Detail & Related papers (2021-11-30T17:08:53Z)
- Multi-View Spectral Clustering with High-Order Optimal Neighborhood Laplacian Matrix [57.11971786407279]
Multi-view spectral clustering can effectively reveal the intrinsic cluster structure among data.
This paper proposes a multi-view spectral clustering algorithm that learns a high-order optimal neighborhood Laplacian matrix.
Our proposed algorithm generates the optimal Laplacian matrix by searching the neighborhood of the linear combination of the first-order and high-order base Laplacian matrices.
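As a loose sketch of the ingredients only: average the per-view similarities, mix in a second-order (two-hop) term as the high-order part, and run standard normalized spectral clustering. The paper learns the optimal combination, whereas the fixed mixing weight alpha below is an assumption.
```python
import numpy as np
from sklearn.cluster import KMeans

def high_order_multiview_spectral(similarities, k, alpha=0.5):
    """Illustrative sketch: fuse views by averaging nonnegative similarity
    matrices, add a second-order term, then normalized spectral clustering."""
    S = sum(similarities) / len(similarities)
    S = alpha * S + (1.0 - alpha) * (S @ S)   # first- plus high-order base
    S = (S + S.T) / 2.0
    d = np.maximum(S.sum(axis=1), 1e-12)
    L = np.eye(len(S)) - S / np.sqrt(d)[:, None] / np.sqrt(d)[None, :]
    _, vecs = np.linalg.eigh(L)               # eigenvalues ascending
    Z = vecs[:, :k]                           # smallest-k eigenvectors
    Z /= np.linalg.norm(Z, axis=1, keepdims=True) + 1e-12
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(Z)
```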
arXiv Detail & Related papers (2020-08-31T12:28:40Z)
- Differentially Private Clustering: Tight Approximation Ratios [57.89473217052714]
We give efficient differentially private algorithms for basic clustering problems.
Our results imply an improved algorithm for the Sample and Aggregate privacy framework.
One of the tools used in our 1-Cluster algorithm can be employed to get a faster quantum algorithm for ClosestPair in a moderate number of dimensions.
arXiv Detail & Related papers (2020-08-18T16:22:06Z)
- Certifying clusters from sum-of-norms clustering [13.747619681451875]
We present a clustering test that identifies and certifies the correct cluster assignment from an approximate solution.
We show the correct cluster assignment is guaranteed to be certified by a primal-dual path following algorithm after sufficiently many iterations.
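The certificate rests on a primal-dual construction that does not compress into a snippet; for context, the sketch below shows only the naive distance-threshold rounding that such a certified test improves upon, with the tolerance an arbitrary assumption.
```python
import numpy as np

def round_approximate_solution(U, tol=1e-3):
    """Naive rounding (not the paper's certified test): group rows of an
    approximate sum-of-norms solution whose centroids nearly coincide."""
    n = U.shape[0]
    labels = np.full(n, -1, dtype=int)
    next_label = 0
    for i in range(n):
        if labels[i] == -1:
            close = np.linalg.norm(U - U[i], axis=1) < tol
            labels[(labels == -1) & close] = next_label
            next_label += 1
    return labels
```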
arXiv Detail & Related papers (2020-06-19T20:26:26Z)
- Computationally efficient sparse clustering [67.95910835079825]
We provide a finite sample analysis of a new clustering algorithm based on PCA.
We show that it achieves the minimax optimal misclustering rate in the regime $\|\theta\| \rightarrow \infty$.
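As a generic illustration of the PCA-based recipe (not the paper's exact analyzed variant): project onto the leading principal components, then cluster in the reduced space.
```python
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

def pca_then_kmeans(X, k):
    """Generic PCA-based clustering: k-1 leading components are a common
    choice for separating k cluster means."""
    Z = PCA(n_components=k - 1).fit_transform(X)
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(Z)
```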
arXiv Detail & Related papers (2020-05-21T17:51:30Z)
- Optimal Randomized First-Order Methods for Least-Squares Problems [56.05635751529922]
This class of algorithms encompasses several randomized methods among the fastest solvers for least-squares problems.
We focus on two classical embeddings, namely, Gaussian projections and subsampled Hadamard transforms.
Our resulting algorithm yields the best complexity known for solving least-squares problems with no condition number dependence.
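A hedged sketch of the sketch-and-precondition template behind such solvers, using the Gaussian projection this entry names (a subsampled Hadamard transform would only change how S is built); the embedding size, step rule, and iteration count are assumptions.
```python
import numpy as np

def sketch_precondition_lstsq(A, b, m=None, n_iter=30, seed=0):
    """Sketch-and-precondition least squares (illustrative): a Gaussian
    embedding S gives R from QR(S @ A); gradient descent with exact line
    search then runs on the well-conditioned variables y = R @ x."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    m = m or 8 * d                                  # embedding dimension
    S = rng.standard_normal((m, n)) / np.sqrt(m)    # Gaussian projection
    _, R = np.linalg.qr(S @ A)                      # preconditioner
    y = np.zeros(d)
    for _ in range(n_iter):
        r = A @ np.linalg.solve(R, y) - b           # residual
        g = np.linalg.solve(R.T, A.T @ r)           # preconditioned gradient
        Mg = A @ np.linalg.solve(R, g)
        denom = Mg @ Mg
        if denom == 0.0:                            # exact minimizer reached
            break
        y -= (g @ g) / denom * g                    # exact line search step
    return np.linalg.solve(R, y)
```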
arXiv Detail & Related papers (2020-02-21T17:45:32Z)