Rectified Euler k-means and Beyond
- URL: http://arxiv.org/abs/2108.03081v1
- Date: Fri, 6 Aug 2021 12:35:28 GMT
- Title: Rectified Euler k-means and Beyond
- Authors: Yunxia Lin, Songcan Chen
- Abstract summary: Euler k-means (EulerK) first maps data onto the unit hyper-sphere surface of an equi-dimensional space via a complex mapping.
EulerK is also robust to noise and outliers.
We propose two Rectified Euler k-means methods, REK1 and REK2, which retain the merits of EulerK while acquiring real centroids that reside in the mapped space to better characterize the data structures.
- Score: 27.471041765364884
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Euler k-means (EulerK) first maps data onto the unit hyper-sphere surface of an equi-dimensional space via a complex mapping that induces the robust Euler kernel, and then employs the popular $k$-means. Consequently, besides enjoying the virtues of k-means such as simplicity and scalability to large data sets, EulerK is also robust to noise and outliers. Even so, the centroids captured by EulerK deviate from the unit hyper-sphere surface and thus, in a strict distributional sense, are actually outliers. This counterintuitive phenomenon also occurs in some generic kernel clustering methods. Intuitively, relying on such outlier-like centroids is hardly reasonable, yet the issue has received little attention. To eliminate the deviation, we propose two Rectified Euler k-means methods, REK1 and REK2, which retain the merits of EulerK while acquiring real centroids that reside in the mapped space and thus better characterize the data structures. Specifically, REK1 rectifies EulerK by imposing a constraint on the centroids, while REK2 views each centroid as the mapped image of a pre-image in the original space and optimizes these pre-images in the Euler-kernel-induced space. Our proposed REKs can methodologically be extended to solve other problems of this category. Finally, experiments validate the effectiveness of REK1 and REK2.
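To make the pipeline concrete, below is a minimal NumPy sketch. It assumes the element-wise Euler feature map phi(x) = (1/sqrt(2)) * e^{i*alpha*pi*x} that underlies the Euler kernel; the projection of centroids back onto the sphere is a simplified stand-in for REK1's constrained update, not the authors' exact optimization, and the function names are illustrative.

```python
import numpy as np

def euler_map(X, alpha=1.0):
    """Element-wise Euler mapping phi(x) = (1/sqrt(2)) * exp(i*alpha*pi*x).
    Every mapped sample lies on a hyper-sphere surface in the
    equi-dimensional complex space (each coordinate has modulus 1/sqrt(2))."""
    return np.exp(1j * alpha * np.pi * X) / np.sqrt(2)

def rectified_euler_kmeans(X, k, alpha=1.0, n_iter=50, seed=0):
    """k-means in the Euler-mapped space with a REK1-style rectification:
    plain mean centroids drift off the sphere, so after each update they
    are re-projected to modulus 1/sqrt(2) per coordinate. This projection
    is an illustrative simplification of the paper's constrained update."""
    rng = np.random.default_rng(seed)
    Phi = euler_map(X, alpha)                              # (n, d) complex
    C = Phi[rng.choice(len(Phi), size=k, replace=False)]   # initial centroids
    for _ in range(n_iter):
        # assignment: squared distance |phi(x) - c|^2 in the mapped space
        d2 = (np.abs(Phi[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = Phi[labels == j]
            if len(members) == 0:
                continue
            mean = members.mean(axis=0)                    # plain EulerK centroid: off the sphere
            C[j] = mean / np.maximum(np.abs(mean), 1e-12) / np.sqrt(2)
    return labels, C
```

Per coordinate, |phi(x) - phi(y)|^2 = 1 - cos(alpha*pi*(x - y)) is bounded, which is the source of the robustness to noise and outliers that EulerK enjoys and the REKs retain.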
Related papers
- An Approach Towards Learning K-means-friendly Deep Latent Representation [0.6798775532273751]
Clustering is a long-standing problem area in data mining.
With the advent of deep neural networks, a common approach to this problem is to map the data to some latent space of comparatively lower dimensions.
A well-known centroid-based clustering algorithm is K-means.
arXiv Detail & Related papers (2024-11-29T06:28:38Z) - Self-Supervised Graph Embedding Clustering [70.36328717683297]
The K-means one-step dimensionality reduction clustering method has made some progress in addressing the curse of dimensionality in clustering tasks.
We propose a unified framework that integrates manifold learning with K-means, resulting in the self-supervised graph embedding framework.
arXiv Detail & Related papers (2024-09-24T08:59:51Z) - KPZ scaling from the Krylov space [83.88591755871734]
Recently, a superdiffusion exhibiting the Kardar-Parisi-Zhang scaling in late-time correlators and autocorrelators has been reported.
Inspired by these results, we explore the KPZ scaling in correlation functions using their realization in the Krylov operator basis.
arXiv Detail & Related papers (2024-06-04T20:57:59Z) - Rethinking k-means from manifold learning perspective [122.38667613245151]
We present a new clustering algorithm which directly detects clusters of data without mean estimation.
Specifically, we construct the distance matrix between data points using a Butterworth filter.
To fully exploit the complementary information embedded in different views, we leverage tensor Schatten p-norm regularization.
arXiv Detail & Related papers (2023-05-12T03:01:41Z) - No More Than 6ft Apart: Robust K-Means via Radius Upper Bounds [17.226362076527764]
- No More Than 6ft Apart: Robust K-Means via Radius Upper Bounds [17.226362076527764]
Centroid-based clustering methods such as k-means, k-medoids and k-centers are heavily applied as go-to tools in exploratory data analysis.
In many cases, those methods are used to obtain representative centroids of the data manifold for visualization or summarization of a dataset.
We propose to remedy such a scenario by introducing a maximal radius constraint $r$ on the clusters formed by centroids.
arXiv Detail & Related papers (2022-03-04T18:59:02Z) - Accelerating Spherical k-Means [1.7188280334580197]
- Accelerating Spherical k-Means [1.7188280334580197]
Spherical k-means is a widely used clustering algorithm for sparse and high-dimensional data such as document vectors.
Many acceleration techniques, such as the algorithms of Elkan and Hamerly, rely on the triangle inequality of Euclidean distances.
In this paper, we incorporate the Elkan and Hamerly accelerations into the spherical k-means algorithm, working directly with cosine similarities instead of Euclidean distances.
arXiv Detail & Related papers (2021-07-08T19:24:09Z) - Adaptive Explicit Kernel Minkowski Weighted K-means [1.3535770763481905]
- Adaptive Explicit Kernel Minkowski Weighted K-means [1.3535770763481905]
The kernel K-means, which extends K-means into the kernel space, is able to capture nonlinear structures and identify arbitrarily shaped clusters.
This paper proposes a method to combine the advantages of the linear and nonlinear approaches by using data-driven approximate finite-dimensional feature maps.
arXiv Detail & Related papers (2020-12-04T19:14:09Z) - (k, l)-Medians Clustering of Trajectories Using Continuous Dynamic Time
Warping [57.316437798033974]
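The paper's explicit feature maps are its own construction; as a generic stand-in, this sketch uses classic random Fourier features, one standard way to build an approximate finite-dimensional feature map for a kernel (an assumption here, not the authors' method).

```python
import numpy as np

def rff_features(X, n_features=256, gamma=1.0, seed=0):
    """Random Fourier features z(x) whose inner products approximate the
    Gaussian kernel k(x, y) = exp(-gamma * ||x - y||**2). Ordinary k-means
    on z(X) then approximates kernel k-means at cost linear in the sample size."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)
```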
- (k, l)-Medians Clustering of Trajectories Using Continuous Dynamic Time Warping [57.316437798033974]
In this work we consider the problem of center-based clustering of trajectories.
We propose the use of a continuous version of dynamic time warping (DTW) as the distance measure, which we call continuous dynamic time warping (CDTW).
We show a practical way to compute a center from a set of trajectories and subsequently iteratively improve it.
arXiv Detail & Related papers (2020-12-01T13:17:27Z) - Debiased Sinkhorn barycenters [110.79706180350507]
- Debiased Sinkhorn barycenters [110.79706180350507]
Entropy regularization in optimal transport (OT) has driven much of the recent interest in Wasserstein metrics and barycenters in machine learning.
We show how the bias it introduces is tightly linked to the reference measure that defines the entropy regularizer.
We propose debiased Wasserstein barycenters that preserve the best of both worlds: fast Sinkhorn-like iterations without entropy smoothing.
arXiv Detail & Related papers (2020-06-03T23:06:02Z) - Ball k-means [53.89505717006118]
- Ball k-means [53.89505717006118]
The Ball k-means algorithm uses a ball to describe a cluster, focusing on reducing the point-centroid distance computation.
Its fast speed, absence of extra parameters, and simple design make Ball k-means an all-around replacement for the naive k-means algorithm.
arXiv Detail & Related papers (2020-05-02T10:39:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.