Related papers: K-expectiles clustering

K-expectiles clustering

URL: http://arxiv.org/abs/2103.09329v1
Date: Tue, 16 Mar 2021 21:14:56 GMT
Title: K-expectiles clustering
Authors: Bingling Wang, Yinxing Li, Wolfgang Karl H\"ardle
Abstract summary: We propose a novel partitioning clustering algorithm based on expectiles. We suggest two schemes: fixed $tau$ clustering, and adaptive $tau$ clustering.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: $K$-means clustering is one of the most widely-used partitioning algorithm in cluster analysis due to its simplicity and computational efficiency. However, $K$-means does not provide an appropriate clustering result when applying to data with non-spherically shaped clusters. We propose a novel partitioning clustering algorithm based on expectiles. The cluster centers are defined as multivariate expectiles and clusters are searched via a greedy algorithm by minimizing the within cluster '$\tau$ -variance'. We suggest two schemes: fixed $\tau$ clustering, and adaptive $\tau$ clustering. Validated by simulation results, this method beats both $K$-means and spectral clustering on data with asymmetric shaped clusters, or clusters with a complicated structure, including asymmetric normal, beta, skewed $t$ and $F$ distributed clusters. Applications of adaptive $\tau$ clustering on crypto-currency (CC) market data are provided. One finds that the expectiles clusters of CC markets show the phenomena of an institutional investors dominated market. The second application is on image segmentation. compared to other center based clustering methods, the adaptive $\tau$ cluster centers of pixel data can better capture and describe the features of an image. The fixed $\tau$ clustering brings more flexibility on segmentation with a decent accuracy.

Related papers

Generalization Performance of Ensemble Clustering: From Theory to Algorithm [57.176040163699554]
This paper focuses on generalization error, excess risk and consistency in ensemble clustering.<n>By assigning varying weights to finite clusterings, we minimize the error between the empirical average clusterings and their expectation.<n>We instantiate our theory to develop a new ensemble clustering algorithm.
arXiv Detail & Related papers (2025-06-01T09:34:52Z)
Guaranteed Recovery of Unambiguous Clusters [7.011239860967789]
Clustering is often a challenging problem because of the inherent ambiguity in what the "correct" clustering should be. In this paper we propose an information-theoretic characterization and design an algorithm that recovers the clustering whenever it is unambiguous.
arXiv Detail & Related papers (2025-01-22T18:51:25Z)
Exponentially Consistent Nonparametric Clustering of Data Streams [4.2603120588176635]
We consider nonparametric clustering of $M$ independent and identically distributed (i.i.d.) data streams generated from unknown distributions. Existing results on nonparametric clustering algorithms, like single linkage-based (SLINK) clustering and $k$-medoids distribution clustering, assume that the maximum intra-cluster distance ($d_L$) is smaller than the minimum inter-cluster distance ($d_H$)
arXiv Detail & Related papers (2024-11-21T08:18:04Z)
Self-Supervised Graph Embedding Clustering [70.36328717683297]
K-means one-step dimensionality reduction clustering method has made some progress in addressing the curse of dimensionality in clustering tasks. We propose a unified framework that integrates manifold learning with K-means, resulting in the self-supervised graph embedding framework.
arXiv Detail & Related papers (2024-09-24T08:59:51Z)
Reinforcement Graph Clustering with Unknown Cluster Number [91.4861135742095]
We propose a new deep graph clustering method termed Reinforcement Graph Clustering. In our proposed method, cluster number determination and unsupervised representation learning are unified into a uniform framework. In order to conduct feedback actions, the clustering-oriented reward function is proposed to enhance the cohesion of the same clusters and separate the different clusters.
arXiv Detail & Related papers (2023-08-13T18:12:28Z)
Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [79.46465138631592]
We devise an efficient algorithm that recovers clusters using the observed labels. We present Instance-Adaptive Clustering (IAC), the first algorithm whose performance matches these lower bounds both in expectation and with high probability.
arXiv Detail & Related papers (2023-06-18T08:46:06Z)
Asymptotics for The $k$-means [0.6091702876917281]
The $k$-means is one of the most important unsupervised learning techniques in statistics and computer science. The proposed clustering consistency is more appropriate than the previous criterion consistency for the clustering methods. It is found that the proposed $k$-means method has lower clustering error rates and is more robust to small clusters and outliers.
arXiv Detail & Related papers (2022-11-18T03:36:58Z)
Convex Clustering through MM: An Efficient Algorithm to Perform Hierarchical Clustering [1.0589208420411012]
We propose convex clustering through majorization-minimization ( CCMM) -- an iterative algorithm that uses cluster fusions and a highly efficient updating scheme. With a current desktop computer, CCMM efficiently solves convex clustering problems featuring over one million objects in seven-dimensional space.
arXiv Detail & Related papers (2022-11-03T15:07:51Z)
Self-supervised Contrastive Attributed Graph Clustering [110.52694943592974]
We propose a novel attributed graph clustering network, namely Self-supervised Contrastive Attributed Graph Clustering (SCAGC) In SCAGC, by leveraging inaccurate clustering labels, a self-supervised contrastive loss, are designed for node representation learning. For the OOS nodes, SCAGC can directly calculate their clustering labels.
arXiv Detail & Related papers (2021-10-15T03:25:28Z)
Fuzzy Clustering with Similarity Queries [56.96625809888241]
The fuzzy or soft objective is a popular generalization of the well-known $k$-means problem. We show that by making few queries, the problem becomes easier to solve.
arXiv Detail & Related papers (2021-06-04T02:32:26Z)
Exact Recovery of Clusters in Finite Metric Spaces Using Oracle Queries [22.672233769934845]
We investigate the problem of exact cluster dependence using oracle queries. We show that if we are allowed to make $(k2 + k )$, an additional queries can recover different different and arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary arbitrary
arXiv Detail & Related papers (2021-01-31T18:00:29Z)
Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed. We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)
Spectral Clustering with Smooth Tiny Clusters [14.483043753721256]
We propose a novel clustering algorithm, which con-siders the smoothness of data for the first time. Our key idea is to cluster tiny clusters, whose centers constitute smooth graphs. Although in this paper, we singly focus on multi-scale situations, the idea of data smoothness can certainly be extended to any clustering algorithms.
arXiv Detail & Related papers (2020-09-10T05:21:20Z)
Exact Recovery of Mangled Clusters with Same-Cluster Queries [20.03712152278538]
We study the cluster recovery problem in the semi-supervised active clustering framework. We design an algorithm that, given $n$ points to be partitioned into $k$ clusters, uses $O(k3 ln k ln n)$ oracle queries and $tildeO(kn + k3)$ time to recover the clustering with zero misclassification error.
arXiv Detail & Related papers (2020-06-08T15:27:58Z)

This list is automatically generated from the titles and abstracts of the papers in this site.