Strong Consistency for a Class of Adaptive Clustering Procedures
- URL: http://arxiv.org/abs/2202.13423v1
- Date: Sun, 27 Feb 2022 18:56:41 GMT
- Title: Strong Consistency for a Class of Adaptive Clustering Procedures
- Authors: Adam Quinn Jaffe
- Abstract summary: We show that all clustering procedures in this class are strongly consistent under IID samples.
In the adaptive setting, our work provides a strong consistency result that is the first of its kind.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a class of clustering procedures which includes $k$-means and
$k$-medians, as well as variants of these where the domain of the cluster
centers can be chosen adaptively (for example, $k$-medoids) and where the
number of cluster centers can be chosen adaptively (for example, according to
the elbow method). In the non-parametric setting and assuming only the
finiteness of certain moments, we show that all clustering procedures in this
class are strongly consistent under IID samples. Our method of proof is to
directly study the continuity of various deterministic maps associated with
these clustering procedures, and to show that strong consistency simply
descends from analogous strong consistency of the empirical measures. In the
adaptive setting, our work provides a strong consistency result that is the
first of its kind. In the non-adaptive setting, our work strengthens Pollard's
classical result by dispensing with various unnecessary technical hypotheses,
by upgrading the particular notion of strong consistency, and by using the same
methods to prove further limit theorems.
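As a rough illustration of the objects the abstract refers to, the sketch below computes an empirical clustering cost on a one-dimensional sample and finds the best centers when they are restricted to the sample itself, as in $k$-medoids. It is a minimal sketch and not the paper's construction: the function names `clustering_cost` and `best_medoids`, the restriction to real-valued data, and the brute-force search are all assumptions made for illustration. The exponent `p=2` gives a $k$-means-type cost and `p=1` a $k$-medians-type cost.

```python
from itertools import combinations


def clustering_cost(points, centers, p=2):
    """Empirical cost: sample average of the p-th power of the distance
    from each point to its nearest center (1-D data, assumed for simplicity)."""
    return sum(min(abs(x - c) ** p for c in centers) for x in points) / len(points)


def best_medoids(points, k, p=2):
    """Brute-force k-medoids (hypothetical helper): the centers are chosen
    adaptively from the sample itself, and every k-subset is tried."""
    candidates = sorted(set(points))
    return min(
        combinations(candidates, k),
        key=lambda centers: clustering_cost(points, centers, p),
    )


if __name__ == "__main__":
    sample = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
    centers = best_medoids(sample, k=2)
    print(centers, clustering_cost(sample, centers))
```

In this reading, a strong consistency statement says that the centers computed from an IID sample converge almost surely to the optimal centers of the underlying distribution as the sample size grows. The brute-force search is only practical for very small samples.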
Related papers
- Self-Supervised Graph Embedding Clustering [70.36328717683297]
The K-means one-step dimensionality reduction clustering method has made some progress in addressing the curse of dimensionality in clustering tasks.
We propose a unified framework that integrates manifold learning with K-means, resulting in the self-supervised graph embedding framework.
arXiv Detail & Related papers (2024-09-24T08:59:51Z) - Correspondence-Free Non-Rigid Point Set Registration Using Unsupervised Clustering Analysis [28.18800845199871]
We present a novel non-rigid point set registration method inspired by unsupervised clustering analysis.
Our method achieves high accuracy results across various scenarios and surpasses competitors by a significant margin.
arXiv Detail & Related papers (2024-06-27T01:16:44Z) - GCC: Generative Calibration Clustering [55.44944397168619]
We propose a novel Generative Calibration Clustering (GCC) method to incorporate feature learning and augmentation into the clustering procedure.
First, we develop a discriminative feature alignment mechanism to discover the intrinsic relationship across real and generated samples.
Second, we design a self-supervised metric learning scheme to generate more reliable cluster assignments.
arXiv Detail & Related papers (2024-04-14T01:51:11Z) - Deep Embedding Clustering Driven by Sample Stability [16.53706617383543]
We propose a deep embedding clustering algorithm driven by sample stability (DECS).
Specifically, we start by constructing the initial feature space with an autoencoder and then learn the cluster-oriented embedding feature constrained by sample stability.
The experimental results on five datasets illustrate that the proposed method achieves superior performance compared to state-of-the-art clustering approaches.
arXiv Detail & Related papers (2024-01-29T09:19:49Z) - Stable Cluster Discrimination for Deep Clustering [7.175082696240088]
Deep clustering can optimize representations of instances (i.e., representation learning) and explore the inherent data distribution.
The coupled objective implies a trivial solution that all instances collapse to the uniform features.
In this work, we first show that the prevalent discrimination task in supervised learning is unstable for one-stage clustering.
A novel stable cluster discrimination (SeCu) task is proposed and a new hardness-aware clustering criterion can be obtained accordingly.
arXiv Detail & Related papers (2023-11-24T06:43:26Z) - A One-shot Framework for Distributed Clustered Learning in Heterogeneous Environments [54.172993875654015]
The paper proposes a family of communication-efficient methods for distributed learning in heterogeneous environments.
A one-shot approach, based on local computations at the users and a clustering-based aggregation step at the server, is shown to provide strong learning guarantees.
For strongly convex problems it is shown that, as long as the number of data points per user is above a threshold, the proposed approach achieves order-optimal mean-squared error rates in terms of the sample size.
arXiv Detail & Related papers (2022-09-22T09:04:10Z) - Gradient Based Clustering [72.15857783681658]
We propose a general approach for distance-based clustering, using the gradient of the cost function that measures clustering quality.
The approach is an iterative two-step procedure (alternating between cluster assignment and cluster center updates) and is applicable to a wide range of functions.
arXiv Detail & Related papers (2022-02-01T19:31:15Z) - You Never Cluster Alone [150.94921340034688]
We extend the mainstream contrastive learning paradigm to a cluster-level scheme, where all the data subjected to the same cluster contribute to a unified representation.
We define a set of categorical variables as clustering assignment confidence, which links the instance-level learning track with the cluster-level one.
By reparametrizing the assignment variables, TCC is trained end-to-end, requiring no alternating steps.
arXiv Detail & Related papers (2021-06-03T14:59:59Z) - Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)