Universal Lower Bounds and Optimal Rates: Achieving Minimax Clustering Error in Sub-Exponential Mixture Models
- URL: http://arxiv.org/abs/2402.15432v2
- Date: Wed, 17 Jul 2024 08:43:29 GMT
- Title: Universal Lower Bounds and Optimal Rates: Achieving Minimax Clustering Error in Sub-Exponential Mixture Models
- Authors: Maximilien Dreveton, Alperen Gözeten, Matthias Grossglauser, Patrick Thiran
- Abstract summary: We first establish a universal lower bound for the error rate in clustering any mixture model.
We then demonstrate that iterative algorithms attain this lower bound in mixture models with sub-exponential tails.
For datasets better modelled by Poisson or Negative Binomial mixtures, we study mixture models whose distributions belong to an exponential family.
In such mixtures, we establish that Bregman hard clustering, a variant of Lloyd's algorithm employing a Bregman divergence, is rate optimal.
- Score: 8.097200145973389
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Clustering is a pivotal challenge in unsupervised machine learning and is often investigated through the lens of mixture models. The optimal error rate for recovering cluster labels in Gaussian and sub-Gaussian mixture models involves ad hoc signal-to-noise ratios. Simple iterative algorithms, such as Lloyd's algorithm, attain this optimal error rate. In this paper, we first establish a universal lower bound for the error rate in clustering any mixture model, expressed through a Chernoff divergence, a more versatile measure of model information than signal-to-noise ratios. We then demonstrate that iterative algorithms attain this lower bound in mixture models with sub-exponential tails, notably emphasizing location-scale mixtures featuring Laplace-distributed errors. Additionally, for datasets better modelled by Poisson or Negative Binomial mixtures, we study mixture models whose distributions belong to an exponential family. In such mixtures, we establish that Bregman hard clustering, a variant of Lloyd's algorithm employing a Bregman divergence, is rate optimal.
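The Chernoff divergence invoked above has a standard form; as a hedged reference (the paper's exact normalization, and how the divergence is minimized over cluster pairs, may differ), for two distributions P and Q with densities p and q it reads:

```latex
% Standard Chernoff divergence between P and Q (densities p, q);
% the paper's exact normalization may differ.
D_{\mathrm{C}}(P, Q) \;=\; \sup_{t \in (0,1)} \left( -\log \int p(x)^{t}\, q(x)^{1-t}\, \mathrm{d}x \right)
```

To make Bregman hard clustering concrete, the following is a minimal sketch, not the authors' implementation: a Lloyd-style alternation in which assignments use the Bregman divergence of the Poisson family, so that centroid updates remain plain averages. The function names and the small positivity constants are illustrative choices.

```python
import numpy as np

def poisson_bregman_divergence(x, mu):
    # Bregman divergence of the Poisson family (KL between Poisson means):
    # d(x, mu) = x*log(x/mu) - x + mu, with the convention 0*log(0) = 0.
    x = np.asarray(x, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        t = np.where(x > 0, x * np.log(x / mu), 0.0)
    return t - x + mu

def bregman_hard_clustering(X, k, n_iter=50, seed=0):
    # Lloyd-style alternation: assign each point to the centroid with the
    # smallest Bregman divergence, then recompute centroids as plain means
    # (the Bregman centroid is always the arithmetic mean).
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    centroids = X[rng.choice(n, size=k, replace=False)].astype(float) + 1e-6
    labels = np.zeros(n, dtype=int)
    for _ in range(n_iter):
        D = np.stack([poisson_bregman_divergence(X, c).sum(axis=1)
                      for c in centroids], axis=1)   # (n, k) divergences
        new_labels = D.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        for j in range(k):
            pts = X[labels == j]
            if len(pts):
                centroids[j] = pts.mean(axis=0) + 1e-9  # keep means positive
    return labels, centroids

# Toy usage on two Poisson clusters:
rng = np.random.default_rng(1)
X = np.vstack([rng.poisson(2.0, size=(100, 5)),
               rng.poisson(8.0, size=(100, 5))])
labels, _ = bregman_hard_clustering(X, k=2)
```

For a Gaussian family, the same template with the squared-norm generator yields squared Euclidean distances, i.e., ordinary Lloyd's algorithm.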
Related papers
- Adaptive Fuzzy C-Means with Graph Embedding [84.47075244116782]
Fuzzy clustering algorithms can be roughly categorized into two main groups: Fuzzy C-Means (FCM) based methods and mixture model based methods.
We propose a novel FCM-based clustering model that automatically learns an appropriate membership-degree hyperparameter value.
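As a baseline for the FCM family discussed in this entry, here is a minimal sketch of one standard Fuzzy C-Means update, not the paper's adaptive graph-embedding variant; `fcm_step` is a hypothetical name, and the fixed fuzzifier `m` is exactly the kind of hyperparameter the paper proposes to learn automatically.

```python
import numpy as np

def fcm_step(X, centers, m=2.0):
    # One vanilla FCM iteration: recompute memberships U from distances,
    # then recompute centers as U^m-weighted means.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1) + 1e-12  # (n, c)
    w = d2 ** (-1.0 / (m - 1.0))          # u_ij proportional to d_ij^(-2/(m-1))
    U = w / w.sum(axis=1, keepdims=True)  # rows sum to 1
    Um = U ** m
    centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
    return U, centers
```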
arXiv Detail & Related papers (2024-05-22T08:15:50Z)
- Learning a Gaussian Mixture for Sparsity Regularization in Inverse Problems [2.375943263571389]
In inverse problems, the incorporation of a sparsity prior yields a regularization effect on the solution.
We propose a probabilistic sparsity prior formulated as a mixture of Gaussians, capable of modeling sparsity with respect to a generic basis.
We put forth both a supervised and an unsupervised training strategy to estimate the parameters of this prior.
arXiv Detail & Related papers (2024-01-29T22:52:57Z)
- Fast Semisupervised Unmixing Using Nonconvex Optimization [80.11512905623417]
We introduce a novel convex model for semisupervised/library-based unmixing.
We demonstrate the efficacy of alternating optimization methods for sparse unmixing.
arXiv Detail & Related papers (2024-01-23T10:07:41Z)
- Optimal Clustering of Discrete Mixtures: Binomial, Poisson, Block Models, and Multi-layer Networks [9.57586103097079]
We study the fundamental limits of network clustering when a multi-layer network is present.
Under the mixture multi-layer stochastic block model (MMSBM), we show that the minimax optimal network clustering error rate takes an exponential form.
We propose a novel two-stage network clustering method including a tensor-based algorithm involving both node and sample splitting.
arXiv Detail & Related papers (2023-11-27T07:48:50Z) - Compound Batch Normalization for Long-tailed Image Classification [77.42829178064807]
We propose a compound batch normalization method based on a Gaussian mixture.
It can model the feature space more comprehensively and reduce the dominance of head classes.
The proposed method outperforms existing methods on long-tailed image classification.
arXiv Detail & Related papers (2022-12-02T07:31:39Z) - Optimal Clustering by Lloyd Algorithm for Low-Rank Mixture Model [12.868722327487752]
We propose a low-rank mixture model (LrMM) to treat matrix-valued observations.
A computationally efficient clustering method is designed by integrating Lloyd's algorithm and low-rank approximation.
Our method outperforms others in the literature on real-world datasets.
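A rough sketch of the integration described above, under stated assumptions (matrix-valued observations of shape (n, p, q), a common target rank r, Frobenius-distance assignments); `lowrank_lloyd` is an illustrative name, not the paper's exact LrMM procedure.

```python
import numpy as np

def lowrank_lloyd(X, k, r, n_iter=30, seed=0):
    # Lloyd's algorithm on matrix observations: assign by Frobenius
    # distance, then project each cluster-mean centroid to rank r.
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    centroids = X[rng.choice(n, size=k, replace=False)].astype(float)
    labels = np.zeros(n, dtype=int)
    for _ in range(n_iter):
        D = np.stack([((X - c) ** 2).sum(axis=(1, 2)) for c in centroids],
                     axis=1)               # (n, k) squared distances
        labels = D.argmin(axis=1)
        for j in range(k):
            pts = X[labels == j]
            if len(pts) == 0:
                continue
            M = pts.mean(axis=0)
            U, s, Vt = np.linalg.svd(M, full_matrices=False)
            s[r:] = 0.0                    # truncate to rank r
            centroids[j] = (U * s) @ Vt
    return labels, centroids
```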
arXiv Detail & Related papers (2022-07-11T03:16:10Z) - A Robust and Flexible EM Algorithm for Mixtures of Elliptical
Distributions with Missing Data [71.9573352891936]
This paper tackles the problem of missing data imputation for noisy and non-Gaussian data.
A new EM algorithm is investigated for mixtures of elliptical distributions with the property of handling potential missing data.
Experimental results on synthetic data demonstrate that the proposed algorithm is robust to outliers and can be used with non-Gaussian data.
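To give a flavor of EM in an elliptical family (deliberately omitting the missing-data handling that is this paper's contribution), here is a minimal single-component sketch for the multivariate Student-t, a classic elliptical distribution; `t_em_step` and the fixed degrees of freedom `nu` are illustrative assumptions.

```python
import numpy as np

def t_em_step(X, mu, Sigma, nu=4.0):
    # One EM step for a multivariate Student-t fit: points with large
    # Mahalanobis distance get small weights u, yielding robust updates.
    n, d = X.shape
    P = np.linalg.inv(Sigma)
    diff = X - mu
    m2 = np.einsum("nd,de,ne->n", diff, P, diff)  # squared Mahalanobis
    u = (nu + d) / (nu + m2)                      # E-step weights
    mu_new = (u[:, None] * X).sum(axis=0) / u.sum()
    diff = X - mu_new
    Sigma_new = (u[:, None, None]
                 * np.einsum("nd,ne->nde", diff, diff)).sum(axis=0) / n
    return mu_new, Sigma_new
```

Outliers receive weights u well below 1, which is what makes the update robust relative to a Gaussian EM step.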
arXiv Detail & Related papers (2022-01-28T10:01:37Z) - Learning Gaussian Mixtures with Generalised Linear Models: Precise
Asymptotics in High-dimensions [79.35722941720734]
Generalised linear models for multi-class classification problems are one of the fundamental building blocks of modern machine learning tasks.
We prove exact asymptotics characterising the estimator in high dimensions via empirical risk minimisation.
We discuss how our theory can be applied beyond the scope of synthetic data.
arXiv Detail & Related papers (2021-06-07T16:53:56Z) - Optimal Clustering in Anisotropic Gaussian Mixture Models [3.5590836605011047]
We study the clustering task under anisotropic Gaussian Mixture Models.
We characterize the dependence of signal-to-noise ratios on the cluster centers.
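The operational difference anisotropy makes is in the assignment rule; below is a minimal sketch assuming one shared covariance across clusters (`mahalanobis_assign` is a hypothetical helper): nearest center in Mahalanobis rather than Euclidean distance.

```python
import numpy as np

def mahalanobis_assign(X, centers, cov):
    # Likelihood-maximizing assignment for a shared-covariance GMM:
    # nearest center in Mahalanobis distance.
    P = np.linalg.inv(cov)                             # precision matrix
    diffs = X[:, None, :] - centers[None, :, :]        # (n, k, d)
    d2 = np.einsum("nkd,de,nke->nk", diffs, P, diffs)  # squared distances
    return d2.argmin(axis=1)
```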
arXiv Detail & Related papers (2021-01-14T00:31:52Z) - Clustering of non-Gaussian data by variational Bayes for normal inverse
Gaussian mixture models [0.0]
In practical situations, many datasets are non-Gaussian, exhibiting heavy tails and/or asymmetry.
For NIG mixture models, both the expectation-maximization (EM) method and variational Bayesian (VB) algorithms have been proposed.
We propose another VB algorithm for NIG mixture models that improves on the shortcomings of the existing approaches.
We also propose an extension of Dirichlet process mixture models to overcome the difficulty in determining the number of clusters.
arXiv Detail & Related papers (2020-09-13T14:13:27Z) - Model Fusion with Kullback--Leibler Divergence [58.20269014662046]
We propose a method to fuse posterior distributions learned from heterogeneous datasets.
Our algorithm relies on a mean field assumption for both the fused model and the individual dataset posteriors.
arXiv Detail & Related papers (2020-07-13T03:27:45Z)