Entropy regularization in probabilistic clustering
- URL: http://arxiv.org/abs/2307.10065v1
- Date: Wed, 19 Jul 2023 15:36:40 GMT
- Title: Entropy regularization in probabilistic clustering
- Authors: Beatrice Franzolini and Giovanni Rebaudo
- Abstract summary: We propose a novel Bayesian estimator of the clustering configuration.
The proposed estimator is equivalent to a post-processing procedure that reduces the number of sparsely-populated clusters.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Bayesian nonparametric mixture models are widely used to cluster
observations. However, one major drawback of the approach is that the estimated
partition often exhibits unbalanced cluster frequencies, with only a few
dominating clusters and a large number of sparsely-populated ones. This feature
translates into results that are often uninterpretable unless we are willing to
ignore a substantial number of observations and clusters. Interpreting the
posterior distribution as a penalized likelihood, we show how the imbalance can
be explained as a direct consequence of the cost functions involved in
estimating the partition. In light of our findings, we propose a novel Bayesian
estimator of the clustering configuration. The proposed estimator is equivalent
to a post-processing procedure that reduces the number of sparsely-populated
clusters and enhances interpretability. The procedure takes the form of
entropy-regularization of the Bayesian estimate. While computationally
convenient compared to alternative strategies, it is also theoretically
justified as a correction to the Bayesian loss function used for point
estimation and, as such, can be applied to any posterior distribution of
clusters, regardless of the specific model used.
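A minimal sketch of the idea, assuming posterior partition samples are available (e.g., from MCMC): select the point estimate by minimizing a Monte Carlo estimate of the expected Binder loss plus an entropy penalty on the cluster-size proportions. The choice of Binder loss, the sign and weight of the penalty (lam), and the use of the posterior samples themselves as the candidate set are illustrative assumptions, not the paper's exact specification.

```python
import numpy as np

def coclustering_matrix(partitions):
    """Posterior co-clustering probabilities from partition samples.

    partitions: (S, n) integer array; row s gives the cluster label of
    each of the n observations in posterior sample s."""
    S, n = partitions.shape
    P = np.zeros((n, n))
    for z in partitions:
        P += z[:, None] == z[None, :]
    return P / S

def expected_binder_loss(z, P):
    """Monte Carlo estimate of the expected Binder loss of candidate z."""
    same = (z[:, None] == z[None, :]).astype(float)
    return np.abs(same - P).sum() / 2.0

def cluster_entropy(z):
    """Shannon entropy of the cluster-size proportions of partition z."""
    _, counts = np.unique(z, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log(p)))

def entropy_regularized_estimate(partitions, lam=1.0):
    """Choose, among the posterior samples themselves, the partition
    minimizing expected Binder loss plus lam * entropy; with this sign
    the penalty discourages fragmented partitions (the all-singleton
    partition has maximal entropy)."""
    P = coclustering_matrix(partitions)
    scores = [expected_binder_loss(z, P) + lam * cluster_entropy(z)
              for z in partitions]
    return partitions[int(np.argmin(scores))]
```

Sweeping lam trades off fidelity to the posterior against parsimony: larger values push the reported partition toward fewer, better-populated clusters.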
Related papers
- Bayesian Renormalization [68.8204255655161]
We present a fully information theoretic approach to renormalization inspired by Bayesian statistical inference.
The main insight of Bayesian Renormalization is that the Fisher metric defines a correlation length that plays the role of an emergent RG scale.
We provide insight into how the Bayesian Renormalization scheme relates to existing methods for data compression and data generation.
arXiv Detail & Related papers (2023-05-17T18:00:28Z)
- A Statistical Model for Predicting Generalization in Few-Shot Classification [6.158812834002346]
We introduce a Gaussian model of the feature distribution to predict the generalization error.
We show that our approach outperforms alternatives such as the leave-one-out cross-validation strategy.
arXiv Detail & Related papers (2022-12-13T10:21:15Z)
- Compound Batch Normalization for Long-tailed Image Classification [77.42829178064807]
We propose a compound batch normalization method based on a Gaussian mixture.
It can model the feature space more comprehensively and reduce the dominance of head classes.
The proposed method outperforms existing methods on long-tailed image classification.
arXiv Detail & Related papers (2022-12-02T07:31:39Z)
- Lattice-Based Methods Surpass Sum-of-Squares in Clustering [98.46302040220395]
Clustering is a fundamental primitive in unsupervised learning.
Recent work has established lower bounds against the class of low-degree methods.
We show that, perhaps surprisingly, this particular clustering model does not exhibit a statistical-to-computational gap.
arXiv Detail & Related papers (2021-12-07T18:50:17Z)
- Deconfounding Scores: Feature Representations for Causal Effect Estimation with Weak Overlap [140.98628848491146]
We introduce deconfounding scores, which induce better overlap without biasing the target of estimation.
We show that deconfounding scores satisfy a zero-covariance condition that is identifiable in observed data.
In particular, we show that this technique could be an attractive alternative to standard regularizations.
arXiv Detail & Related papers (2021-04-12T18:50:11Z)
- A generalized Bayes framework for probabilistic clustering [3.3194866396158]
Loss-based clustering methods, such as k-means and its variants, are standard tools for finding groups in data.
Model-based clustering based on mixture models provides an alternative, but such methods face computational problems and are highly sensitive to the choice of kernel.
This article proposes a generalized Bayes framework that bridges these two paradigms through the use of Gibbs posteriors (see the sketch after this entry).
arXiv Detail & Related papers (2020-06-09T18:49:32Z)
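A minimal sketch of the Gibbs-posterior idea referenced in the entry above: the posterior over partitions is defined directly from a loss, p_w(z | X) ∝ exp(-w · loss(X, z)). The k-means-style within-cluster loss, uniform prior over labels, fixed K, and single-site Metropolis sampler below are all illustrative assumptions, not the article's actual construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def wcss(X, z, K):
    """Within-cluster sum of squares: the loss defining the Gibbs
    posterior. X is an (n, d) data array, z an (n,) label vector."""
    total = 0.0
    for k in range(K):
        pts = X[z == k]
        if len(pts) > 0:
            total += np.sum((pts - pts.mean(axis=0)) ** 2)
    return total

def gibbs_posterior_sampler(X, K=3, w=1.0, iters=2000):
    """Metropolis sampler targeting p_w(z | X) proportional to
    exp(-w * wcss(X, z, K)) under a uniform prior on label vectors."""
    n = len(X)
    z = rng.integers(K, size=n)
    cur = wcss(X, z, K)
    samples = []
    for _ in range(iters):
        i = rng.integers(n)
        old_label, old_loss = z[i], cur
        z[i] = rng.integers(K)               # propose relabeling one point
        cur = wcss(X, z, K)
        delta = cur - old_loss
        if delta > 0 and rng.random() >= np.exp(-w * delta):
            z[i], cur = old_label, old_loss  # reject: restore state
        samples.append(z.copy())
    return np.array(samples)
```

The weight w plays the role of a learning rate: larger values concentrate the Gibbs posterior around loss-minimizing partitions. The resulting samples could feed a point-estimation step such as the entropy-regularized one sketched earlier.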
- Sparse Cholesky covariance parametrization for recovering latent structure in ordered data [1.5349431582672617]
We focus on arbitrary zero patterns in the Cholesky factor of a covariance matrix.
For the ordered scenario, we propose a novel estimation method that is based on matrix loss penalization.
We give guidelines, based on the empirical results, about which of the methods analysed is more appropriate for each setting.
arXiv Detail & Related papers (2020-06-02T08:35:00Z)
- Nonparametric Score Estimators [49.42469547970041]
Estimating the score from a set of samples generated by an unknown distribution is a fundamental task in inference and learning of probabilistic models.
We provide a unifying view of these estimators under the framework of regularized nonparametric regression.
We propose score estimators based on iterative regularization that enjoy computational benefits from curl-free kernels and fast convergence.
arXiv Detail & Related papers (2020-05-20T15:01:03Z)
- Robust M-Estimation Based Bayesian Cluster Enumeration for Real Elliptically Symmetric Distributions [5.137336092866906]
Robustly determining the optimal number of clusters in a data set is an essential factor in a wide range of applications.
This article generalizes a Bayesian cluster enumeration criterion so that it can be used with any Real Elliptically Symmetric (RES) distributed mixture model (see the sketch after this entry).
We derive a robust criterion for data sets with finite sample size, and also provide an approximation to reduce the computational cost at large sample sizes.
arXiv Detail & Related papers (2020-05-04T11:44:49Z)
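As a hedged illustration of the enumeration idea in the entry above, the sketch below selects the number of clusters by minimizing an information criterion over candidate mixture sizes. It uses scikit-learn's plain Gaussian BIC as a stand-in for the article's robust RES-based criterion, so the criterion itself is an assumption; only the selection loop has the same shape.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def enumerate_clusters(X, max_k=10, seed=0):
    """Pick the number of clusters K by minimizing BIC over 1..max_k.

    A plain Gaussian mixture BIC stands in for the robust criterion
    derived in the article."""
    bics = []
    for k in range(1, max_k + 1):
        gm = GaussianMixture(n_components=k, random_state=seed).fit(X)
        bics.append(gm.bic(X))
    return int(np.argmin(bics)) + 1
```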
- Batch Stationary Distribution Estimation [98.18201132095066]
We consider the problem of approximating the stationary distribution of an ergodic Markov chain given a set of sampled transitions.
We propose a consistent estimator that is based on recovering a correction ratio function over the given data.
arXiv Detail & Related papers (2020-03-02T09:10:01Z)