Learning Group Importance using the Differentiable Hypergeometric
Distribution
- URL: http://arxiv.org/abs/2203.01629v5
- Date: Mon, 8 May 2023 07:56:13 GMT
- Title: Learning Group Importance using the Differentiable Hypergeometric
Distribution
- Authors: Thomas M. Sutter, Laura Manduchi, Alain Ryser, Julia E. Vogt
- Abstract summary: Partitioning elements into subsets of unknown sizes is essential in many applications.
In this work, we propose the differentiable hypergeometric distribution.
We show that we can learn the size of subsets in two typical applications: weakly-supervised learning and clustering.
- Score: 16.30064635746202
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Partitioning a set of elements into subsets of a priori unknown sizes is
essential in many applications. These subset sizes are rarely explicitly
learned - be it the cluster sizes in clustering applications or the number of
shared versus independent generative latent factors in weakly-supervised
learning. Probability distributions over correct combinations of subset sizes
are non-differentiable due to hard constraints, which prohibit gradient-based
optimization. In this work, we propose the differentiable hypergeometric
distribution. The hypergeometric distribution models the probability of
different group sizes based on their relative importance. We introduce
reparameterizable gradients to learn the importance between groups and
highlight the advantage of explicitly learning the size of subsets in two
typical applications: weakly-supervised learning and clustering. In both
applications, we outperform previous approaches, which rely on suboptimal
heuristics to model the unknown size of groups.
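The core idea of the abstract, making a discrete distribution over group sizes amenable to gradient-based learning, can be illustrated with a minimal sketch: a Gumbel-softmax relaxation over the support of a Fisher's noncentral hypergeometric count, where the group weights `w1` and `w2` play the role of learned importances. This is an illustrative simplification under assumed notation, not the paper's implementation; the function name and parameters are made up for the example.

```python
import math
import random

def relaxed_hypergeometric(m1, m2, n, w1, w2, tau=0.5, seed=0):
    """Gumbel-softmax relaxation of a Fisher's noncentral hypergeometric
    count: k items drawn from group 1 (size m1, importance w1) out of n
    total draws from two groups. Returns a relaxed, non-integer count."""
    rnd = random.Random(seed)
    # Feasible support of k given the group sizes.
    ks = list(range(max(0, n - m2), min(n, m1) + 1))
    # Unnormalized log-pmf: log C(m1,k) + log C(m2,n-k) + k*log w1 + (n-k)*log w2.
    logits = [math.log(math.comb(m1, k)) + math.log(math.comb(m2, n - k))
              + k * math.log(w1) + (n - k) * math.log(w2) for k in ks]
    # Perturb with Gumbel(0,1) noise and take a tempered softmax;
    # as tau -> 0 this approaches a hard one-hot sample over the support.
    noisy = [(l - math.log(-math.log(rnd.random()))) / tau for l in logits]
    mx = max(noisy)
    y = [math.exp(v - mx) for v in noisy]
    z = sum(y)
    # Relaxed count: expectation of k under the soft one-hot.
    return sum(k * yi / z for k, yi in zip(ks, y))

# Group 1 is twice as important as group 2, so its relaxed count
# tends toward the upper end of the feasible range [0, 8].
k_soft = relaxed_hypergeometric(m1=10, m2=10, n=8, w1=2.0, w2=1.0)
```

Because the Gumbel noise is the only source of randomness and the logits are smooth in `w1` and `w2`, the same construction yields reparameterizable gradients with respect to the importance weights when written in an autodiff framework.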
Related papers
- The generalized underlap coefficient with an application in clustering [2.140305355990306]
The underlap coefficient (UNL) is a multi-group separation measure.
We establish key properties of the UNL and provide an explicit connection to total variation.
We illustrate the application of the UNL in clustering using two real-world datasets.
arXiv Detail & Related papers (2026-02-23T03:35:08Z) - Making Foundation Models Probabilistic via Singular Value Ensembles [56.4174499669573]
Foundation models have become a dominant paradigm in machine learning, achieving remarkable performance across diverse tasks through large-scale pretraining.
The standard approach to quantifying uncertainty, training an ensemble of independent models, incurs prohibitive computational costs that scale linearly with ensemble size.
We propose the Singular Value Ensemble (SVE), a parameter-efficient implicit ensemble method that builds on a simple but powerful core assumption.
We show that SVE achieves uncertainty quantification comparable to explicit deep ensembles while increasing the parameter count of the base model by less than 1%.
arXiv Detail & Related papers (2026-01-29T18:07:18Z) - Towards Learnable Anchor for Deep Multi-View Clustering [49.767879678193005]
In this paper, we propose the Deep Multi-view Anchor Clustering (DMAC) model that performs clustering in linear time.
With the optimal anchors, the full sample graph is calculated to derive a discriminative embedding for clustering.
Experiments on several datasets demonstrate superior performance and efficiency of DMAC compared to state-of-the-art competitors.
arXiv Detail & Related papers (2025-03-16T09:38:11Z) - Self-Supervised Graph Embedding Clustering [70.36328717683297]
The K-means one-step dimensionality reduction clustering method has made some progress in addressing the curse of dimensionality in clustering tasks.
We propose a unified framework that integrates manifold learning with K-means, resulting in the self-supervised graph embedding framework.
arXiv Detail & Related papers (2024-09-24T08:59:51Z) - Causal K-Means Clustering [5.087519744951637]
Causal k-Means Clustering harnesses the widely-used k-means clustering algorithm to uncover the unknown subgroup structure.
We present a plug-in estimator which is simple and readily implementable using off-the-shelf algorithms.
Our proposed methods are especially useful for modern outcome-wide studies with multiple treatment levels.
arXiv Detail & Related papers (2024-05-05T23:59:51Z) - Outlier-Robust Group Inference via Gradient Space Clustering [50.87474101594732]
Existing methods can improve the worst-group performance, but they require group annotations, which are often expensive and sometimes infeasible to obtain.
We address the problem of learning group annotations in the presence of outliers by clustering the data in the space of gradients of the model parameters.
We show that data in the gradient space has a simpler structure while preserving information about minority groups and outliers, making it suitable for standard clustering methods like DBSCAN.
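The gradient-space grouping idea in the summary above can be sketched end to end: compute per-sample loss gradients under a simple model, then run a density-based clustering over those gradient vectors. This is a hedged toy sketch, not the paper's pipeline; the logistic model, the tiny DBSCAN, and the synthetic data are all illustrative assumptions.

```python
import math

def per_sample_gradients(X, y, w):
    """Gradient of the logistic loss for each sample under a linear model w."""
    grads = []
    for xi, yi in zip(X, y):
        p = 1.0 / (1.0 + math.exp(-sum(wj * xj for wj, xj in zip(w, xi))))
        grads.append([(p - yi) * xj for xj in xi])  # (p - y) * x
    return grads

def dbscan(pts, eps, min_pts):
    """Tiny DBSCAN: returns a cluster id per point, -1 marking noise."""
    def dist(a, b):
        return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))
    labels = [None] * len(pts)
    cid = 0
    for i in range(len(pts)):
        if labels[i] is not None:
            continue
        nbrs = [j for j in range(len(pts)) if dist(pts[i], pts[j]) <= eps]
        if len(nbrs) < min_pts:
            labels[i] = -1          # provisionally noise
            continue
        labels[i] = cid
        queue = list(nbrs)
        while queue:                # expand the cluster from core points
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cid     # noise reachable from a core point
            if labels[j] is not None:
                continue
            labels[j] = cid
            jn = [k for k in range(len(pts)) if dist(pts[j], pts[k]) <= eps]
            if len(jn) >= min_pts:
                queue.extend(jn)
        cid += 1
    return labels

# Synthetic data: identical inputs but flipped labels for a minority group,
# so their loss gradients point in the opposite direction and separate cleanly.
X = [[1.0, 0.0]] * 10 + [[1.0, 0.0]] * 5
y = [0] * 10 + [1] * 5
grads = per_sample_gradients(X, y, w=[0.0, 0.0])
labels = dbscan(grads, eps=0.1, min_pts=3)  # two clusters: majority vs. minority
```

The point of the sketch is that group structure invisible in input space can become linearly separated in gradient space, which is what makes off-the-shelf density clustering applicable.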
arXiv Detail & Related papers (2022-10-13T06:04:43Z) - A One-shot Framework for Distributed Clustered Learning in Heterogeneous
Environments [54.172993875654015]
The paper proposes a family of communication efficient methods for distributed learning in heterogeneous environments.
A one-shot approach, based on local computations at the users and a clustering-based aggregation step at the server, is shown to provide strong learning guarantees.
For strongly convex problems it is shown that, as long as the number of data points per user is above a threshold, the proposed approach achieves order-optimal mean-squared error rates in terms of the sample size.
arXiv Detail & Related papers (2022-09-22T09:04:10Z) - Supervised Multivariate Learning with Simultaneous Feature Auto-grouping
and Dimension Reduction [7.093830786026851]
This paper proposes a novel clustered reduced-rank learning framework.
It imposes two joint matrix regularizations to automatically group the features in constructing predictive factors.
It is more interpretable than low-rank modeling and relaxes the stringent sparsity assumption in variable selection.
arXiv Detail & Related papers (2021-12-17T20:11:20Z) - Lattice-Based Methods Surpass Sum-of-Squares in Clustering [98.46302040220395]
Clustering is a fundamental primitive in unsupervised learning.
Recent work has established lower bounds against the class of low-degree methods.
We show that, perhaps surprisingly, this particular clustering model does not exhibit a statistical-to-computational gap.
arXiv Detail & Related papers (2021-12-07T18:50:17Z) - Learning Debiased and Disentangled Representations for Semantic
Segmentation [52.35766945827972]
We propose a model-agnostic training scheme for semantic segmentation.
By randomly eliminating certain class information in each training iteration, we effectively reduce feature dependencies among classes.
Models trained with our approach demonstrate strong results on multiple semantic segmentation benchmarks.
arXiv Detail & Related papers (2021-10-31T16:15:09Z) - Learning Statistical Representation with Joint Deep Embedded Clustering [2.1267423178232407]
StatDEC is an unsupervised framework for joint statistical representation learning and clustering.
Our experiments show that using these representations, one can considerably improve results on imbalanced image clustering across a variety of image datasets.
arXiv Detail & Related papers (2021-09-11T09:26:52Z) - Examining and Combating Spurious Features under Distribution Shift [94.31956965507085]
We define and analyze robust and spurious representations using the information-theoretic concept of minimal sufficient statistics.
We prove that even when there is only bias of the input distribution, models can still pick up spurious features from their training data.
Inspired by our analysis, we demonstrate that group DRO can fail when groups do not directly account for various spurious correlations.
arXiv Detail & Related papers (2021-06-14T05:39:09Z) - Clustering-friendly Representation Learning via Instance Discrimination
and Feature Decorrelation [0.0]
We propose a clustering-friendly representation learning method using instance discrimination and feature decorrelation.
In evaluations of image clustering using CIFAR-10 and ImageNet-10, our method achieves accuracy of 81.5% and 95.4%, respectively.
arXiv Detail & Related papers (2021-05-31T22:59:31Z) - Statistical power for cluster analysis [0.0]
Cluster algorithms are increasingly popular in biomedical research.
We estimate power and accuracy for common analysis through simulation.
We recommend that researchers only apply cluster analysis when large subgroup separation is expected.
arXiv Detail & Related papers (2020-03-01T02:43:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.