KL Divergence Estimation with Multi-group Attribution
- URL: http://arxiv.org/abs/2202.13576v1
- Date: Mon, 28 Feb 2022 06:54:10 GMT
- Title: KL Divergence Estimation with Multi-group Attribution
- Authors: Parikshit Gopalan, Nina Narodytska, Omer Reingold, Vatsal Sharan, Udi Wieder
- Abstract summary: Estimating the Kullback-Leibler (KL) divergence between two distributions is well-studied in machine learning and information theory.
Motivated by considerations of multi-group fairness, we seek KL divergence estimates that accurately reflect the contributions of sub-populations.
- Score: 25.7757954754825
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Estimating the Kullback-Leibler (KL) divergence between two distributions
given samples from them is well-studied in machine learning and information
theory. Motivated by considerations of multi-group fairness, we seek KL
divergence estimates that accurately reflect the contributions of
sub-populations to the overall divergence. We model the sub-populations coming
from a rich (possibly infinite) family $\mathcal{C}$ of overlapping subsets of
the domain. We propose the notion of multi-group attribution for $\mathcal{C}$,
which requires that the estimated divergence conditioned on every
sub-population in $\mathcal{C}$ satisfies some natural accuracy and fairness
desiderata, such as ensuring that sub-populations where the model predicts
significant divergence do diverge significantly in the two distributions. Our
main technical contribution is to show that multi-group attribution can be
derived from the recently introduced notion of multi-calibration for importance
weights [HKRR18, GRSW21]. We provide experimental evidence to support our
theoretical results, and show that multi-group attribution provides better KL
divergence estimates when conditioned on sub-populations than other popular
algorithms.
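The two ingredients the abstract builds on, importance weights $w(x) = p(x)/q(x)$ and KL divergence estimates conditioned on overlapping sub-populations, can be illustrated on a toy discrete domain. The sketch below is illustrative only, not the paper's algorithm: the distributions, the subsets standing in for the family $\mathcal{C}$, and all function names are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy domain {0,...,7} with two known distributions p and q.
p = np.array([0.25, 0.15, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10])
q = np.array([0.10, 0.10, 0.10, 0.10, 0.15, 0.15, 0.15, 0.15])

w = p / q  # importance weights p(x)/q(x)

def kl_conditioned(p, q, subset):
    """KL between p and q after conditioning both on the sub-population `subset`."""
    mask = np.zeros(len(p), dtype=bool)
    mask[list(subset)] = True
    ps = p[mask] / p[mask].sum()
    qs = q[mask] / q[mask].sum()
    return float(np.sum(ps * np.log(ps / qs)))

# Overall divergence, and divergence on two overlapping sub-populations.
full = kl_conditioned(p, q, range(8))
groups = {"C1": {0, 1, 2, 3}, "C2": {2, 3, 4, 5}}
per_group = {name: kl_conditioned(p, q, s) for name, s in groups.items()}

# Sample-based estimate from draws of p, via KL(p||q) = E_p[log w(x)].
xs = rng.choice(8, size=50_000, p=p)
kl_hat = float(np.mean(np.log(w[xs])))
```

Multi-group attribution asks that an estimator's per-group values behave like `per_group` above: groups assigned large estimated divergence should genuinely diverge under the two conditional distributions.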
Related papers
- Almost Asymptotically Optimal Active Clustering Through Pairwise Observations [59.20614082241528]
We propose a new analysis framework for clustering $M$ items into an unknown number of $K$ distinct groups using noisy and actively collected responses. We establish a fundamental lower bound on the expected number of queries needed to achieve a desired confidence in the accuracy of the clustering. We develop a computationally feasible variant of the Generalized Likelihood Ratio statistic and show that its performance gap to the lower bound can be accurately empirically estimated.
arXiv Detail & Related papers (2026-02-05T14:16:47Z) - Exploration-free Algorithms for Multi-group Mean Estimation [7.480522058240762]
We address the problem of multi-group mean estimation, which seeks to allocate a finite sampling budget across multiple groups to obtain uniformly accurate estimates of their means. Unlike classical multi-armed bandits, whose objective is to minimize regret by identifying and exploiting the best arm, the optimal allocation in this setting requires sampling every group $\Theta(T)$ times.
arXiv Detail & Related papers (2025-10-12T00:20:30Z) - MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources [113.33902847941941]
Variance-Aware Sampling (VAS) is a data selection strategy guided by Variance Promotion Score (VPS). We release large-scale, carefully curated resources containing 1.6M long CoT cold-start data and 15k RL QA pairs. Experiments across mathematical reasoning benchmarks demonstrate the effectiveness of both the curated data and the proposed VAS.
arXiv Detail & Related papers (2025-09-25T14:58:29Z) - Generalized Cauchy-Schwarz Divergence and Its Deep Learning Applications [37.349358118385155]
Divergence measures play a central role and become increasingly essential in deep learning.
We introduce a new measure tailored for multiple distributions named the generalized Cauchy-Schwarz divergence (GCSD)
arXiv Detail & Related papers (2024-05-07T07:07:44Z) - How does promoting the minority fraction affect generalization? A theoretical study of the one-hidden-layer neural network on group imbalance [64.1656365676171]
Group imbalance has been a known problem in empirical risk minimization.
This paper quantifies the impact of individual groups on the sample complexity, the convergence rate, and the average and group-level testing performance.
arXiv Detail & Related papers (2024-03-12T04:38:05Z) - Optimal Multi-Distribution Learning [88.3008613028333]
Multi-distribution learning seeks to learn a shared model that minimizes the worst-case risk across $k$ distinct data distributions.
We propose a novel algorithm that yields an $\varepsilon$-optimal randomized hypothesis with a sample complexity on the order of $(d+k)/\varepsilon^2$.
arXiv Detail & Related papers (2023-12-08T16:06:29Z) - Bandit Pareto Set Identification: the Fixed Budget Setting [12.326452468513228]
We study a pure exploration problem in a multi-armed bandit model.
The goal is to identify the distributions whose mean is not uniformly worse than that of another distribution.
arXiv Detail & Related papers (2023-11-07T13:43:18Z) - Understanding Contrastive Learning via Distributionally Robust Optimization [29.202594242468678]
This study reveals the inherent tolerance of contrastive learning (CL) towards sampling bias, wherein negative samples may encompass similar semantics (e.g., labels).
We bridge this research gap by analyzing CL through the lens of distributionally robust optimization (DRO), yielding several key insights.
We also identify CL's potential shortcomings, including over-conservatism and sensitivity to outliers, and introduce a novel Adjusted InfoNCE loss (ADNCE) to mitigate these issues.
arXiv Detail & Related papers (2023-10-17T07:32:59Z) - Reweighted Mixup for Subpopulation Shift [63.1315456651771]
Subpopulation shift arises in many real-world applications: the training and test distributions contain the same subpopulation groups, but in different proportions.
Importance reweighting is a classical and effective way to handle the subpopulation shift.
We propose a simple yet practical framework, called reweighted mixup, to mitigate the overfitting issue.
arXiv Detail & Related papers (2023-04-09T03:44:50Z) - A Unified Framework for Multi-distribution Density Ratio Estimation [101.67420298343512]
Binary density ratio estimation (DRE) provides the foundation for many state-of-the-art machine learning algorithms.
We develop a general framework from the perspective of Bregman divergence minimization.
We show that our framework leads to methods that strictly generalize their counterparts in binary DRE.
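As a concrete instance of the binary DRE this framework generalizes, a density ratio $p/q$ can be read off from a probabilistic classifier trained to separate samples of $p$ (label 1) from samples of $q$ (label 0): with balanced classes, the classifier's odds estimate the ratio. A minimal numpy sketch under these assumptions (Gaussian toy data, hand-rolled logistic regression; all names are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Samples from two 1-D Gaussians playing the roles of p and q.
xp = rng.normal(0.5, 1.0, 2000)  # samples from p
xq = rng.normal(0.0, 1.0, 2000)  # samples from q

# Logistic classifier on features (x, 1): label 1 for p, 0 for q.
X = np.concatenate([xp, xq])
y = np.concatenate([np.ones_like(xp), np.zeros_like(xq)])
feats = np.stack([X, np.ones_like(X)], axis=1)

w = np.zeros(2)
for _ in range(2000):  # plain gradient descent on the mean logistic loss
    s = 1.0 / (1.0 + np.exp(-feats @ w))
    w -= 0.1 * feats.T @ (s - y) / len(y)

def ratio(x):
    """Estimated density ratio p(x)/q(x) as the classifier's odds s/(1-s)."""
    s = 1.0 / (1.0 + np.exp(-(w[0] * x + w[1])))
    return s / (1.0 - s)
```

For these two Gaussians the true log-ratio is linear in $x$ (slope $0.5$), so the learned slope `w[0]` should land near $0.5$; multi-distribution frameworks extend this binary construction to more than two distributions at once.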
arXiv Detail & Related papers (2021-12-07T01:23:20Z) - Unintended Selection: Persistent Qualification Rate Disparities and Interventions [6.006936459950188]
We study the dynamics of group-level disparities in machine learning.
In particular, we desire models that do not suppose inherent differences between artificial groups of people.
We show that differences in qualification rates between subpopulations can persist indefinitely for a set of non-trivial equilibrium states.
arXiv Detail & Related papers (2021-11-01T18:53:54Z) - Robust Learning of Optimal Auctions [84.13356290199603]
We study the problem of learning revenue-optimal multi-bidder auctions from samples when the samples of bidders' valuations can be adversarially corrupted or drawn from distributions that are adversarially perturbed.
We propose new algorithms that can learn a mechanism whose revenue is nearly optimal simultaneously for all "true distributions" that are $\alpha$-close to the original distribution in Kolmogorov-Smirnov distance.
arXiv Detail & Related papers (2021-07-13T17:37:21Z) - Decision-Making with Auto-Encoding Variational Bayes [71.44735417472043]
We show that a posterior approximation distinct from the variational distribution should be used for making decisions.
Motivated by these theoretical results, we propose learning several approximate proposals for the best model.
In addition to toy examples, we present a full-fledged case study of single-cell RNA sequencing.
arXiv Detail & Related papers (2020-02-17T19:23:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.