Distribution-Free Statistical Dispersion Control for Societal
Applications
- URL: http://arxiv.org/abs/2309.13786v2
- Date: Wed, 6 Mar 2024 14:18:52 GMT
- Authors: Zhun Deng, Thomas P. Zollo, Jake C. Snell, Toniann Pitassi, Richard
Zemel
- Abstract summary: Explicit finite-sample statistical guarantees on model performance are an important ingredient in responsible machine learning.
Previous work has focused mainly on bounding either the expected loss of a predictor or the probability that an individual prediction will incur a loss value in a specified range.
We propose a simple yet flexible framework that allows us to handle a much richer class of statistical functionals beyond previous work.
- Score: 16.43522470711466
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Explicit finite-sample statistical guarantees on model performance are an
important ingredient in responsible machine learning. Previous work has focused
mainly on bounding either the expected loss of a predictor or the probability
that an individual prediction will incur a loss value in a specified range.
However, for many high-stakes applications, it is crucial to understand and
control the dispersion of a loss distribution, or the extent to which different
members of a population experience unequal effects of algorithmic decisions. We
initiate the study of distribution-free control of statistical dispersion
measures with societal implications and propose a simple yet flexible framework
that allows us to handle a much richer class of statistical functionals beyond
previous work. Our methods are verified through experiments in toxic comment
detection, medical imaging, and film recommendation.
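As a rough illustration of what "dispersion of a loss distribution" can mean in practice, the sketch below computes the empirical Gini coefficient of a set of per-example losses, together with the Dvoretzky-Kiefer-Wolfowitz (DKW) half-width for a distribution-free confidence band on the loss CDF. Both choices are assumptions made for illustration: the abstract does not say which dispersion measures or bounding techniques the paper uses, and the function names `gini` and `dkw_epsilon` are hypothetical.

```python
import math


def gini(losses):
    """Empirical Gini coefficient of non-negative losses (0 = perfectly equal)."""
    xs = sorted(losses)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the order statistics x_(1) <= ... <= x_(n).
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


def dkw_epsilon(n, delta):
    """DKW half-width: with probability >= 1 - delta, the empirical CDF of
    n i.i.d. samples is within this distance of the true CDF everywhere."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))


if __name__ == "__main__":
    # Hypothetical per-example losses on a held-out validation set.
    losses = [0.0, 0.0, 0.0, 1.0]
    print("Gini coefficient:", gini(losses))
    print("DKW half-width (n=1000, delta=0.05):", dkw_epsilon(1000, 0.05))
```

The Gini value here is a plug-in estimate only. Turning it into a finite-sample guarantee would need a bound on the whole loss distribution, such as a CDF band of the kind `dkw_epsilon` describes, propagated through the functional.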
Related papers
- Multi-Source Conformal Inference Under Distribution Shift [41.701790856201036]
We consider the problem of obtaining distribution-free prediction intervals for a target population, leveraging multiple potentially biased data sources.
We derive the efficient influence functions for the quantiles of unobserved outcomes in the target and source populations.
We propose a data-adaptive strategy to upweight informative data sources for efficiency gain and downweight non-informative data sources for bias reduction.
(2024-05-15T13:33:09Z)
- Selective Prediction for Semantic Segmentation using Post-Hoc Confidence Estimation and Its Performance under Distribution Shift [1.2903829793534267]
We propose a novel image-level confidence measure tailored for semantic segmentation.
Our findings show that post-hoc confidence estimators offer a cost-effective approach to reducing the impacts of distribution shift.
(2024-02-16T13:14:12Z)
- Interpretable Causal Inference for Analyzing Wearable, Sensor, and Distributional Data [62.56890808004615]
We develop an interpretable method for distributional data analysis that ensures trustworthy and robust decision-making.
We demonstrate ADD MALTS' utility by studying the effectiveness of continuous glucose monitors in mitigating diabetes risks.
(2023-12-17T00:42:42Z)
- Targeted Machine Learning for Average Causal Effect Estimation Using the Front-Door Functional [3.0232957374216953]
Evaluating the average causal effect (ACE) of a treatment on an outcome often involves overcoming the challenges posed by confounding factors in observational studies.
Here, we introduce novel estimation strategies for the front-door criterion based on the targeted minimum loss-based estimation theory.
We demonstrate the applicability of these estimators to analyze the effect of early stage academic performance on future yearly income.
(2023-12-15T22:04:53Z)
- Conformal Loss-Controlling Prediction [23.218535051437588]
Conformal prediction is a learning framework controlling prediction coverage of prediction sets.
This work proposes a learning framework named conformal loss-controlling prediction, which extends conformal prediction to the situation where the value of a loss function needs to be controlled.
(2023-01-06T08:58:49Z)
- Quantile Risk Control: A Flexible Framework for Bounding the Probability of High-Loss Predictions [11.842061466957686]
We propose a flexible framework to produce a family of bounds on quantiles of the loss distribution incurred by a predictor.
We show that a quantile is an informative way of quantifying predictive performance, and that our framework applies to a variety of quantile-based metrics.
(2022-12-27T22:08:29Z)
- Interpretable Social Anchors for Human Trajectory Forecasting in Crowds [84.20437268671733]
We propose a neural network-based system to predict human trajectory in crowds.
We learn interpretable rule-based intents, and then utilise the expressibility of neural networks to model scene-specific residual.
Our architecture is tested on the interaction-centric benchmark TrajNet++.
(2021-05-07T09:22:34Z)
- Estimating Structural Target Functions using Machine Learning and Influence Functions [103.47897241856603]
We propose a new framework for statistical machine learning of target functions arising as identifiable functionals from statistical models.
This framework is problem- and model-agnostic and can be used to estimate a broad variety of target parameters of interest in applied statistics.
We put particular focus on so-called coarsening at random/doubly robust problems with partially unobserved information.
(2020-08-14T16:48:29Z)
- An Uncertainty-based Human-in-the-loop System for Industrial Tool Wear Analysis [68.8204255655161]
We show that uncertainty measures based on Monte-Carlo dropout in the context of a human-in-the-loop system increase the system's transparency and performance.
A simulation study demonstrates that the uncertainty-based human-in-the-loop system increases performance for different levels of human involvement.
(2020-07-14T15:47:37Z)
- Enabling Counterfactual Survival Analysis with Balanced Representations [64.17342727357618]
Survival data are frequently encountered across diverse medical applications, i.e., drug development, risk profiling, and clinical trials.
We propose a theoretically grounded unified framework for counterfactual inference applicable to survival outcomes.
(2020-06-14T01:15:00Z)
- GenDICE: Generalized Offline Estimation of Stationary Values [108.17309783125398]
We show that effective estimation can still be achieved in important applications.
Our approach is based on estimating a ratio that corrects for the discrepancy between the stationary and empirical distributions.
The resulting algorithm, GenDICE, is straightforward and effective.
(2020-02-21T00:27:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.