Estimating Fréchet bounds for validating programmatic weak supervision
- URL: http://arxiv.org/abs/2312.04601v1
- Date: Thu, 7 Dec 2023 07:15:11 GMT
- Title: Estimating Fréchet bounds for validating programmatic weak supervision
- Authors: Felipe Maia Polo, Mikhail Yurochkin, Moulinath Banerjee, Subha Maity,
Yuekai Sun
- Abstract summary: We develop methods for estimating Fréchet bounds on (possibly high-dimensional) distribution classes in which some variables are continuous-valued.
We demonstrate the usefulness of our algorithms by evaluating the performance of machine learning (ML) models trained with programmatic weak supervision (PWS).
- Score: 50.13475056199486
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We develop methods for estimating Fréchet bounds on (possibly
high-dimensional) distribution classes in which some variables are
continuous-valued. We establish the statistical correctness of the computed
bounds under uncertainty in the marginal constraints and demonstrate the
usefulness of our algorithms by evaluating the performance of machine learning
(ML) models trained with programmatic weak supervision (PWS). PWS is a
framework for principled learning from weak supervision inputs (e.g.,
crowdsourced labels, knowledge bases, pre-trained models on related tasks,
etc.), and it has achieved remarkable success in many areas of science and
engineering. Unfortunately, it is generally difficult to validate the
performance of ML models trained with PWS due to the absence of labeled data.
Our algorithms address this issue by estimating sharp lower and upper bounds
for performance metrics such as accuracy/recall/precision/F1 score.
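As a rough illustration of the idea behind the abstract, the sketch below uses the classical Fréchet bounds for two events, max(0, P(A) + P(B) - 1) <= P(A and B) <= min(P(A), P(B)), to bound the accuracy of a binary classifier when only the label marginal and the prediction marginal are known. This is the textbook discrete case, not the estimation procedure developed in the paper, which handles continuous-valued variables and uncertainty in the marginals. The function names `frechet_bounds` and `accuracy_bounds` are hypothetical and chosen only for this example.

```python
from typing import Tuple


def frechet_bounds(p_a: float, p_b: float) -> Tuple[float, float]:
    """Classical Frechet bounds on P(A and B) given only P(A) and P(B)."""
    lower = max(0.0, p_a + p_b - 1.0)
    upper = min(p_a, p_b)
    return lower, upper


def accuracy_bounds(p_y1: float, p_pred1: float) -> Tuple[float, float]:
    """Bounds on binary accuracy P(Y = Yhat) from the two marginals alone.

    Assumption for this sketch: no joint information about (Y, Yhat) is
    available, only P(Y = 1) and P(Yhat = 1).
    """
    # Accuracy = P(Y=1, Yhat=1) + P(Y=0, Yhat=0).
    # Writing j = P(Y=1, Yhat=1), we have P(Y=0, Yhat=0) = 1 - p_y1 - p_pred1 + j,
    # so accuracy = 1 - p_y1 - p_pred1 + 2*j, which is increasing in j.
    lo_j, hi_j = frechet_bounds(p_y1, p_pred1)
    base = 1.0 - p_y1 - p_pred1
    return base + 2.0 * lo_j, base + 2.0 * hi_j


if __name__ == "__main__":
    lo, hi = accuracy_bounds(0.3, 0.4)
    print(f"{lo:.2f} <= accuracy <= {hi:.2f}")  # prints "0.30 <= accuracy <= 0.90"
```

With marginals alone the interval is wide. Additional observed variables, such as weak labels, constrain the joint distribution further, which is what makes tighter bounds possible in a weak supervision setting.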
Related papers
- Worst-Case Convergence Time of ML Algorithms via Extreme Value Theory [8.540426791244533]
This paper leverages the statistics of extreme values to predict the worst-case convergence times of machine learning algorithms.
Timing is a critical non-functional property of ML systems, and providing the worst-case convergence times is essential to guarantee the availability of ML and its services.
arXiv Detail & Related papers (2024-04-10T17:05:12Z)
- LTAU-FF: Loss Trajectory Analysis for Uncertainty in Atomistic Force Fields [5.396675151318325]
Model ensembles are effective tools for estimating prediction uncertainty in deep learning atomistic force fields.
However, their widespread adoption is hindered by high computational costs and overconfident error estimates.
We address these challenges by leveraging distributions of per-sample errors obtained during training and employing a distance-based similarity search in the model latent space.
Our method, which we call LTAU, efficiently estimates the full probability distribution function (PDF) of errors for any test point using the logged training errors.
arXiv Detail & Related papers (2024-02-01T18:50:42Z)
- Doubly Robust Instance-Reweighted Adversarial Training [107.40683655362285]
We propose a novel doubly-robust instance reweighted adversarial framework.
Our importance weights are obtained by optimizing the KL-divergence regularized loss function.
Our proposed approach outperforms related state-of-the-art baseline methods in terms of average robust performance.
arXiv Detail & Related papers (2023-08-01T06:16:18Z)
- Making Pre-trained Language Models both Task-solvers and Self-calibrators [52.98858650625623]
Pre-trained language models (PLMs) serve as backbones for various real-world systems.
Previous work shows that introducing an extra calibration task can mitigate this issue.
We propose a training algorithm LM-TOAST to tackle the challenges.
arXiv Detail & Related papers (2023-07-21T02:51:41Z)
- Robustness, Evaluation and Adaptation of Machine Learning Models in the Wild [4.304803366354879]
We study causes of impaired robustness to domain shifts and present algorithms for training domain robust models.
A key source of model brittleness is domain overfitting, which our new training algorithms suppress while encouraging domain-general hypotheses.
arXiv Detail & Related papers (2023-03-05T21:41:16Z)
- Modeling Disagreement in Automatic Data Labelling for Semi-Supervised Learning in Clinical Natural Language Processing [2.016042047576802]
We investigate the quality of uncertainty estimates from a range of current state-of-the-art predictive models applied to the problem of observation detection in radiology reports.
arXiv Detail & Related papers (2022-05-29T20:20:49Z)
- FORML: Learning to Reweight Data for Fairness [2.105564340986074]
We introduce Fairness Optimized Reweighting via Meta-Learning (FORML).
FORML balances fairness constraints and accuracy by jointly optimizing training sample weights and a neural network's parameters.
We show that FORML improves equality of opportunity fairness criteria over existing state-of-the-art reweighting methods by approximately 1% on image classification tasks and by approximately 5% on a face prediction task.
arXiv Detail & Related papers (2022-02-03T17:36:07Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence, predicting accuracy as the fraction of unlabeled examples for which model confidence exceeds that threshold.
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- Accurate and Robust Feature Importance Estimation under Distribution Shifts [49.58991359544005]
PRoFILE is a novel feature importance estimation method.
We show significant improvements over state-of-the-art approaches, both in terms of fidelity and robustness.
arXiv Detail & Related papers (2020-09-30T05:29:01Z)
- How Training Data Impacts Performance in Learning-based Control [67.7875109298865]
This paper derives an analytical relationship between the density of the training data and the control performance.
We formulate a quality measure for the data set, which we refer to as $\rho$-gap.
We show how the $\rho$-gap can be applied to a feedback linearizing control law.
arXiv Detail & Related papers (2020-05-25T12:13:49Z)