Related papers: Data-light Uncertainty Set Merging with Admissibility: Synthetics, Aggregation, and Test Inversion

Data-light Uncertainty Set Merging with Admissibility: Synthetics, Aggregation, and Test Inversion

URL: http://arxiv.org/abs/2410.12201v2
Date: Sat, 31 May 2025 05:38:16 GMT
Title: Data-light Uncertainty Set Merging with Admissibility: Synthetics, Aggregation, and Test Inversion
Authors: Shenghao Qin, Jianliang He, Qi Kuang, Bowen Gang, Yin Xia,
Abstract summary: This article introduces a Synthetics, Aggregation, and Test inversion (SAT) approach for merging diverse and potentially dependent uncertainty sets into a single unified set.<n>SAT is motivated by the challenge of integrating uncertainty sets when only the initial sets and their control levels are available.<n>Key theoretical contribution is a rigorous analysis of SAT's properties, including a proof of its admissibility in the context of deterministic set merging.
Score: 3.4136908117644693
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This article introduces a Synthetics, Aggregation, and Test inversion (SAT) approach for merging diverse and potentially dependent uncertainty sets into a single unified set. The procedure is data-light, relying only on initial sets and control levels, and it adapts to any user-specified initial uncertainty sets, accommodating potentially varying coverage levels. SAT is motivated by the challenge of integrating uncertainty sets when only the initial sets and their control levels are available - for example, when merging confidence sets from distributed sites under communication constraints or combining conformal prediction sets generated by different algorithms or data splits. To address this, SAT constructs and aggregates novel synthetic test statistics, and then derive merged sets through test inversion. Our method leverages the duality between set estimation and hypothesis testing, ensuring reliable coverage in dependent scenarios. A key theoretical contribution is a rigorous analysis of SAT's properties, including a proof of its admissibility in the context of deterministic set merging. Both theoretical analyses and empirical results confirm the method's finite-sample coverage validity and desirable set sizes.

Related papers

COIN: Uncertainty-Guarding Selective Question Answering for Foundation Models with Provable Risk Guarantees [51.5976496056012]
COIN is an uncertainty-guarding selection framework that calibrates statistically valid thresholds to filter a single generated answer per question.<n>COIN estimates the empirical error rate on a calibration set and applies confidence interval methods to establish a high-probability upper bound on the true error rate.<n>We demonstrate COIN's robustness in risk control, strong test-time power in retaining admissible answers, and predictive efficiency under limited calibration data.
arXiv Detail & Related papers (2025-06-25T07:04:49Z)
A kernel conditional two-sample test [5.503626337185689]
We transform confidence bounds of a learning method into a conditional two-sample test.<n>We introduce bootstrapping schemes to avoid tuning inaccessible parameters.<n>Our results establish a comprehensive foundation for conditional two-sample testing.
arXiv Detail & Related papers (2025-06-04T12:53:13Z)
SConU: Selective Conformal Uncertainty in Large Language Models [59.25881667640868]
We propose a novel approach termed Selective Conformal Uncertainty (SConU) We develop two conformal p-values that are instrumental in determining whether a given sample deviates from the uncertainty distribution of the calibration set at a specific manageable risk level. Our approach not only facilitates rigorous management of miscoverage rates across both single-domain and interdisciplinary contexts, but also enhances the efficiency of predictions.
arXiv Detail & Related papers (2025-04-19T03:01:45Z)
A calibration test for evaluating set-based epistemic uncertainty representations [25.768233719182742]
We propose a novel statistical test to determine whether there is a convex combination of the set's predictions that is calibrated in distribution. In contrast to previous methods, our framework allows the convex combination to be instance dependent, recognizing that different ensemble members may be better calibrated in different regions of the input space.
arXiv Detail & Related papers (2025-02-22T17:10:45Z)
Credal Two-Sample Tests of Epistemic Uncertainty [34.42566984003255]
We introduce credal two-sample testing, a new hypothesis testing framework for comparing credal sets. By generalising two-sample test to compare credal sets, our framework enables reasoning for equality, inclusion, intersection, and mutual exclusivity.
arXiv Detail & Related papers (2024-10-16T18:09:09Z)
Provably Reliable Conformal Prediction Sets in the Presence of Data Poisoning [53.42244686183879]
Conformal prediction provides model-agnostic and distribution-free uncertainty quantification.<n>Yet, conformal prediction is not reliable under poisoning attacks where adversaries manipulate both training and calibration data.<n>We propose reliable prediction sets (RPS): the first efficient method for constructing conformal prediction sets with provable reliability guarantees under poisoning.
arXiv Detail & Related papers (2024-10-13T15:37:11Z)
Federated Nonparametric Hypothesis Testing with Differential Privacy Constraints: Optimal Rates and Adaptive Tests [5.3595271893779906]
Federated learning has attracted significant recent attention due to its applicability across a wide range of settings where data is collected and analyzed across disparate locations. We study federated nonparametric goodness-of-fit testing in the white-noise-with-drift model under distributed differential privacy (DP) constraints.
arXiv Detail & Related papers (2024-06-10T19:25:19Z)
Likelihood Ratio Confidence Sets for Sequential Decision Making [51.66638486226482]
We revisit the likelihood-based inference principle and propose to use likelihood ratios to construct valid confidence sequences. Our method is especially suitable for problems with well-specified likelihoods. We show how to provably choose the best sequence of estimators and shed light on connections to online convex optimization.
arXiv Detail & Related papers (2023-11-08T00:10:21Z)
Conformal Language Modeling [61.94417935386489]
We propose a novel approach to conformal prediction for generative language models (LMs) Standard conformal prediction produces prediction sets with rigorous, statistical guarantees. We demonstrate the promise of our approach on multiple tasks in open-domain question answering, text summarization, and radiology report generation.
arXiv Detail & Related papers (2023-06-16T21:55:08Z)
Sequential Predictive Two-Sample and Independence Testing [114.4130718687858]
We study the problems of sequential nonparametric two-sample and independence testing. We build upon the principle of (nonparametric) testing by betting.
arXiv Detail & Related papers (2023-04-29T01:30:33Z)
Sequential Kernelized Independence Testing [101.22966794822084]
We design sequential kernelized independence tests inspired by kernelized dependence measures. We demonstrate the power of our approaches on both simulated and real data.
arXiv Detail & Related papers (2022-12-14T18:08:42Z)
Mutual Exclusivity Training and Primitive Augmentation to Induce Compositionality [84.94877848357896]
Recent datasets expose the lack of the systematic generalization ability in standard sequence-to-sequence models. We analyze this behavior of seq2seq models and identify two contributing factors: a lack of mutual exclusivity bias and the tendency to memorize whole examples. We show substantial empirical improvements using standard sequence-to-sequence models on two widely-used compositionality datasets.
arXiv Detail & Related papers (2022-11-28T17:36:41Z)
Model-Free Sequential Testing for Conditional Independence via Testing by Betting [8.293345261434943]
The proposed test allows researchers to analyze an incoming i.i.d. data stream with any arbitrary dependency structure. We allow the processing of data points online as soon as they arrive and stop data acquisition once significant results are detected.
arXiv Detail & Related papers (2022-10-01T20:05:33Z)
Conditional Independence Testing via Latent Representation Learning [2.566492438263125]
LCIT (Latent representation based Conditional Independence Test) is a novel non-parametric method for conditional independence testing based on representation learning. Our main contribution involves proposing a generative framework in which to test for the independence between X and Y given Z.
arXiv Detail & Related papers (2022-09-04T07:16:03Z)
Approximate Conditional Coverage via Neural Model Approximations [0.030458514384586396]
We analyze a data-driven procedure for obtaining empirically reliable approximate conditional coverage. We demonstrate the potential for substantial (and otherwise unknowable) under-coverage with split-conformal alternatives with marginal coverage guarantees.
arXiv Detail & Related papers (2022-05-28T02:59:05Z)
Calibrating Over-Parametrized Simulation Models: A Framework via Eligibility Set [3.862247454265944]
We develop a framework to develop calibration schemes that satisfy rigorous frequentist statistical guarantees. We demonstrate our methodology on several numerical examples, including an application to calibration of a limit order book market simulator.
arXiv Detail & Related papers (2021-05-27T00:59:29Z)
Neural Ensemble Search for Uncertainty Estimation and Dataset Shift [67.57720300323928]
Ensembles of neural networks achieve superior performance compared to stand-alone networks in terms of accuracy, uncertainty calibration and robustness to dataset shift. We propose two methods for automatically constructing ensembles with emphvarying architectures. We show that the resulting ensembles outperform deep ensembles not only in terms of accuracy but also uncertainty calibration and robustness to dataset shift.
arXiv Detail & Related papers (2020-06-15T17:38:15Z)
The Simulator: Understanding Adaptive Sampling in the Moderate-Confidence Regime [52.38455827779212]
We propose a novel technique for analyzing adaptive sampling called the em Simulator. We prove the first instance-based lower bounds the top-k problem which incorporate the appropriate log-factors. Our new analysis inspires a simple and near-optimal for the best-arm and top-k identification, the first em practical of its kind for the latter problem.
arXiv Detail & Related papers (2017-02-16T23:42:02Z)

This list is automatically generated from the titles and abstracts of the papers in this site.