Related papers: Risk-limiting Financial Audits via Weighted Sampling without Replacement

Risk-limiting Financial Audits via Weighted Sampling without Replacement

URL: http://arxiv.org/abs/2305.06884v1
Date: Mon, 8 May 2023 17:34:06 GMT
Title: Risk-limiting Financial Audits via Weighted Sampling without Replacement
Authors: Shubhanshu Shekhar, Ziyu Xu, Zachary C. Lipton, Pierre J. Liang, Aaditya Ramdas
Abstract summary: Given $N$ transactions, the goal is to estimate the total misstated monetary fraction to a given accuracy. We do this by constructing new confidence sequences for the weighted average of $N$ unknown values. We show that when the side information is sufficiently predictive, it can directly drive the sampling.
Score: 47.189919138260066
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We introduce the notion of a risk-limiting financial auditing (RLFA): given $N$ transactions, the goal is to estimate the total misstated monetary fraction~($m^*$) to a given accuracy $\epsilon$, with confidence $1-\delta$. We do this by constructing new confidence sequences (CSs) for the weighted average of $N$ unknown values, based on samples drawn without replacement according to a (randomized) weighted sampling scheme. Using the idea of importance weighting to construct test martingales, we first develop a framework to construct CSs for arbitrary sampling strategies. Next, we develop methods to improve the quality of CSs by incorporating side information about the unknown values associated with each item. We show that when the side information is sufficiently predictive, it can directly drive the sampling. Addressing the case where the accuracy is unknown a priori, we introduce a method that incorporates side information via control variates. Crucially, our construction is adaptive: if the side information is highly predictive of the unknown misstated amounts, then the benefits of incorporating it are significant; but if the side information is uncorrelated, our methods learn to ignore it. Our methods recover state-of-the-art bounds for the special case when the weights are equal, which has already found applications in election auditing. The harder weighted case solves our more challenging problem of AI-assisted financial auditing.

Related papers

Is Offline Decision Making Possible with Only Few Samples? Reliable Decisions in Data-Starved Bandits via Trust Region Enhancement [25.68354404229254]
We show that even in a data-starved setting it may still be possible to find a policy competitive with the optimal one. This paves the way to reliable decision-making in settings where critical decisions must be made by relying only on a handful of samples.
arXiv Detail & Related papers (2024-02-24T03:41:09Z)
Noise-Adaptive Confidence Sets for Linear Bandits and Application to Bayesian Optimization [15.275864909088577]
Adapting to a priori unknown noise level is a very important but challenging problem in sequential decision-making. We propose a novel confidence set that is semi-adaptive' to the unknown sub-Gaussian parameter $sigma_*2$. For bounded rewards, we propose a novel variance-adaptive confidence set that has much improved numerical performance upon prior art.
arXiv Detail & Related papers (2024-02-12T00:19:09Z)
Corruption-Robust Algorithms with Uncertainty Weighting for Nonlinear Contextual Bandits and Markov Decision Processes [59.61248760134937]
We propose an efficient algorithm to achieve a regret of $tildeO(sqrtT+zeta)$. The proposed algorithm relies on the recently developed uncertainty-weighted least-squares regression from linear contextual bandit. We generalize our algorithm to the episodic MDP setting and first achieve an additive dependence on the corruption level $zeta$.
arXiv Detail & Related papers (2022-12-12T15:04:56Z)
CertainNet: Sampling-free Uncertainty Estimation for Object Detection [65.28989536741658]
Estimating the uncertainty of a neural network plays a fundamental role in safety-critical settings. In this work, we propose a novel sampling-free uncertainty estimation method for object detection. We call it CertainNet, and it is the first to provide separate uncertainties for each output signal: objectness, class, location and size.
arXiv Detail & Related papers (2021-10-04T17:59:31Z)
Confidence-Budget Matching for Sequential Budgeted Learning [69.77435313099366]
We formalize decision-making problems with querying budget. We consider multi-armed bandits, linear bandits, and reinforcement learning problems. We show that CBM based algorithms perform well in the presence of adversity.
arXiv Detail & Related papers (2021-02-05T19:56:31Z)
Uncertainty quantification using martingales for misspecified Gaussian processes [52.22233158357913]
We address uncertainty quantification for Gaussian processes (GPs) under misspecified priors. We construct a confidence sequence (CS) for the unknown function using martingale techniques. Our CS is statistically valid and empirically outperforms standard GP methods.
arXiv Detail & Related papers (2020-06-12T17:58:59Z)
Confidence sequences for sampling without replacement [39.98103047898974]
We present a suite of tools for designing confidence sequences for $thetastar$. A CS is a sequence of confidence sets $(C_n)_n=1N$, that shrink in size, and all contain $thetastar$ simultaneously with high probability. We then present Hoeffding- and empirical-Bernstein-type time-uniform CSs and fixed-time confidence intervals for sampling WoR.
arXiv Detail & Related papers (2020-06-08T04:30:25Z)
Debiased Off-Policy Evaluation for Recommendation Systems [8.63711086812655]
A/B tests are reliable, but are time- and money-consuming, and entail a risk of failure. We develop an alternative method, which predicts the performance of algorithms given historical data. Our method produces smaller mean squared errors than state-of-the-art methods.
arXiv Detail & Related papers (2020-02-20T02:30:02Z)
The Simulator: Understanding Adaptive Sampling in the Moderate-Confidence Regime [52.38455827779212]
We propose a novel technique for analyzing adaptive sampling called the em Simulator. We prove the first instance-based lower bounds the top-k problem which incorporate the appropriate log-factors. Our new analysis inspires a simple and near-optimal for the best-arm and top-k identification, the first em practical of its kind for the latter problem.
arXiv Detail & Related papers (2017-02-16T23:42:02Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.