Risk-limiting Financial Audits via Weighted Sampling without Replacement
- URL: http://arxiv.org/abs/2305.06884v1
- Date: Mon, 8 May 2023 17:34:06 GMT
- Title: Risk-limiting Financial Audits via Weighted Sampling without Replacement
- Authors: Shubhanshu Shekhar, Ziyu Xu, Zachary C. Lipton, Pierre J. Liang,
Aaditya Ramdas
- Abstract summary: Given $N$ transactions, the goal is to estimate the total misstated monetary fraction to a given accuracy.
We do this by constructing new confidence sequences for the weighted average of $N$ unknown values.
We show that when the side information is sufficiently predictive, it can directly drive the sampling.
- Score: 47.189919138260066
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce the notion of a risk-limiting financial auditing (RLFA): given
$N$ transactions, the goal is to estimate the total misstated monetary
fraction~($m^*$) to a given accuracy $\epsilon$, with confidence $1-\delta$. We
do this by constructing new confidence sequences (CSs) for the weighted average
of $N$ unknown values, based on samples drawn without replacement according to
a (randomized) weighted sampling scheme. Using the idea of importance weighting
to construct test martingales, we first develop a framework to construct CSs
for arbitrary sampling strategies. Next, we develop methods to improve the
quality of CSs by incorporating side information about the unknown values
associated with each item. We show that when the side information is
sufficiently predictive, it can directly drive the sampling. Addressing the
case where the accuracy is unknown a priori, we introduce a method that
incorporates side information via control variates. Crucially, our construction
is adaptive: if the side information is highly predictive of the unknown
misstated amounts, then the benefits of incorporating it are significant; but
if the side information is uncorrelated, our methods learn to ignore it. Our
methods recover state-of-the-art bounds for the special case when the weights
are equal, which has already found applications in election auditing. The
harder weighted case solves our more challenging problem of AI-assisted
financial auditing.
Related papers
- Is Offline Decision Making Possible with Only Few Samples? Reliable
Decisions in Data-Starved Bandits via Trust Region Enhancement [25.68354404229254]
We show that even in a data-starved setting it may still be possible to find a policy competitive with the optimal one.
This paves the way to reliable decision-making in settings where critical decisions must be made by relying only on a handful of samples.
arXiv Detail & Related papers (2024-02-24T03:41:09Z) - Noise-Adaptive Confidence Sets for Linear Bandits and Application to Bayesian Optimization [15.275864909088577]
Adapting to a priori unknown noise level is a very important but challenging problem in sequential decision-making.
We propose a novel confidence set that is semi-adaptive' to the unknown sub-Gaussian parameter $sigma_*2$.
For bounded rewards, we propose a novel variance-adaptive confidence set that has much improved numerical performance upon prior art.
arXiv Detail & Related papers (2024-02-12T00:19:09Z) - Corruption-Robust Algorithms with Uncertainty Weighting for Nonlinear
Contextual Bandits and Markov Decision Processes [59.61248760134937]
We propose an efficient algorithm to achieve a regret of $tildeO(sqrtT+zeta)$.
The proposed algorithm relies on the recently developed uncertainty-weighted least-squares regression from linear contextual bandit.
We generalize our algorithm to the episodic MDP setting and first achieve an additive dependence on the corruption level $zeta$.
arXiv Detail & Related papers (2022-12-12T15:04:56Z) - CertainNet: Sampling-free Uncertainty Estimation for Object Detection [65.28989536741658]
Estimating the uncertainty of a neural network plays a fundamental role in safety-critical settings.
In this work, we propose a novel sampling-free uncertainty estimation method for object detection.
We call it CertainNet, and it is the first to provide separate uncertainties for each output signal: objectness, class, location and size.
arXiv Detail & Related papers (2021-10-04T17:59:31Z) - Confidence-Budget Matching for Sequential Budgeted Learning [69.77435313099366]
We formalize decision-making problems with querying budget.
We consider multi-armed bandits, linear bandits, and reinforcement learning problems.
We show that CBM based algorithms perform well in the presence of adversity.
arXiv Detail & Related papers (2021-02-05T19:56:31Z) - Uncertainty quantification using martingales for misspecified Gaussian
processes [52.22233158357913]
We address uncertainty quantification for Gaussian processes (GPs) under misspecified priors.
We construct a confidence sequence (CS) for the unknown function using martingale techniques.
Our CS is statistically valid and empirically outperforms standard GP methods.
arXiv Detail & Related papers (2020-06-12T17:58:59Z) - Confidence sequences for sampling without replacement [39.98103047898974]
We present a suite of tools for designing confidence sequences for $thetastar$.
A CS is a sequence of confidence sets $(C_n)_n=1N$, that shrink in size, and all contain $thetastar$ simultaneously with high probability.
We then present Hoeffding- and empirical-Bernstein-type time-uniform CSs and fixed-time confidence intervals for sampling WoR.
arXiv Detail & Related papers (2020-06-08T04:30:25Z) - Debiased Off-Policy Evaluation for Recommendation Systems [8.63711086812655]
A/B tests are reliable, but are time- and money-consuming, and entail a risk of failure.
We develop an alternative method, which predicts the performance of algorithms given historical data.
Our method produces smaller mean squared errors than state-of-the-art methods.
arXiv Detail & Related papers (2020-02-20T02:30:02Z) - The Simulator: Understanding Adaptive Sampling in the
Moderate-Confidence Regime [52.38455827779212]
We propose a novel technique for analyzing adaptive sampling called the em Simulator.
We prove the first instance-based lower bounds the top-k problem which incorporate the appropriate log-factors.
Our new analysis inspires a simple and near-optimal for the best-arm and top-k identification, the first em practical of its kind for the latter problem.
arXiv Detail & Related papers (2017-02-16T23:42:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.