Towards a Law of Iterated Expectations for Heuristic Estimators
- URL: http://arxiv.org/abs/2410.01290v1
- Date: Wed, 2 Oct 2024 07:33:27 GMT
- Title: Towards a Law of Iterated Expectations for Heuristic Estimators
- Authors: Paul Christiano, Jacob Hilton, Andrea Lincoln, Eric Neyman, Mark Xu,
- Abstract summary: In this work, we argue for the informal principle that a heuristic estimator ought not to be able to predict its own errors.
We argue that an ideal estimator ought to satisfy two stronger properties in this vein.
As an alternative approach, we explore *accuracy*: a property that states that $\mathbb{G}(Y \mid \pi)$ has zero average error over a distribution of mathematical expressions.
- Score: 10.287238241371922
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Christiano et al. (2022) define a *heuristic estimator* to be a hypothetical algorithm that estimates the values of mathematical expressions from arguments. In brief, a heuristic estimator $\mathbb{G}$ takes as input a mathematical expression $Y$ and a formal "heuristic argument" $\pi$, and outputs an estimate $\mathbb{G}(Y \mid \pi)$ of $Y$. In this work, we argue for the informal principle that a heuristic estimator ought not to be able to predict its own errors, and we explore approaches to formalizing this principle. Most simply, the principle suggests that $\mathbb{G}(Y - \mathbb{G}(Y \mid \pi) \mid \pi)$ ought to equal zero for all $Y$ and $\pi$. We argue that an ideal heuristic estimator ought to satisfy two stronger properties in this vein, which we term *iterated estimation* (by analogy to the law of iterated expectations) and *error orthogonality*. Although iterated estimation and error orthogonality are intuitively appealing, it can be difficult to determine whether a given heuristic estimator satisfies the properties. As an alternative approach, we explore *accuracy*: a property that (roughly) states that $\mathbb{G}$ has zero average error over a distribution of mathematical expressions. However, in the context of two estimation problems, we demonstrate barriers to creating an accurate heuristic estimator. We finish by discussing challenges and potential paths forward for finding a heuristic estimator that accords with our intuitive understanding of how such an estimator ought to behave, as well as the potential applications of heuristic estimators to understanding the behavior of neural networks.
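The estimator $\mathbb{G}$ above is hypothetical, so it cannot be implemented directly. As a rough analogy only, assuming nothing beyond the ordinary law of iterated expectations, the sketch below checks numerically that the error $Y - \mathbb{E}[Y \mid X]$ of a conditional expectation has zero mean and is uncorrelated with functions of $X$: the probabilistic counterpart of "an estimator should not be able to predict its own errors". All variable names are illustrative and do not come from the paper.

```python
import numpy as np

# Toy probabilistic analogue of the principle that an estimator should not be
# able to predict its own errors. For the conditional expectation E[Y | X],
# the law of iterated expectations gives E[Y - E[Y | X]] = 0, and the error
# Y - E[Y | X] is uncorrelated with every function of X. The paper's heuristic
# estimator G is hypothetical; this only illustrates the analogy it draws.

rng = np.random.default_rng(0)
n = 1_000_000

x = rng.normal(size=n)             # side information (loose analogue of the argument pi)
y = x**2 + rng.normal(size=n)      # quantity being estimated
cond_mean = x**2                   # E[Y | X] is known exactly here: E[Y | X] = X^2

error = y - cond_mean

print(error.mean())                # ~ 0: the average error vanishes
print((error * x).mean())          # ~ 0: error uncorrelated with X ...
print((error * np.sin(x)).mean())  # ~ 0: ... and with other functions of X
print((error ** 2).mean())         # ~ 1: yet the error itself is far from zero
```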
Related papers
- A Theory of Interpretable Approximations [61.90216959710842]
We study the idea of approximating a target concept $c$ by a small aggregation of concepts from some base class $\mathcal{H}$.
For any given pair of $\mathcal{H}$ and $c$, exactly one of these cases holds: (i) $c$ cannot be approximated by $\mathcal{H}$ with arbitrary accuracy.
We show that, in the case of interpretable approximations, even a slightly nontrivial a priori guarantee on the complexity of approximations implies approximations with constant (distribution-free and accuracy-independent) complexity.
arXiv Detail & Related papers (2024-06-15T06:43:45Z) - Robust Sparse Mean Estimation via Incremental Learning [15.536082641659423]
In this paper, we study the problem of robust mean estimation, where the goal is to estimate a $k$-sparse mean from a collection of partially corrupted samples.
We present a simple mean estimator that overcomes both challenges under moderate conditions.
Our method does not need any prior knowledge of the sparsity level $k$.
arXiv Detail & Related papers (2023-05-24T16:02:28Z) - Formalizing the presumption of independence [2.658812114255374]
A key ingredient in such reasoning is the use of a "default" estimate of $\mathbb{E}[XY] = \mathbb{E}[X]\,\mathbb{E}[Y]$; a small numerical illustration of this default appears after this list.
Reasoning based on this is commonplace, intuitively compelling, and often quite successful -- but completely informal.
We present our main open problem: is there an estimator that formalizes intuitively valid applications of the presumption of independence without also accepting spurious arguments?
arXiv Detail & Related papers (2022-11-12T20:28:19Z) - Minimax Optimal Quantization of Linear Models: Information-Theoretic
Limits and Efficient Algorithms [59.724977092582535]
We consider the problem of quantizing a linear model learned from measurements.
We derive an information-theoretic lower bound for the minimax risk under this setting.
We show that our method and upper-bounds can be extended for two-layer ReLU neural networks.
arXiv Detail & Related papers (2022-02-23T02:39:04Z) - Robust Estimation for Random Graphs [47.07886511972046]
We study the problem of robustly estimating the parameter $p$ of an Erdős-Rényi random graph on $n$ nodes.
We give an inefficient algorithm with similar accuracy for all $\gamma < 1/2$, the information-theoretic limit.
arXiv Detail & Related papers (2021-11-09T18:43:25Z) - Probabilistic Iterative Methods for Linear Systems [5.457842083043013]
We show that a standard iterative method on $\mathbb{R}^d$ can be lifted to act on the space of probability distributions $\mathcal{P}(\mathbb{R}^d)$.
The output of the iterative methods proposed in this paper is, instead, a sequence of probability distributions $\mu_m \in \mathcal{P}(\mathbb{R}^d)$.
arXiv Detail & Related papers (2020-12-23T11:55:57Z) - Optimal Mean Estimation without a Variance [103.26777953032537]
We study the problem of heavy-tailed mean estimation in settings where the variance of the data-generating distribution does not exist.
We design an estimator which attains the smallest possible confidence interval as a function of $n, d, \delta$.
arXiv Detail & Related papers (2020-11-24T22:39:21Z) - Optimal Sub-Gaussian Mean Estimation in $\mathbb{R}$ [5.457150493905064]
We present a novel estimator with sub-Gaussian convergence.
Our estimator does not require prior knowledge of the variance.
Our estimator construction and analysis gives a framework generalizable to other problems.
arXiv Detail & Related papers (2020-11-17T02:47:24Z) - Learning Minimax Estimators via Online Learning [55.92459567732491]
We consider the problem of designing minimax estimators for estimating parameters of a probability distribution.
We construct an algorithm for finding a mixed-strategy Nash equilibrium.
arXiv Detail & Related papers (2020-06-19T22:49:42Z) - GO Hessian for Expectation-Based Objectives [73.06986780804269]
The GO gradient was proposed recently for expectation-based objectives $\mathbb{E}_{q_{\boldsymbol{\gamma}}(\boldsymbol{y})}[f(\boldsymbol{y})]$.
Based on the GO gradient, we present for $\mathbb{E}_{q_{\boldsymbol{\gamma}}(\boldsymbol{y})}[f(\boldsymbol{y})]$ an unbiased low-variance Hessian estimator, named GO Hessian.
arXiv Detail & Related papers (2020-06-16T02:20:41Z)
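The entry on "Formalizing the presumption of independence" above turns on the default estimate $\mathbb{E}[XY] = \mathbb{E}[X]\,\mathbb{E}[Y]$. The sketch below is a self-contained illustration of that default, not code from either paper: when $X$ and $Y$ are deliberately correlated, the default misses the true value by exactly $\mathrm{Cov}(X, Y)$. All names and parameters are made up for the example.

```python
import numpy as np

# Illustration of the "presumption of independence" default estimate
# E[XY] ~= E[X] * E[Y]. When X and Y are correlated, the default is off
# by exactly Cov(X, Y). (Not code from either paper; purely illustrative.)

rng = np.random.default_rng(1)
n = 1_000_000

x = rng.normal(loc=1.0, scale=1.0, size=n)
y = 0.5 * x + rng.normal(loc=2.0, scale=1.0, size=n)   # correlated with x

default_estimate = x.mean() * y.mean()    # presumes independence: E[X] * E[Y]
empirical_value = (x * y).mean()          # empirical E[XY]

print(default_estimate)   # ~ 2.5, since E[X] = 1 and E[Y] = 2.5
print(empirical_value)    # ~ 3.0, the gap of ~0.5 is Cov(X, Y) = 0.5 * Var(X)
```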
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.