Active Sampling of Multiple Sources for Sequential Estimation
- URL: http://arxiv.org/abs/2208.05406v1
- Date: Wed, 10 Aug 2022 15:58:05 GMT
- Title: Active Sampling of Multiple Sources for Sequential Estimation
- Authors: Arpan Mukherjee, Ali Tajer, Pin-Yu Chen, Payel Das
- Abstract summary: The objective is to design an active sampling algorithm for sequentially estimating parameters in order to form reliable estimates.
This paper adopts conditional estimation cost functions, leading to a sequential estimation approach that was recently shown to render the analysis tractable.
- Score: 92.37271004438406
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Consider $K$ processes, each generating a sequence of independent and
identically distributed random variables. The probability measures of these processes have
random parameters that must be estimated. Specifically, they share a parameter
$\theta$ common to all probability measures. Additionally, each process
$i\in\{1, \dots, K\}$ has a private parameter $\alpha_i$. The objective is to
design an active sampling algorithm for sequentially estimating these
parameters in order to form reliable estimates for all shared and private
parameters with the fewest number of samples. This sampling algorithm has three
key components: (i)~data-driven sampling decisions, which dynamically specify over
time which of the $K$ processes should be selected for sampling;
(ii)~stopping time for the process, which specifies when the accumulated data
is sufficient to form reliable estimates and terminate the sampling process;
and (iii)~estimators for all shared and private parameters. Since this sequential
estimation problem is known to be analytically intractable, this paper adopts
\emph{conditional} estimation cost functions, leading to a sequential estimation
approach that was recently shown to render the analysis tractable.
Asymptotically optimal decision rules (sampling, stopping, and estimation) are
delineated, and numerical experiments are provided to compare the efficacy and
quality of the proposed procedure with those of the relevant approaches.
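To make the three components above concrete, here is a minimal, illustrative sketch in Python. It assumes a toy model that is not specified in the abstract: each process $i$ emits Gaussian observations with mean $\theta + \alpha_i$ and unit variance, the private effects are centered for identifiability, and the sampling and stopping rules are simple confidence-width heuristics rather than the paper's asymptotically optimal decision rules.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model (an assumption, not the paper's setting): process i emits
# N(theta + alpha_i, 1) samples; theta is shared, alpha_i is private, and
# the alpha_i are centered so the decomposition is identifiable.
K = 4
theta_true = 1.0
alpha_true = rng.normal(0.0, 0.5, size=K)
alpha_true -= alpha_true.mean()

samples = [[] for _ in range(K)]
half_width = 0.1      # target confidence half-width for every process mean
warmup = 5            # initial samples per process

def draw(i):
    samples[i].append(theta_true + alpha_true[i] + rng.normal())

for i in range(K):
    for _ in range(warmup):
        draw(i)

# (i) sampling rule: query the process whose mean is currently least certain;
# (ii) stopping rule: stop once every per-process half-width is below target.
while True:
    widths = np.array([1.96 / np.sqrt(len(s)) for s in samples])
    if widths.max() < half_width:
        break
    draw(int(np.argmax(widths)))

# (iii) estimators: shared parameter as the grand mean, private ones as offsets.
means = np.array([np.mean(s) for s in samples])
theta_hat = means.mean()
alpha_hat = means - theta_hat
print(f"stopped after {sum(len(s) for s in samples)} samples")
print("theta_hat =", round(theta_hat, 3), "alpha_hat =", np.round(alpha_hat, 3))
```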
Related papers
- Online Data Collection for Efficient Semiparametric Inference [41.49486724979923]
We present two online data collection policies, Explore-then-Commit and Explore-then-Greedy, that use the parameter estimates at a given time to optimally allocate the remaining budget in the future steps.
We prove that both policies achieve zero regret (assessed by MSE) relative to an oracle policy.
arXiv Detail & Related papers (2024-11-05T15:40:53Z)
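For the explore-then-commit idea in the entry above, the following is a generic sketch, not the paper's semiparametric procedure: a small exploration budget is split evenly across two hypothetical data sources, and the remaining budget is then committed via a Neyman-style allocation computed from the estimated noise levels.

```python
import numpy as np

rng = np.random.default_rng(1)

# Generic explore-then-commit sketch (not the paper's estimator): two sources
# with unknown noise levels; we want both means with small overall MSE.
true_means = np.array([0.0, 2.0])
true_stds = np.array([0.5, 3.0])   # source 1 is much noisier

budget = 1000
explore = 100                      # exploration phase: split evenly

def sample(source, n):
    return true_means[source] + true_stds[source] * rng.standard_normal(n)

data = [sample(s, explore // 2) for s in range(2)]

# Commit phase: allocation proportional to estimated std devs, which minimizes
# the summed variance of the mean estimates (ignoring, for simplicity, the
# exploration samples already collected).
std_hat = np.array([d.std(ddof=1) for d in data])
remaining = budget - explore
alloc = np.round(remaining * std_hat / std_hat.sum()).astype(int)

for s in range(2):
    data[s] = np.concatenate([data[s], sample(s, alloc[s])])

print("allocation after exploration:", alloc)
print("mean estimates:", [round(d.mean(), 3) for d in data])
```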
- Efficient Learning of POMDPs with Known Observation Model in Average-Reward Setting [56.92178753201331]
We propose the Observation-Aware Spectral (OAS) estimation technique, which enables the POMDP parameters to be learned from samples collected using a belief-based policy.
We show the consistency of the OAS procedure, and we prove a regret guarantee of order $\mathcal{O}(\sqrt{T \log(T)})$ for the proposed OAS-UCRL algorithm.
arXiv Detail & Related papers (2024-10-02T08:46:34Z)
- Non-Stochastic CDF Estimation Using Threshold Queries [3.6576781735746513]
We tackle the problem of estimating an empirical distribution in a setting with two challenges.
First, the algorithm does not directly observe the data; instead, it only asks a limited number of threshold queries about each sample.
Second, the data are not assumed to be independent and identically distributed; instead, we allow for an arbitrary process generating the samples.
arXiv Detail & Related papers (2023-01-13T18:00:57Z)
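As a point of reference for the setting in the entry above, the sketch below is a naive randomized-threshold baseline (an assumption, not the paper's algorithm): each sample is observed only through a single query "is x <= t?" at a randomly chosen grid threshold, and the per-threshold answer rates estimate the empirical CDF.

```python
import numpy as np

rng = np.random.default_rng(2)

# Naive baseline for the threshold-query setting (not the paper's algorithm):
# each sample x_i is seen only through one query "is x_i <= t?" at a random
# grid threshold; averaging the answers per threshold estimates the CDF there.
grid = np.linspace(-3, 3, 13)            # candidate thresholds
yes = np.zeros(len(grid))
asked = np.zeros(len(grid))

# An arbitrary (non-i.i.d.) stream: a slowly drifting Gaussian.
T = 20000
stream = rng.standard_normal(T) + np.linspace(-0.5, 0.5, T)

for x in stream:
    j = rng.integers(len(grid))          # one threshold query per sample
    asked[j] += 1
    yes[j] += float(x <= grid[j])

cdf_hat = yes / np.maximum(asked, 1)
cdf_emp = np.array([(stream <= t).mean() for t in grid])
print("max error vs empirical CDF:", round(np.abs(cdf_hat - cdf_emp).max(), 3))
```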
- Local policy search with Bayesian optimization [73.0364959221845]
Reinforcement learning aims to find an optimal policy by interaction with an environment.
Policy gradients for local search are often obtained from random perturbations.
We develop an algorithm utilizing a probabilistic model of the objective function and its gradient.
arXiv Detail & Related papers (2021-06-22T16:07:02Z)
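For the Bayesian-optimization entry above, a rough sketch of the general idea under stated assumptions: evaluations of a toy, noisy stand-in objective at random perturbations are fit with a Gaussian-process surrogate, and the parameters follow an ascent direction taken from the surrogate's posterior mean. This is only a generic illustration, not the authors' method.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(3)

# Toy stand-in for a policy-return function (assumption: a smooth 2-D objective
# observed with noise); the real setting would evaluate a policy in an environment.
def objective(theta):
    return -np.sum((theta - np.array([1.0, -2.0])) ** 2) + 0.1 * rng.standard_normal()

theta = np.zeros(2)
step, radius, n_perturb = 0.2, 0.5, 16

for it in range(30):
    # Evaluate random perturbations around the current parameters ...
    X = theta + radius * rng.standard_normal((n_perturb, 2))
    y = np.array([objective(x) for x in X])
    # ... fit a probabilistic surrogate of the local objective ...
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0),
                                  alpha=0.01, normalize_y=True).fit(X, y)
    # ... and step along a finite-difference gradient of its posterior mean.
    eps = 1e-3
    grad = np.array([
        (gp.predict([theta + eps * e])[0] - gp.predict([theta - eps * e])[0]) / (2 * eps)
        for e in np.eye(2)
    ])
    theta = theta + step * grad / (np.linalg.norm(grad) + 1e-12)

print("final parameters:", np.round(theta, 2))   # should approach (1, -2)
```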
- Machine Learning based optimization for interval uncertainty propagation with application to vibro-acoustic models [0.0]
Two non-intrusive uncertainty propagation approaches are proposed for the performance analysis of engineering systems.
One approach builds iteratively two distinct training datasets for evaluating separately the upper and lower bounds of the response variable.
The other builds iteratively a single training dataset.
arXiv Detail & Related papers (2021-06-21T15:57:11Z)
- Adaptive Sequential Design for a Single Time-Series [2.578242050187029]
We learn an optimal, unknown choice of the controlled components of a design in order to optimize the expected outcome.
We adapt the randomization mechanism for future time-point experiments based on the data collected on the individual over time.
arXiv Detail & Related papers (2021-01-29T22:51:45Z)
- Noise-Contrastive Estimation for Multivariate Point Processes [28.23193933174945]
We show how to apply a noise-contrastive estimation method with a less expensive objective.
For the model to achieve the same level of log-likelihood on held-out data, our method needs considerably fewer function evaluations and less wall-clock time.
arXiv Detail & Related papers (2020-11-02T04:09:33Z)
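The entry above applies noise-contrastive estimation to point processes; the sketch below shows the generic NCE mechanism on a much simpler, assumed toy model (an unnormalized Gaussian), not the paper's point-process objective: real samples are classified against samples from a known noise distribution, and maximizing the logistic objective recovers both the parameter and the log-normalizer.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm
from scipy.special import log_expit   # log(sigmoid(x)), numerically stable

rng = np.random.default_rng(4)

# Generic NCE sketch on a toy unnormalized model (not the paper's objective):
# learn the mean mu and log-normalizer c of p(x) proportional to
# exp(-(x - mu)^2 / 2 + c) by classifying data against known noise samples.
data = rng.normal(1.5, 1.0, size=2000)          # true mu = 1.5
noise = rng.normal(0.0, 3.0, size=2000)         # noise density is known
log_pn = lambda u: norm.logpdf(u, 0.0, 3.0)

def neg_nce(params):
    mu, c = params
    log_pm = lambda u: -0.5 * (u - mu) ** 2 + c  # unnormalized log-model
    g_data = log_pm(data) - log_pn(data)
    g_noise = log_pm(noise) - log_pn(noise)
    # logistic loss: data should be classified as "real", noise as "noise"
    return -(log_expit(g_data).sum() + log_expit(-g_noise).sum())

mu_hat, c_hat = minimize(neg_nce, x0=[0.0, 0.0]).x
print("mu_hat =", round(mu_hat, 3), " c_hat =", round(c_hat, 3),
      " (true log-normalizer =", round(-0.5 * np.log(2 * np.pi), 3), ")")
```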
- Adaptive Sampling for Best Policy Identification in Markov Decision Processes [79.4957965474334]
We investigate the problem of best-policy identification in discounted Markov Decision Processes (MDPs) when the learner has access to a generative model.
The advantages of state-of-the-art algorithms are discussed and illustrated.
arXiv Detail & Related papers (2020-09-28T15:22:24Z)
- A Provably Efficient Sample Collection Strategy for Reinforcement Learning [123.69175280309226]
One of the challenges in online reinforcement learning (RL) is that the agent needs to trade off the exploration of the environment and the exploitation of the samples to optimize its behavior.
We propose to tackle the exploration-exploitation problem following a decoupled approach composed of: 1) An "objective-specific" algorithm that prescribes how many samples to collect at which states, as if it has access to a generative model (i.e., sparse simulator of the environment); 2) An "objective-agnostic" sample collection responsible for generating the prescribed samples as fast as possible.
arXiv Detail & Related papers (2020-07-13T15:17:35Z)
- Sampling from a $k$-DPP without looking at all items [58.30573872035083]
Given a kernel function and a subset size $k$, our goal is to sample $k$ out of $n$ items with probability proportional to the determinant of the kernel matrix induced by the subset (a.k.a. $k$-DPP).
Existing $k$-DPP sampling algorithms require an expensive preprocessing step which involves multiple passes over all $n$ items, making it infeasible for large datasets.
We develop an algorithm which adaptively builds a sufficiently large uniform sample of data that is then used to efficiently generate a smaller set of $k$ items.
arXiv Detail & Related papers (2020-06-30T16:40:44Z)
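To illustrate the subsample-then-sample idea in the entry above, here is a naive sketch (not the paper's algorithm, which comes with accuracy guarantees): draw a uniform subset of the items, restrict the kernel to it, and run a simple swap-Metropolis chain whose stationary distribution is the $k$-DPP on that block.

```python
import numpy as np

rng = np.random.default_rng(5)

# Naive illustration of "subsample, then sample a k-DPP" (not the paper's
# algorithm): a uniform subset of the items is drawn and a swap-MCMC k-DPP
# sampler is run on the induced kernel block only.
n, m, k = 5000, 200, 10                     # all items, uniform subsample, target size
X = rng.standard_normal((n, 3))             # assumed item features defining the kernel
sub = rng.choice(n, size=m, replace=False)  # never touches the other n - m items

def kernel(idx):
    # RBF similarity kernel restricted to the chosen indices.
    D = X[idx, None, :] - X[None, idx, :]
    return np.exp(-0.5 * np.sum(D ** 2, axis=-1))

L = kernel(sub)

def logdet(S):
    return np.linalg.slogdet(L[np.ix_(S, S)])[1]

# Swap-MCMC over k-subsets of the subsample: accept a swap with probability
# min(1, det(L_new) / det(L_old)), whose stationary law is the k-DPP on L.
S = list(rng.choice(m, size=k, replace=False))
cur = logdet(S)
for _ in range(5000):
    i = rng.integers(k)                     # position in S to swap out
    j = int(rng.integers(m))                # candidate item from the subsample
    if j in S:
        continue
    proposal = S.copy()
    proposal[i] = j
    new = logdet(proposal)
    if np.log(rng.random()) < new - cur:
        S, cur = proposal, new

print("sampled items (original indices):", sorted(int(sub[s]) for s in S))
```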
This list is automatically generated from the titles and abstracts of the papers on this site.