Approximation Algorithms for Active Sequential Hypothesis Testing
- URL: http://arxiv.org/abs/2103.04250v1
- Date: Sun, 7 Mar 2021 03:49:19 GMT
- Title: Approximation Algorithms for Active Sequential Hypothesis Testing
- Authors: Kyra Gan, Su Jia, Andrew Li
- Abstract summary: This paper provides the first approximation algorithms for active sequential hypothesis testing.
We numerically investigate the performance of our algorithms using both synthetic and real data, showing that our algorithms outperform a previously proposed policy.
- Score: 3.840659854046464
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the problem of active sequential hypothesis testing (ASHT), a learner
seeks to identify the true hypothesis $h^*$ from among a set of hypotheses $H$.
The learner is given a set of actions and knows the outcome distribution of any
action under any true hypothesis. While repeatedly playing the entire set of
actions suffices to identify $h^*$, a cost is incurred with each action. Thus,
given a target error $\delta>0$, the goal is to find the minimal cost policy
for sequentially selecting actions that identify $h^*$ with probability at
least $1 - \delta$.
This paper provides the first approximation algorithms for ASHT, under two
types of adaptivity. First, a policy is partially adaptive if it fixes a
sequence of actions in advance and adaptively decides when to terminate and
what hypothesis to return. Under partial adaptivity, we provide an
$O\big(s^{-1}(1+\log_{1/\delta}|H|)\log (s^{-1}|H| \log
|H|)\big)$-approximation algorithm, where $s$ is a natural separation parameter
between the hypotheses. Second, a policy is fully adaptive if action selection
is allowed to depend on previous outcomes. Under full adaptivity, we provide an
$O(s^{-1}\log (|H|/\delta)\log |H|)$-approximation algorithm. We numerically
investigate the performance of our algorithms using both synthetic and
real-world data, showing that our algorithms outperform a previously proposed
heuristic policy.
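To make the partially adaptive setting concrete, the sketch below (an illustrative toy, not the paper's algorithm; the round-robin action order, the uniform prior, and all names are assumptions) plays a fixed sequence of actions, maintains a Bayesian posterior over the hypotheses using the known outcome distributions, and stops once some hypothesis reaches posterior mass at least $1-\delta$. Only the stopping time and the returned hypothesis depend on the observed outcomes, which is exactly what partial adaptivity permits.

```python
import random

def partially_adaptive_asht(hypotheses, actions, outcome_dist, true_h,
                            delta=0.05, max_steps=10_000, rng=None):
    """Toy partially adaptive policy for ASHT (illustrative, not the
    paper's algorithm).

    outcome_dist[h][a] maps each outcome to its probability when action
    `a` is played and `h` is the true hypothesis; as in the ASHT model,
    these distributions are known to the learner. The action sequence
    is fixed in advance (round-robin); only stopping is adaptive.
    Returns (guessed hypothesis, number of actions played).
    """
    rng = rng or random.Random(0)
    # Uniform prior over hypotheses (an assumption of this sketch).
    posterior = {h: 1.0 / len(hypotheses) for h in hypotheses}
    cost = 0
    for t in range(max_steps):
        a = actions[t % len(actions)]  # fixed, non-adaptive sequence
        dist = outcome_dist[true_h][a]
        outcome = rng.choices(list(dist), weights=list(dist.values()))[0]
        cost += 1  # unit cost per action in this sketch
        # Bayes update using the known outcome distributions.
        for h in hypotheses:
            posterior[h] *= outcome_dist[h][a].get(outcome, 0.0)
        total = sum(posterior.values())
        posterior = {h: p / total for h, p in posterior.items()}
        best = max(posterior, key=posterior.get)
        if posterior[best] >= 1 - delta:  # adaptive stopping rule
            return best, cost
    return max(posterior, key=posterior.get), cost
```

For example, with two hypotheses corresponding to a coin biased toward 0 or toward 1 and a single action that flips it, the policy keeps flipping until the posterior is $(1-\delta)$-confident, then returns its guess. A fully adaptive policy would instead be free to choose the next action as a function of the outcomes seen so far.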
Related papers
- A Competitive Algorithm for Agnostic Active Learning [5.4579367210379335]
Most popular algorithms for active learning express their performance in terms of a parameter called the disagreement coefficient.
We get an algorithm that is competitive with the optimal algorithm for any binary hypothesis class $H$ and distribution $D_X$ over $X$.
It is NP-hard to do better than our algorithm's $O(\log |H|)$ overhead in general.
arXiv Detail & Related papers (2023-10-28T19:01:16Z) - Near Sample-Optimal Reduction-based Policy Learning for Average Reward
MDP [58.13930707612128]
This work considers the sample complexity of obtaining an $\varepsilon$-optimal policy in an average reward Markov Decision Process (AMDP).
We prove an upper bound of $\widetilde O(H \varepsilon^{-3} \ln \frac{1}{\delta})$ samples per state-action pair, where $H := \mathrm{sp}(h^*)$ is the span of the bias of any optimal policy, $\varepsilon$ is the accuracy, and $\delta$ is the failure probability.
arXiv Detail & Related papers (2022-12-01T15:57:58Z) - Minimum Cost Adaptive Submodular Cover [4.680686256929023]
We consider the problem of minimum cost cover of adaptive-submodular functions.
We show that our algorithm simultaneously achieves a $(p+1)^{p+1}\cdot(\ln Q+1)^{p}$ approximation guarantee for all $p\ge 1$.
arXiv Detail & Related papers (2022-08-17T15:26:47Z) - Best Policy Identification in Linear MDPs [70.57916977441262]
We investigate the problem of best policy identification in discounted linear Markov Decision Processes in the fixed confidence setting under a generative model.
The lower bound, obtained as the solution of an intricate non-convex optimization program, can be used as the starting point to devise such algorithms.
arXiv Detail & Related papers (2022-08-11T04:12:50Z) - Nearly Optimal Policy Optimization with Stable at Any Time Guarantee [53.155554415415445]
The regret bound of the policy-based method in Shani et al. (2020) is only $\tilde O(\sqrt{SAH^3K} + \sqrt{AH^4K})$, where $S$ is the number of states, $A$ is the number of actions, $H$ is the horizon, and $K$ is the number of episodes; there is a $\sqrt{SH}$ gap compared with the information-theoretic lower bound $\tilde\Omega(\sqrt{SAH^3K})$.
arXiv Detail & Related papers (2021-12-21T01:54:17Z) - Private Stochastic Convex Optimization: Optimal Rates in $\ell_1$
Geometry [69.24618367447101]
Up to logarithmic factors, the optimal excess population loss of any $(\varepsilon,\delta)$-differentially private algorithm is $\sqrt{\log(d)/n} + \sqrt{d}/(\varepsilon n)$.
We show that when the loss functions satisfy additional smoothness assumptions, the excess loss is upper bounded (up to logarithmic factors) by $\sqrt{\log(d)/n} + (\log(d)/(\varepsilon n))^{2/3}$.
arXiv Detail & Related papers (2021-03-02T06:53:44Z) - Improved Sample Complexity for Incremental Autonomous Exploration in
MDPs [132.88757893161699]
We learn the set of $\epsilon$-optimal goal-conditioned policies attaining all states that are incrementally reachable within $L$ steps.
DisCo is the first algorithm that can return an $\epsilon/c_{\min}$-optimal policy for any cost-sensitive shortest-path problem.
arXiv Detail & Related papers (2020-12-29T14:06:09Z) - Accelerating Optimization and Reinforcement Learning with
Quasi-Stochastic Approximation [2.294014185517203]
This paper sets out to extend convergence theory to quasi-stochastic approximations.
It is illustrated with applications to gradient-free optimization and policy gradient algorithms for reinforcement learning.
arXiv Detail & Related papers (2020-09-30T04:44:45Z) - Locally Private Hypothesis Selection [96.06118559817057]
We output a distribution from $\mathcal{Q}$ whose total variation distance to $p$ is comparable to the best such distribution.
We show that the constraint of local differential privacy incurs an exponential increase in cost.
Our algorithms result in exponential improvements on the round complexity of previous methods.
arXiv Detail & Related papers (2020-02-21T18:30:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.