Active Imitation Learning with Noisy Guidance
- URL: http://arxiv.org/abs/2005.12801v1
- Date: Tue, 26 May 2020 15:35:46 GMT
- Title: Active Imitation Learning with Noisy Guidance
- Authors: Kianté Brantley, Amr Sharaf, Hal Daumé III
- Abstract summary: Imitation learning algorithms provide state-of-the-art results on many structured prediction tasks.
Such algorithms assume training-time access to an expert that can provide the optimal action at any queried state.
We consider an active learning setting in which the learning algorithm has additional access to a much cheaper noisy heuristic that provides noisy guidance.
- Score: 6.832341432995627
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Imitation learning algorithms provide state-of-the-art results on many
structured prediction tasks by learning near-optimal search policies. Such
algorithms assume training-time access to an expert that can provide the
optimal action at any queried state; unfortunately, the number of such queries
is often prohibitive, frequently rendering these approaches impractical. To
combat this query complexity, we consider an active learning setting in which
the learning algorithm has additional access to a much cheaper noisy heuristic
that provides noisy guidance. Our algorithm, LEAQI, learns a difference
classifier that predicts when the expert is likely to disagree with the
heuristic, and queries the expert only when necessary. We apply LEAQI to three
sequence labeling tasks, demonstrating significantly fewer queries to the
expert and comparable (or better) accuracies over a passive approach.
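The abstract describes LEAQI's core mechanism: a learned difference classifier gates when the expensive expert is consulted versus the cheap heuristic. The sketch below illustrates that query rule only; the function and parameter names (and the probability threshold) are illustrative assumptions, not the paper's actual implementation.

```python
def leaqi_style_query(state, difference_classifier, heuristic, expert,
                      threshold=0.5):
    """Sketch of a LEAQI-style query rule: use the cheap heuristic by
    default, and pay for an expert query only when the learned difference
    classifier predicts the expert is likely to disagree."""
    p_disagree = difference_classifier(state)  # estimated P(expert != heuristic | state)
    if p_disagree >= threshold:
        # Predicted disagreement: the heuristic is probably wrong here,
        # so spend an expensive expert query.
        return expert(state), True
    # Predicted agreement: the cheap, possibly noisy heuristic suffices.
    return heuristic(state), False
```

In the full algorithm the difference classifier is itself trained online from the states where both labels are observed; this sketch omits that training loop.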
Related papers
- Selective Sampling and Imitation Learning via Online Regression [17.73844193143454]
We consider the problem of Imitation Learning (IL) by actively querying a noisy expert for feedback.
We provide an interactive algorithm for IL that uses selective sampling to actively query the noisy expert for feedback.
arXiv Detail & Related papers (2023-07-11T03:32:20Z)
- Interpretable Anomaly Detection via Discrete Optimization [1.7150329136228712]
We propose a framework for learning inherently interpretable anomaly detectors from sequential data.
We show that this problem is computationally hard and develop two learning algorithms based on constraint optimization.
Using a prototype implementation, we demonstrate that our approach shows promising results in terms of accuracy and F1 score.
arXiv Detail & Related papers (2023-03-24T16:19:15Z)
- Non-Clairvoyant Scheduling with Predictions Revisited [77.86290991564829]
In non-clairvoyant scheduling, the task is to find an online strategy for scheduling jobs with a priori unknown processing requirements.
We revisit this well-studied problem in a recently popular learning-augmented setting that integrates (untrusted) predictions in algorithm design.
We show that these predictions have desired properties, admit a natural error measure as well as algorithms with strong performance guarantees.
arXiv Detail & Related papers (2022-02-21T13:18:11Z)
- Machine Learning for Online Algorithm Selection under Censored Feedback [71.6879432974126]
In online algorithm selection (OAS), instances of an algorithmic problem class are presented to an agent one after another, and the agent has to quickly select a presumably best algorithm from a fixed set of candidate algorithms.
For decision problems such as satisfiability (SAT), quality typically refers to the algorithm's runtime.
In this work, we revisit multi-armed bandit algorithms for OAS and discuss their capability of dealing with the problem.
We adapt them towards runtime-oriented losses, allowing for partially censored data while keeping a space- and time-complexity independent of the time horizon.
arXiv Detail & Related papers (2021-09-13T18:10:52Z)
- Low-Regret Active Learning [64.36270166907788]
We develop an online learning algorithm for identifying unlabeled data points that are most informative for training.
At the core of our work is an efficient algorithm for sleeping experts that is tailored to achieve low regret on predictable (easy) instances.
arXiv Detail & Related papers (2021-04-06T22:53:45Z)
- Learning Halfspaces With Membership Queries [0.0]
In some cases, active learning can yield an exponential gain in the number of samples the algorithm needs to see.
We show that the algorithm works well in practice, and significantly outperforms uncertainty sampling.
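The summary above compares against uncertainty sampling, a standard active-learning baseline: query the point the current model is least certain about. A minimal sketch of that baseline, with illustrative names (the paper's own algorithm is different):

```python
def uncertainty_sample(points, score):
    """Pick the unlabeled point whose predicted probability of the positive
    class is closest to 0.5, i.e. where the current model is least certain.
    `score` maps a point to a probability in [0, 1]."""
    return min(points, key=lambda x: abs(score(x) - 0.5))
```

Membership-query algorithms go further: rather than choosing among existing unlabeled points, they may synthesize arbitrary query points, which is what enables the exponential sample-complexity gains mentioned above.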
arXiv Detail & Related papers (2020-12-20T18:02:47Z)
- Learning to Actively Learn: A Robust Approach [22.75298609290053]
This work proposes a procedure for designing algorithms for adaptive data collection tasks like active learning and pure-exploration multi-armed bandits.
Our adaptive algorithm is learned via adversarial training over equivalence classes of problems derived from information theoretic lower bounds.
We perform synthetic experiments to justify the stability and effectiveness of the training procedure, and then evaluate the method on tasks derived from real data.
arXiv Detail & Related papers (2020-10-29T06:48:22Z)
- Instance-Dependent Complexity of Contextual Bandits and Reinforcement Learning: A Disagreement-Based Perspective [104.67295710363679]
In the classical multi-armed bandit problem, instance-dependent algorithms attain improved performance on "easy" problems with a gap between the best and second-best arm.
We introduce a family of complexity measures that are both sufficient and necessary to obtain instance-dependent regret bounds.
We then introduce new oracle-efficient algorithms which adapt to the gap whenever possible, while also attaining the minimax rate in the worst case.
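The "gap" mentioned above is the difference Δ between the mean rewards of the best and second-best arms. The classic contrast is between an instance-dependent regret bound of order log(T)/Δ, which is small when the gap is large, and the gap-independent minimax rate of order √T. The comparison below is illustrative only, with constants omitted:

```python
import math

def gap_dependent_regret_bound(horizon, gap):
    """Instance-dependent bound for a 2-armed bandit, O(log T / gap):
    tight on 'easy' instances where the best arm is well separated."""
    return math.log(horizon) / gap

def minimax_regret_bound(horizon):
    """Worst-case (gap-independent) bound, O(sqrt(T))."""
    return math.sqrt(horizon)
```

For a large gap the instance-dependent bound is far below the minimax rate, while for a vanishing gap it exceeds it, which is why the algorithms above aim to adapt to the gap when possible while still matching the minimax rate in the worst case.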
arXiv Detail & Related papers (2020-10-07T01:33:06Z)
- Run2Survive: A Decision-theoretic Approach to Algorithm Selection based on Survival Analysis [75.64261155172856]
Survival analysis (SA) naturally supports censored data and offers appropriate ways to use such data for learning distributional models of algorithm runtime.
We leverage such models as a basis of a sophisticated decision-theoretic approach to algorithm selection, which we dub Run2Survive.
In an extensive experimental study with the standard benchmark ASlib, our approach is shown to be highly competitive and in many cases even superior to state-of-the-art AS approaches.
arXiv Detail & Related papers (2020-07-06T15:20:17Z)
- Beyond Worst-Case Analysis in Stochastic Approximation: Moment Estimation Improves Instance Complexity [58.70807593332932]
We study the oracle complexity of gradient-based methods for stochastic approximation problems.
We focus on instance-dependent complexity instead of worst case complexity.
Our proposed algorithm and its analysis provide a theoretical justification for the success of moment estimation.
arXiv Detail & Related papers (2020-06-08T09:25:47Z)
- Cyclic Boosting -- an explainable supervised machine learning algorithm [0.0]
We propose the novel "Cyclic Boosting" machine learning algorithm.
It efficiently performs accurate regression and classification while providing a detailed understanding of how each individual prediction was made.
arXiv Detail & Related papers (2020-02-09T18:52:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.