On Efficient Online Imitation Learning via Classification
- URL: http://arxiv.org/abs/2209.12868v1
- Date: Mon, 26 Sep 2022 17:34:36 GMT
- Title: On Efficient Online Imitation Learning via Classification
- Authors: Yichen Li, Chicheng Zhang
- Abstract summary: We study classification-based online imitation learning (abbrev. $\textbf{COIL}$) and the fundamental feasibility of designing oracle-efficient regret-minimization algorithms.
Our work puts classification-based online imitation learning, an important IL setup, into a firmer foundation.
- Score: 17.416831207557603
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Imitation learning (IL) is a general learning paradigm for tackling
sequential decision-making problems. Interactive imitation learning, where
learners can interactively query for expert demonstrations, has been shown to
achieve provably superior sample efficiency guarantees compared with its
offline counterpart or reinforcement learning. In this work, we study
classification-based online imitation learning (abbrev. $\textbf{COIL}$) and
the fundamental feasibility to design oracle-efficient regret-minimization
algorithms in this setting, with a focus on the general nonrealizable case. We
make the following contributions: (1) we show that in the $\textbf{COIL}$
problem, any proper online learning algorithm cannot guarantee a sublinear
regret in general; (2) we propose $\textbf{Logger}$, an improper online
learning algorithmic framework, that reduces $\textbf{COIL}$ to online linear
optimization, by utilizing a new definition of mixed policy class; (3) we
design two oracle-efficient algorithms within the $\textbf{Logger}$ framework
that enjoy different sample and interaction round complexity tradeoffs, and
conduct finite-sample analyses to show their improvements over naive behavior
cloning; (4) we show that under the standard complexity-theoretic assumptions,
efficient dynamic regret minimization is infeasible in the $\textbf{Logger}$
framework. Our work puts classification-based online imitation learning, an
important IL setup, into a firmer foundation.
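As an illustration of contribution (2), reducing policy selection to online linear optimization over a mixed policy class can be sketched with a standard exponential-weights (Hedge) update over a finite set of base policies. This is a hypothetical simplification for intuition only, not the paper's $\textbf{Logger}$ framework; the function name, loss matrix, and learning rate are assumptions.

```python
import numpy as np

def hedge_mixed_policy(losses, eta=0.5):
    """Exponential-weights (Hedge) update over a finite base policy class.

    losses: (T, K) array; losses[t, k] is the imitation loss of base
    policy k at round t (e.g., its disagreement rate with the expert).
    Returns the sequence of mixed policies (distributions over the K
    base policies) played at each round. A mixed policy makes the
    learner improper: it plays a distribution over the class rather
    than a single member of it.
    """
    T, K = losses.shape
    weights = np.ones(K)
    mixtures = []
    for t in range(T):
        p = weights / weights.sum()          # mixed policy for round t
        mixtures.append(p)
        weights *= np.exp(-eta * losses[t])  # multiplicative update
    return np.array(mixtures)
```

Because the expected loss of a mixed policy is linear in the mixture weights, the update above is an instance of online linear optimization over the simplex, which is the shape of reduction the abstract describes.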
Related papers
- Efficient Methods for Non-stationary Online Learning [67.3300478545554]
We present efficient methods for optimizing dynamic regret and adaptive regret, which reduce the number of projections per round from $\mathcal{O}(\log T)$ to $1$.
Our technique hinges on the reduction mechanism developed in parameter-free online learning and requires non-trivial twists on non-stationary online methods.
arXiv Detail & Related papers (2023-09-16T07:30:12Z) - Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural-network(NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics by minimizing the population loss that are more suitable in active learning than the one used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z) - Stabilizing Q-learning with Linear Architectures for Provably Efficient
Learning [53.17258888552998]
This work proposes an exploration variant of the basic $Q$-learning protocol with linear function approximation.
We show that the performance of the algorithm degrades very gracefully under a novel and more permissive notion of approximation error.
arXiv Detail & Related papers (2022-06-01T23:26:51Z) - Learning for Spatial Branching: An Algorithm Selection Approach [0.0]
We develop a learning framework for branching and show its efficacy in the context of non-linear optimization problems.
The proposed learning is performed offline, based on instance-specific features and with no computational overhead when solving new instances.
Experiments on different benchmark instances from the literature show that the learning-based branching rule significantly outperforms the standard rules.
arXiv Detail & Related papers (2022-04-22T17:23:43Z) - PDE-Based Optimal Strategy for Unconstrained Online Learning [40.61498562988079]
We present a framework that generates time-varying potential functions by solving a Partial Differential Equation (PDE).
Our framework recovers some classical potentials, and more importantly provides a systematic approach to design new ones.
This is the first parameter-free algorithm with optimal leading constant.
arXiv Detail & Related papers (2022-01-19T22:21:21Z) - Simple Stochastic and Online Gradient Descent Algorithms for Pairwise
Learning [65.54757265434465]
Pairwise learning refers to learning tasks where the loss function depends on a pair of instances.
Online gradient descent (OGD) is a popular approach to handle streaming data in pairwise learning.
In this paper, we propose simple stochastic and online gradient descent methods for pairwise learning.
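A minimal sketch of online gradient descent for pairwise learning: each new example is paired with the previous one and a pairwise hinge loss penalizes scoring a positive example below a negative one (as in AUC maximization). This is a hypothetical simplification; the paper's methods pair against past data more carefully, and the function name and pairing scheme here are assumptions.

```python
import numpy as np

def pairwise_ogd(X, y, eta=0.1):
    """Online gradient descent with a pairwise hinge loss.

    At round t, example (X[t], y[t]) is paired with the previous
    example; whenever their labels differ, a hinge loss
    max(0, 1 - w @ (pos - neg)) penalizes ranking the positive
    example below the negative one, and we take one gradient step.
    """
    n, d = X.shape
    w = np.zeros(d)
    for t in range(1, n):
        if y[t] != y[t - 1]:
            # orient the pair so `pos` carries label +1
            pos, neg = (X[t], X[t - 1]) if y[t] == 1 else (X[t - 1], X[t])
            margin = w @ (pos - neg)
            if margin < 1:                  # hinge loss is active
                w += eta * (pos - neg)      # gradient step on -loss
    return w
```

Pairing only consecutive examples keeps the per-round cost constant; streaming methods of this kind typically trade that simplicity against pairing with a buffer of past examples.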
arXiv Detail & Related papers (2021-11-23T18:10:48Z) - Online Adversarial Attacks [57.448101834579624]
We formalize the online adversarial attack problem, emphasizing two key elements found in real-world use-cases.
We first rigorously analyze a deterministic variant of the online threat model.
We then propose algoname, a simple yet practical algorithm yielding a provably better competitive ratio for $k=2$ over the current best single threshold algorithm.
arXiv Detail & Related papers (2021-03-02T20:36:04Z) - Fast Few-Shot Classification by Few-Iteration Meta-Learning [173.32497326674775]
We introduce a fast optimization-based meta-learning method for few-shot classification.
Our strategy enables important aspects of the base learner objective to be learned during meta-training.
We perform a comprehensive experimental analysis, demonstrating the speed and effectiveness of our approach.
arXiv Detail & Related papers (2020-10-01T15:59:31Z) - Online Passive-Aggressive Total-Error-Rate Minimization [1.370633147306388]
We provide a new online learning algorithm which utilizes online passive-aggressive learning (PA) and total-error-rate minimization (TER) for binary classification.
Experimental results demonstrate that the proposed PATER algorithms achieve better performance, in terms of efficiency and effectiveness, than existing state-of-the-art online learning algorithms on real-world data sets.
arXiv Detail & Related papers (2020-02-05T13:10:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.