Imitation Learning from Observations: An Autoregressive Mixture of Experts Approach
- URL: http://arxiv.org/abs/2411.08232v1
- Date: Tue, 12 Nov 2024 22:56:28 GMT
- Title: Imitation Learning from Observations: An Autoregressive Mixture of Experts Approach
- Authors: Renzi Wang, Flavia Sofia Acerbo, Tong Duy Son, Panagiotis Patrinos
- Abstract summary: This paper presents a novel approach to imitation learning from observations, where an autoregressive mixture of experts model is deployed to fit the underlying policy.
The effectiveness of the proposed framework is validated using two autonomous driving datasets collected from human demonstrations.
- Score: 2.4427666827706074
- Abstract: This paper presents a novel approach to imitation learning from observations, where an autoregressive mixture of experts model is deployed to fit the underlying policy. The parameters of the model are learned via a two-stage framework. By leveraging the existing dynamics knowledge, the first stage of the framework estimates the control input sequences and hence reduces the problem complexity. At the second stage, the policy is learned by solving a regularized maximum-likelihood estimation problem using the estimated control input sequences. We further extend the learning procedure by incorporating a Lyapunov stability constraint to ensure asymptotic stability of the identified model, for accurate multi-step predictions. The effectiveness of the proposed framework is validated using two autonomous driving datasets collected from human demonstrations, demonstrating its practical applicability in modelling complex nonlinear dynamics.
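The second stage described in the abstract lends itself to a compact illustration. Below is a minimal, hypothetical sketch (written in PyTorch, which the paper does not necessarily use) of fitting a Gaussian mixture-of-experts policy to pairs of observation histories and stage-one control-input estimates by regularized maximum likelihood. All class names, dimensions, and hyperparameters are illustrative assumptions; the paper's exact model parametrization, regularizer, and Lyapunov stability constraint are not reproduced here.

```python
# Hypothetical sketch of the second learning stage: maximize a Gaussian
# mixture-of-experts likelihood over estimated control inputs. Names and
# hyperparameters are illustrative assumptions, not the paper's implementation.
import torch
import torch.nn as nn

class AutoregressiveMoEPolicy(nn.Module):
    def __init__(self, obs_dim: int, ctrl_dim: int, history: int, n_experts: int):
        super().__init__()
        in_dim = obs_dim * history                               # stacked observation history
        self.gate = nn.Linear(in_dim, n_experts)                 # mixture (gating) weights
        self.experts = nn.Linear(in_dim, n_experts * ctrl_dim)   # one affine expert per component
        self.log_std = nn.Parameter(torch.zeros(n_experts, ctrl_dim))
        self.n_experts, self.ctrl_dim = n_experts, ctrl_dim

    def log_likelihood(self, obs_hist, ctrl):
        # obs_hist: (batch, history * obs_dim), ctrl: (batch, ctrl_dim)
        log_w = torch.log_softmax(self.gate(obs_hist), dim=-1)                  # (batch, K)
        means = self.experts(obs_hist).view(-1, self.n_experts, self.ctrl_dim)  # (batch, K, D)
        dist = torch.distributions.Normal(means, self.log_std.exp())
        log_p = dist.log_prob(ctrl.unsqueeze(1)).sum(-1)                        # (batch, K)
        return torch.logsumexp(log_w + log_p, dim=-1)                           # mixture log-likelihood

# Usage: minimize a regularized negative log-likelihood on stage-one control estimates.
policy = AutoregressiveMoEPolicy(obs_dim=4, ctrl_dim=2, history=3, n_experts=5)
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
obs_hist = torch.randn(64, 4 * 3)   # stand-in for observation histories
ctrl_hat = torch.randn(64, 2)       # stand-in for estimated control input sequences
loss = -policy.log_likelihood(obs_hist, ctrl_hat).mean() \
       + 1e-4 * sum(p.pow(2).sum() for p in policy.parameters())  # generic L2 regularizer
loss.backward()
opt.step()
```

In the paper, this maximum-likelihood problem is additionally constrained so that the identified autoregressive model is asymptotically stable for accurate multi-step prediction; enforcing such a Lyapunov condition would add a constraint on the expert parameters that this sketch omits.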
Related papers
- Model-free Methods for Event History Analysis and Efficient Adjustment (PhD Thesis) [55.2480439325792]
This thesis is a series of independent contributions to statistics unified by a model-free perspective.
The first chapter elaborates on how a model-free perspective can be used to formulate flexible methods that leverage prediction techniques from machine learning.
The second chapter studies the concept of local independence, which describes whether the evolution of one process is directly influenced by another.
arXiv Detail & Related papers (2025-02-11T19:24:09Z) - Two-way Deconfounder for Off-policy Evaluation in Causal Reinforcement Learning [17.49812746668935]
We propose a two-way unmeasured confounding assumption to model the system dynamics in causal reinforcement learning.
We develop a two-way deconfounder algorithm that devises a neural tensor network to simultaneously learn both the unmeasured confounders and the system dynamics.
arXiv Detail & Related papers (2024-12-08T02:28:58Z) - The Edge-of-Reach Problem in Offline Model-Based Reinforcement Learning [31.8260779160424]
We investigate how popular algorithms perform as the learned dynamics model is improved.
We propose Reach-Aware Learning (RAVL), a simple and robust method that directly addresses the edge-of-reach problem.
arXiv Detail & Related papers (2024-02-19T20:38:00Z) - Distributionally Robust Model-based Reinforcement Learning with Large
State Spaces [55.14361269378122]
Three major challenges in reinforcement learning are the complex dynamical systems with large state spaces, the costly data acquisition processes, and the deviation of real-world dynamics from the training environment deployment.
We study distributionally robust Markov decision processes with continuous state spaces under the widely used Kullback-Leibler, chi-square, and total variation uncertainty sets.
We propose a model-based approach that utilizes Gaussian Processes and the maximum variance reduction algorithm to efficiently learn multi-output nominal transition dynamics.
arXiv Detail & Related papers (2023-09-05T13:42:11Z) - Model Predictive Control with Gaussian-Process-Supported Dynamical
Constraints for Autonomous Vehicles [82.65261980827594]
We propose a model predictive control approach for autonomous vehicles that exploits learned Gaussian processes for predicting human driving behavior.
A multi-mode predictive control approach considers the possible intentions of the human drivers.
arXiv Detail & Related papers (2023-03-08T17:14:57Z) - Plan To Predict: Learning an Uncertainty-Foreseeing Model for
Model-Based Reinforcement Learning [32.24146877835396]
We propose emphPlan To Predict (P2P), a framework that treats the model rollout process as a sequential decision making problem.
We show that P2P achieves state-of-the-art performance on several challenging benchmark tasks.
arXiv Detail & Related papers (2023-01-20T10:17:22Z) - Time varying regression with hidden linear dynamics [74.9914602730208]
We revisit a model for time-varying linear regression that assumes the unknown parameters evolve according to a linear dynamical system.
Counterintuitively, we show that when the underlying dynamics are stable the parameters of this model can be estimated from data by combining just two ordinary least squares estimates.
arXiv Detail & Related papers (2021-12-29T23:37:06Z) - MINIMALIST: Mutual INformatIon Maximization for Amortized Likelihood
Inference from Sampled Trajectories [61.3299263929289]
Simulation-based inference enables learning the parameters of a model even when its likelihood cannot be computed in practice.
One class of methods uses data simulated with different parameters to infer an amortized estimator for the likelihood-to-evidence ratio.
We show that this approach can be formulated in terms of mutual information between model parameters and simulated data.
arXiv Detail & Related papers (2021-06-03T12:59:16Z) - Safe Continuous Control with Constrained Model-Based Policy Optimization [0.0]
We introduce a model-based safe exploration algorithm for constrained high-dimensional control.
We also introduce a practical algorithm that accelerates policy search with model-generated data.
arXiv Detail & Related papers (2021-04-14T15:20:55Z) - Control as Hybrid Inference [62.997667081978825]
We present an implementation of CHI which naturally mediates the balance between iterative and amortised inference.
We verify the scalability of our algorithm on a continuous control benchmark, demonstrating that it outperforms strong model-free and model-based baselines.
arXiv Detail & Related papers (2020-07-11T19:44:09Z)