Expressing Diverse Human Driving Behavior with Probabilistic Rewards and
Online Inference
- URL: http://arxiv.org/abs/2008.08812v2
- Date: Fri, 21 Aug 2020 01:14:04 GMT
- Title: Expressing Diverse Human Driving Behavior with Probabilistic Rewards and
Online Inference
- Authors: Liting Sun, Zheng Wu, Hengbo Ma, Masayoshi Tomizuka
- Abstract summary: Cost/reward learning is an efficient way to learn and represent human behavior.
In this paper, we propose a probabilistic IRL framework that directly learns a distribution of cost functions in a continuous domain.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In human-robot interaction (HRI) systems, such as autonomous vehicles,
understanding and representing human behavior are important. Human behavior is
naturally rich and diverse. Cost/reward learning, as an efficient way to learn
and represent human behavior, has been successfully applied in many domains.
Most traditional inverse reinforcement learning (IRL) algorithms, however,
cannot adequately capture the diversity of human behavior, since they assume
that all behavior in a given dataset is generated by a single cost function. In
this paper, we propose a probabilistic IRL framework that directly learns a
distribution of cost functions in a continuous domain. Evaluations are
conducted on both synthetic data and real human driving data. Both the
quantitative and subjective results show that our proposed framework can better
express diverse human driving behaviors and extract different driving styles
that match those identified by human participants in our user study.
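The core idea in the abstract is to replace a single learned cost function with a distribution over cost functions. A minimal illustrative sketch, not the authors' implementation: the features, the Gaussian weight distribution, and all names below are assumptions. It scores driving trajectories under sampled cost weights with a MaxEnt-style choice likelihood, then averages over the weight distribution:

```python
import numpy as np

rng = np.random.default_rng(0)

def trajectory_features(traj):
    """Hypothetical cost features: mean speed and mean jerk magnitude."""
    step = np.diff(traj, axis=0)
    speed = np.linalg.norm(step, axis=1)
    jerk = np.abs(np.diff(speed, n=2)) if len(speed) > 2 else np.zeros(1)
    return np.array([speed.mean(), jerk.mean()])

def maxent_prob(weights, candidate_trajs, chosen_idx):
    """MaxEnt-style choice model: lower-cost trajectories are more probable."""
    costs = np.array([weights @ trajectory_features(t) for t in candidate_trajs])
    p = np.exp(-(costs - costs.min()))  # shift exponent for numerical stability
    p /= p.sum()
    return p[chosen_idx]

# A single point estimate of the weights cannot express driver diversity;
# a distribution over weights (here, an assumed Gaussian) can, by averaging
# the choice likelihood over sampled cost functions (crude Monte Carlo).
mu, sigma = np.array([1.0, 2.0]), 0.3
weight_samples = rng.normal(mu, sigma, size=(100, 2))
trajs = [rng.standard_normal((10, 2)).cumsum(axis=0) for _ in range(3)]
avg_likelihood = np.mean([maxent_prob(w, trajs, 0) for w in weight_samples])
```

With cost weights that penalize speed, a stationary trajectory receives a higher choice probability than a fast one, and averaging over sampled weights yields a likelihood that reflects the whole population of cost functions rather than one driver.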
Related papers
- Real-time Addressee Estimation: Deployment of a Deep-Learning Model on
the iCub Robot [52.277579221741746]
Addressee Estimation is a skill essential for social robots to interact smoothly with humans.
Inspired by human perceptual skills, a deep-learning model for Addressee Estimation is designed, trained, and deployed on an iCub robot.
The study presents the procedure of such implementation and the performance of the model deployed in real-time human-robot interaction.
arXiv Detail & Related papers (2023-11-09T13:01:21Z)
- SACSoN: Scalable Autonomous Control for Social Navigation [62.59274275261392]
We develop methods for training policies for socially unobtrusive navigation.
By minimizing the counterfactual perturbation a robot induces in nearby humans, we can train robots to behave in ways that do not alter the natural behavior of humans in the shared space.
We collect a large dataset where an indoor mobile robot interacts with human bystanders.
arXiv Detail & Related papers (2023-06-02T19:07:52Z)
- Learning to Influence Human Behavior with Offline Reinforcement Learning [70.7884839812069]
We focus on influence in settings where there is a need to capture human suboptimality.
Online experiments with humans are potentially unsafe, and creating a high-fidelity simulator of the environment is often impractical.
We show that offline reinforcement learning can learn to effectively influence suboptimal humans by extending and combining elements of observed human-human behavior.
arXiv Detail & Related papers (2023-03-03T23:41:55Z)
- Learning Preferences for Interactive Autonomy [1.90365714903665]
This thesis is an attempt to learn reward functions from human users by using other, more reliable data modalities.
We first propose various forms of comparative feedback, e.g., pairwise comparisons, best-of-many choices, rankings, scaled comparisons; and describe how a robot can use these various forms of human feedback to infer a reward function.
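The pairwise-comparison feedback described in this entry is commonly modeled with a Bradley-Terry preference likelihood; the sketch below is an illustrative stand-in for that idea, not the thesis's method, and every name and feature dimension in it is an assumption. A comparison "A preferred over B" is treated as evidence that the reward of A's features exceeds B's:

```python
import numpy as np

def fit_reward_weights(pairs, steps=500, lr=0.1):
    """Gradient ascent on the Bradley-Terry log-likelihood, where each
    pair (fa, fb) means the item with features fa was preferred over fb
    with probability sigmoid(w . (fa - fb))."""
    w = np.zeros(len(pairs[0][0]))
    for _ in range(steps):
        grad = np.zeros_like(w)
        for fa, fb in pairs:
            d = fa - fb
            # d(log sigmoid(w.d))/dw = d * (1 - sigmoid(w.d))
            grad += d * (1.0 - 1.0 / (1.0 + np.exp(-(w @ d))))
        w += lr * grad
    return w

# Simulate noiseless pairwise comparisons from a hypothetical "true" reward.
rng = np.random.default_rng(1)
true_w = np.array([2.0, -1.0])
pairs = []
for _ in range(200):
    fa, fb = rng.standard_normal(2), rng.standard_normal(2)
    if true_w @ fa < true_w @ fb:
        fa, fb = fb, fa  # order each pair so the preferred item comes first
    pairs.append((fa, fb))
learned_w = fit_reward_weights(pairs)
```

On noiseless comparisons the learned weight vector recovers the direction of the true reward, which is all a linear preference model can identify (the scale is not determined by comparisons alone).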
arXiv Detail & Related papers (2022-10-19T21:34:51Z)
- Learning from humans: combining imitation and deep reinforcement learning to accomplish human-level performance on a virtual foraging task [6.263481844384228]
We develop a method to learn bio-inspired foraging policies using human data.
We conduct an experiment where humans are virtually immersed in an open field foraging environment and are trained to collect the highest amount of rewards.
arXiv Detail & Related papers (2022-03-11T20:52:30Z)
- What Matters in Learning from Offline Human Demonstrations for Robot Manipulation [64.43440450794495]
We conduct an extensive study of six offline learning algorithms for robot manipulation.
Our study analyzes the most critical challenges when learning from offline human data.
We highlight opportunities for learning from human datasets.
arXiv Detail & Related papers (2021-08-06T20:48:30Z)
- Drivers' Manoeuvre Modelling and Prediction for Safe HRI [0.0]
Theory of Mind has been broadly explored for robotics and recently for autonomous and semi-autonomous vehicles.
We explored how to predict human intentions before an action is performed by combining data on human motion, vehicle state, and human inputs.
arXiv Detail & Related papers (2021-06-03T10:07:55Z)
- Learning Human Rewards by Inferring Their Latent Intelligence Levels in Multi-Agent Games: A Theory-of-Mind Approach with Application to Driving Data [18.750834997334664]
We argue that humans are boundedly rational and have different intelligence levels when reasoning about others' decision-making processes.
We propose a new multi-agent Inverse Reinforcement Learning framework that reasons about humans' latent intelligence levels during learning.
arXiv Detail & Related papers (2021-03-07T07:48:31Z)
- Human Trajectory Forecasting in Crowds: A Deep Learning Perspective [89.4600982169]
We present an in-depth analysis of existing deep learning-based methods for modelling social interactions.
We propose two knowledge-based data-driven methods to effectively capture these social interactions.
We develop a large-scale, interaction-centric benchmark, TrajNet++, a significant yet missing component in the field of human trajectory forecasting.
arXiv Detail & Related papers (2020-07-07T17:19:56Z)
- Learning Predictive Models From Observation and Interaction [137.77887825854768]
Learning predictive models from interaction with the world allows an agent, such as a robot, to learn about how the world works.
However, learning a model that captures the dynamics of complex skills represents a major challenge.
We propose a method to augment the training set with observational data of other agents, such as humans.
arXiv Detail & Related papers (2019-12-30T01:10:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.