Explainable robotic systems: Understanding goal-driven actions in a
reinforcement learning scenario
- URL: http://arxiv.org/abs/2006.13615v3
- Date: Thu, 2 Sep 2021 07:17:22 GMT
- Authors: Francisco Cruz and Richard Dazeley and Peter Vamplew and Ithan Moreira
- Abstract summary: In reinforcement learning scenarios, a great effort has been focused on providing explanations using data-driven approaches.
In this work, we focus rather on the decision-making process of reinforcement learning agents performing a task in a robotic scenario.
We use the probability of success computed by three different proposed approaches: memory-based, learning-based, and introspection-based.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Robotic systems are more present in our society every day. In human-robot
environments, it is crucial that end-users correctly understand their robotic
team-partners in order to collaboratively complete a task. To increase action
understanding, users demand more explainability about the decisions made by the
robot in particular situations. Recently, explainable robotic systems have
emerged as an alternative focused not only on completing a task satisfactorily,
but also on justifying, in a human-like manner, the reasons that lead to a
decision. In reinforcement learning scenarios, great effort has been focused on
providing explanations using data-driven approaches, particularly from the
visual input modality in deep learning-based systems. In this work, we focus
instead on the decision-making process of reinforcement learning agents
performing a task in a robotic scenario. Experimental results are obtained
using three different set-ups, namely, a deterministic navigation task, a
stochastic navigation task, and a continuous visual-based object sorting task.
As a way to explain the goal-driven robot's actions, we use the probability of
success computed by three different proposed approaches: memory-based,
learning-based, and introspection-based. These approaches differ in the amount
of memory required to compute or estimate the probability of success, as well
as in the kind of reinforcement learning representation in which they can be
used. In this regard, we use the memory-based approach as a baseline, since it
is obtained directly from the agent's observations. When comparing the
learning-based and the introspection-based approaches to this baseline, both
are found to be suitable alternatives for computing the probability of success,
obtaining high levels of similarity when compared using both Pearson's
correlation and the mean squared error.
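As a rough illustration (not the paper's actual code), the memory-based baseline can be read as an empirical success frequency per visited state, and the comparison metrics from the abstract are the standard Pearson's correlation and mean squared error. The sketch below assumes a hypothetical episode log of (state trajectory, success flag) pairs; all names are illustrative.

```python
import math

def memory_based_ps(episodes):
    """Empirical P(success | state): fraction of logged episodes visiting
    a state that ended in success. `episodes` is a list of
    (trajectory, success) pairs, where trajectory is a list of states."""
    counts = {}  # state -> [num_successes, num_visits]
    for trajectory, success in episodes:
        for state in trajectory:
            c = counts.setdefault(state, [0, 0])
            c[0] += int(success)
            c[1] += 1
    return {s: succ / visits for s, (succ, visits) in counts.items()}

def pearson(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def mse(xs, ys):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
```

A learned or introspection-based estimator producing per-state probabilities could then be compared against `memory_based_ps` on the same states via `pearson` and `mse`, mirroring the evaluation described in the abstract.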
Related papers
- Offline Imitation Learning Through Graph Search and Retrieval [57.57306578140857]
Imitation learning is a powerful machine learning approach for a robot to acquire manipulation skills.
We propose GSR, a simple yet effective algorithm that learns from suboptimal demonstrations through Graph Search and Retrieval.
GSR can achieve a 10% to 30% higher success rate and over 30% higher proficiency compared to baselines.
arXiv Detail & Related papers (2024-07-22T06:12:21Z) - On-Robot Bayesian Reinforcement Learning for POMDPs [16.667924736270415]
This paper advances Bayesian reinforcement learning for robotics by proposing a specialized framework for physical systems.
We capture this knowledge in a factored representation, then demonstrate that the posterior factorizes in a similar shape, and ultimately formalize the model in a Bayesian framework.
We then introduce a sample-based online solution method, based on Monte-Carlo tree search and particle filtering, specialized to solve the resulting model.
arXiv Detail & Related papers (2023-07-22T01:16:29Z) - Explaining Agent's Decision-making in a Hierarchical Reinforcement
Learning Scenario [0.6643086804649938]
Reinforcement learning is a machine learning approach based on behavioral psychology.
In this work, we make use of the memory-based explainable reinforcement learning method in a hierarchical environment composed of sub-tasks.
arXiv Detail & Related papers (2022-12-14T01:18:45Z) - Discovering Unsupervised Behaviours from Full-State Trajectories [1.827510863075184]
We propose an analysis of Autonomous Robots Realising their Abilities, a Quality-Diversity algorithm that autonomously finds behavioural characterisations.
We evaluate this approach on a simulated robotic environment, where the robot has to autonomously discover its abilities from its full-state trajectories.
More specifically, the analysed approach autonomously finds policies that make the robot move to diverse positions, but also utilise its legs in diverse ways, and even perform half-rolls.
arXiv Detail & Related papers (2022-11-22T16:57:52Z) - Interpreting Neural Policies with Disentangled Tree Representations [58.769048492254555]
We study interpretability of compact neural policies through the lens of disentangled representation.
We leverage decision trees to obtain factors of variation for disentanglement in robot learning.
We introduce interpretability metrics that measure disentanglement of learned neural dynamics.
arXiv Detail & Related papers (2022-10-13T01:10:41Z) - Autonomous Open-Ended Learning of Tasks with Non-Stationary
Interdependencies [64.0476282000118]
Intrinsic motivations have proven to generate a task-agnostic signal to properly allocate the training time amongst goals.
While the majority of works in the field of intrinsically motivated open-ended learning focus on scenarios where goals are independent from each other, only a few have studied the autonomous acquisition of interdependent tasks.
In particular, we first deepen the analysis of a previous system, showing the importance of incorporating information about the relationships between tasks at a higher level of the architecture.
Then we introduce H-GRAIL, a new system that extends the previous one by adding a new learning layer to store the autonomously acquired sequences.
arXiv Detail & Related papers (2022-05-16T10:43:01Z) - Bayesian Meta-Learning for Few-Shot Policy Adaptation Across Robotic
Platforms [60.59764170868101]
Reinforcement learning methods can achieve significant performance but require a large amount of training data collected on the same robotic platform.
We formulate it as a few-shot meta-learning problem where the goal is to find a model that captures the common structure shared across different robotic platforms.
We experimentally evaluate our framework on a simulated reaching and a real-robot picking task using 400 simulated robots.
arXiv Detail & Related papers (2021-03-05T14:16:20Z) - Leveraging Expert Consistency to Improve Algorithmic Decision Support [62.61153549123407]
We explore the use of historical expert decisions as a rich source of information that can be combined with observed outcomes to narrow the construct gap.
We propose an influence function-based methodology to estimate expert consistency indirectly when each case in the data is assessed by a single expert.
Our empirical evaluation, using simulations in a clinical setting and real-world data from the child welfare domain, indicates that the proposed approach successfully narrows the construct gap.
arXiv Detail & Related papers (2021-01-24T05:40:29Z) - Probabilistic Active Meta-Learning [15.432006404678981]
We introduce task selection based on prior experience into a meta-learning algorithm.
We provide empirical evidence that our approach improves data-efficiency when compared to strong baselines on simulated robotic experiments.
arXiv Detail & Related papers (2020-07-17T12:51:42Z) - Scalable Multi-Task Imitation Learning with Autonomous Improvement [159.9406205002599]
We build an imitation learning system that can continuously improve through autonomous data collection.
We leverage the robot's own trials as demonstrations for tasks other than the one that the robot actually attempted.
In contrast to prior imitation learning approaches, our method can autonomously collect data with sparse supervision for continuous improvement.
arXiv Detail & Related papers (2020-02-25T18:56:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.