Sample-efficient Reinforcement Learning in Robotic Table Tennis
- URL: http://arxiv.org/abs/2011.03275v4
- Date: Thu, 4 Jan 2024 10:25:03 GMT
- Title: Sample-efficient Reinforcement Learning in Robotic Table Tennis
- Authors: Jonas Tebbe, Lukas Krauch, Yapeng Gao, Andreas Zell
- Abstract summary: Reinforcement learning (RL) has achieved some impressive recent successes in various computer games and simulations.
We present a sample-efficient RL algorithm applied to the example of a table tennis robot.
Our approach performs competitively both in a simulation and on the real robot in a number of challenging scenarios.
- Score: 14.552652489374761
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning (RL) has achieved some impressive recent successes in
various computer games and simulations. Most of these successes are based on
having large numbers of episodes from which the agent can learn. In typical
robotic applications, however, the number of feasible attempts is very limited.
In this paper we present a sample-efficient RL algorithm applied to the example
of a table tennis robot. In table tennis every stroke is different, with
varying placement, speed and spin, so an accurate return has to be determined
from a high-dimensional continuous state space. To make learning possible in
only a few trials, the method is embedded into our robot system, which lets us
use a one-step environment: the state describes the ball at hitting time
(position, velocity, spin) and the action is the racket state (orientation,
velocity) at hitting. An actor-critic based deterministic policy gradient
algorithm was developed for accelerated learning. Our approach performs
competitively both in simulation and on the real robot in a number of
challenging scenarios, and accurate results are obtained without pre-training
in under 200 episodes. A video presenting our experiments is available at
https://youtu.be/uRAtdoL6Wpw.
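The setup described in the abstract reduces to a one-step (bandit-like) episode: observe the incoming ball state, choose a racket state, receive a single reward. Below is a minimal PyTorch sketch of such a one-step actor-critic deterministic policy gradient update; the state/action dimensions, network sizes, and learning rates are illustrative assumptions, not the authors' configuration.

```python
import torch
import torch.nn as nn

STATE_DIM = 9    # assumed: ball position (3) + velocity (3) + spin (3) at hitting time
ACTION_DIM = 5   # assumed: racket orientation + velocity at hitting time

actor = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                      nn.Linear(64, ACTION_DIM), nn.Tanh())
critic = nn.Sequential(nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.ReLU(),
                       nn.Linear(64, 1))
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-3)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

def one_step_update(state, action, reward):
    """state: [B, STATE_DIM], action: [B, ACTION_DIM], reward: [B, 1].

    Each episode is a single stroke, so the critic regresses Q(s, a)
    directly onto the observed reward -- no bootstrapped next-state value.
    """
    q = critic(torch.cat([state, action], dim=-1))
    critic_loss = nn.functional.mse_loss(q, reward)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

    # Deterministic policy gradient: move the actor's output uphill on Q.
    actor_loss = -critic(torch.cat([state, actor(state)], dim=-1)).mean()
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()
```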
Related papers
- Learning Diverse Robot Striking Motions with Diffusion Models and Kinematically Constrained Gradient Guidance [0.3613661942047476]
We develop a novel diffusion modeling approach that is offline, constraint-guided, and expressive of diverse agile behaviors.
We demonstrate the effectiveness of our approach for time-critical robotic tasks by evaluating KCGG in two challenging domains: simulated air hockey and real table tennis.
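As a rough illustration of the constraint-guided sampling family this paper belongs to, the sketch below adds the gradient of a kinematic cost to a simplified reverse-diffusion step. The denoiser, schedule term, and constraint_cost are hypothetical placeholders, not the paper's actual KCGG formulation.

```python
import torch

def guided_reverse_step(x_t, t, denoiser, constraint_cost, alpha, scale=1.0):
    # Simplified DDPM-style mean step from the predicted noise.
    with torch.no_grad():
        eps = denoiser(x_t, t)
        mean = (x_t - (1.0 - alpha) * eps) / alpha ** 0.5

    # Gradient guidance: differentiate a kinematic cost (e.g., squared
    # joint-limit or velocity-limit violations) w.r.t. the noisy sample
    # and steer the step toward feasible striking motions.
    x = x_t.detach().requires_grad_(True)
    cost = constraint_cost(x).sum()
    grad, = torch.autograd.grad(cost, x)
    return mean - scale * grad
```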
arXiv Detail & Related papers (2024-09-23T20:26:51Z)
- Offline Imitation Learning Through Graph Search and Retrieval [57.57306578140857]
Imitation learning is a powerful approach for robots to acquire manipulation skills.
We propose GSR, a simple yet effective algorithm that learns from suboptimal demonstrations through Graph Search and Retrieval.
GSR can achieve a 10% to 30% higher success rate and over 30% higher proficiency compared to baselines.
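A minimal sketch of the general graph-search-and-retrieval recipe the summary describes: chain demonstration states along their trajectories, stitch demos together where states are close, and retrieve a shortest path to imitate. Names and the distance threshold are illustrative, not GSR's actual design.

```python
import numpy as np
import networkx as nx

def build_demo_graph(demos, eps=0.5):
    """demos: list of state trajectories, each an array of shape [T, state_dim]."""
    g, nodes = nx.DiGraph(), []
    for traj in demos:
        prev = None
        for s in traj:
            idx = len(nodes)
            nodes.append(np.asarray(s, dtype=float))
            g.add_node(idx)
            if prev is not None:
                g.add_edge(prev, idx, weight=1.0)  # follow the demonstration
            prev = idx
    # Retrieval edges: stitch different (possibly suboptimal) demos together
    # wherever two states are close enough to transfer between them.
    for i, si in enumerate(nodes):
        for j, sj in enumerate(nodes):
            d = np.linalg.norm(si - sj)
            if i != j and d < eps:
                g.add_edge(i, j, weight=d)
    return g, nodes

def retrieve_plan(g, nodes, start_idx, goal_idx):
    path = nx.shortest_path(g, start_idx, goal_idx, weight="weight")
    return [nodes[i] for i in path]  # state sequence for the robot to track
```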
arXiv Detail & Related papers (2024-07-22T06:12:21Z)
- IRASim: Learning Interactive Real-Robot Action Simulators [24.591694756757278]
We introduce a novel method, IRASim, to generate realistic videos of a robot arm that executes a given action trajectory.
To validate the effectiveness of our method, we create a new benchmark, IRASim Benchmark, based on three real-robot datasets.
Results show that IRASim outperforms all the baseline methods and is preferred in human evaluations.
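The core interface the summary implies is action-conditioned video prediction: given a current frame and an action trajectory, roll out future frames step by step. The toy model below only illustrates that interface; it is not IRASim's architecture.

```python
import torch
import torch.nn as nn

class FramePredictor(nn.Module):
    def __init__(self, channels=3, action_dim=7, hidden=32):
        super().__init__()
        self.film = nn.Linear(action_dim, hidden)          # action conditioning
        self.enc = nn.Conv2d(channels, hidden, 3, padding=1)
        self.dec = nn.Conv2d(hidden, channels, 3, padding=1)

    def forward(self, frame, action):
        # FiLM-style conditioning: scale feature maps by the projected action.
        h = torch.relu(self.enc(frame))                        # [B, hidden, H, W]
        gamma = self.film(action).unsqueeze(-1).unsqueeze(-1)  # [B, hidden, 1, 1]
        return self.dec(h * gamma)                             # next frame

def rollout(model, frame, actions):
    frames = []
    for a in actions:                  # one predicted frame per action step
        frame = model(frame, a)
        frames.append(frame)
    return frames
```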
arXiv Detail & Related papers (2024-06-20T17:50:16Z)
- Robot Learning with Sensorimotor Pre-training [98.7755895548928]
We present a self-supervised sensorimotor pre-training approach for robotics.
Our model, called RPT, is a Transformer that operates on sequences of sensorimotor tokens.
We find that sensorimotor pre-training consistently outperforms training from scratch, has favorable scaling properties, and enables transfer across different tasks, environments, and robots.
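A hedged sketch of the described recipe: interleave camera, proprioception, and action embeddings as one token sequence, mask a subset, and train a Transformer to reconstruct the masked tokens. Token layout, sizes, and masking are assumptions, not RPT's published configuration.

```python
import torch
import torch.nn as nn

class SensorimotorTransformer(nn.Module):
    def __init__(self, token_dim=256, n_layers=4, n_heads=8, seq_len=64):
        super().__init__()
        self.pos = nn.Parameter(torch.zeros(1, seq_len, token_dim))
        layer = nn.TransformerEncoderLayer(token_dim, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.mask_token = nn.Parameter(torch.zeros(1, 1, token_dim))
        self.head = nn.Linear(token_dim, token_dim)

    def forward(self, tokens, mask):
        # tokens: [B, T, D] interleaved camera/proprioception/action embeddings
        # mask:   [B, T] boolean, True where a token is hidden for prediction
        x = torch.where(mask.unsqueeze(-1), self.mask_token.expand_as(tokens), tokens)
        x = self.encoder(x + self.pos[:, : tokens.size(1)])
        return self.head(x)  # regress masked tokens (e.g., MSE on masked slots)
```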
arXiv Detail & Related papers (2023-06-16T17:58:10Z)
- Quality-Diversity Optimisation on a Physical Robot Through Dynamics-Aware and Reset-Free Learning [4.260312058817663]
We build upon the Reset-Free QD (RF-QD) algorithm to learn controllers directly on a physical robot.
This method uses a dynamics model, learned from interactions between the robot and the environment, to predict the robot's behaviour.
RF-QD also includes a recovery policy that returns the robot to a safe zone when it has walked outside of it, allowing continuous learning.
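Putting those pieces together, a reset-free quality-diversity loop might look like the sketch below: the learned dynamics model screens candidate controllers before they touch the hardware, and the recovery policy brings the robot back when it leaves the safe zone. All interfaces here are hypothetical placeholders.

```python
def reset_free_qd_loop(robot, archive, dynamics_model, recovery_policy, in_safe_zone):
    while True:
        controller = archive.sample_candidate()
        # Dynamics-aware filter: predict the rollout before touching the robot.
        predicted_states = dynamics_model.rollout(robot.state(), controller)
        if not all(in_safe_zone(s) for s in predicted_states):
            continue  # predicted to leave the safe zone: too risky to try
        trajectory = robot.execute(controller)
        archive.update(controller, trajectory)  # QD: keep if novel or better
        dynamics_model.fit(trajectory)          # keep the model current
        if not in_safe_zone(robot.state()):
            robot.execute(recovery_policy)      # return without a manual reset
```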
arXiv Detail & Related papers (2023-04-24T13:24:00Z)
- Hindsight States: Blending Sim and Real Task Elements for Efficient Reinforcement Learning [61.3506230781327]
In robotics, one approach to generate training data builds on simulations based on dynamics models derived from first principles.
Here, we leverage the imbalance in complexity of the dynamics to learn more sample-efficiently.
We validate our method on several challenging simulated tasks and demonstrate that it improves learning both alone and when combined with an existing hindsight algorithm.
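One way to read "leveraging the imbalance in complexity" is that each transition factors into an expensive real-robot part and a cheap, easily re-simulated task element, so one real rollout can be relabeled into many virtual ones. The sketch below captures only that generic idea; the factorization, resample_sim, and reward_fn are assumptions, not the paper's exact algorithm.

```python
def augment_with_hindsight_states(transitions, resample_sim, reward_fn, n_virtual=8):
    """transitions: (robot_state, action, next_robot_state) from real rollouts."""
    augmented = []
    for robot_state, action, next_robot_state in transitions:
        for _ in range(n_virtual):
            sim = resample_sim()                  # cheap: re-simulate the easy dynamics
            r = reward_fn(next_robot_state, sim)  # relabel reward for the new element
            augmented.append(((robot_state, sim), action, r, (next_robot_state, sim)))
    return augmented
```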
arXiv Detail & Related papers (2023-03-03T21:55:04Z)
- Self-Improving Robots: End-to-End Autonomous Visuomotor Reinforcement Learning [54.636562516974884]
In imitation and reinforcement learning, the cost of human supervision limits the amount of data that robots can be trained on.
In this work, we propose MEDAL++, a novel design for self-improving robotic systems.
The robot autonomously practices the task by learning to both do and undo the task, simultaneously inferring the reward function from the demonstrations.
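The practice loop the summary describes might be sketched as below: a forward policy does the task, a backward policy undoes it so no human reset is needed, and the reward signal comes from a model inferred from demonstrations. Every interface here is a placeholder, not MEDAL++'s actual API.

```python
def practice(env, do_policy, undo_policy, reward_model, n_rounds=100):
    obs = env.observe()
    for _ in range(n_rounds):
        for policy in (do_policy, undo_policy):    # do the task, then undo it
            for _ in range(policy.horizon):
                action = policy.act(obs)
                obs = env.step(action)
                reward = reward_model.score(obs)   # inferred from demos, not hand-coded
                policy.update(obs, action, reward)
```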
arXiv Detail & Related papers (2023-03-02T18:51:38Z)
- Robot Learning from Randomized Simulations: A Review [59.992761565399185]
Deep learning has caused a paradigm shift in robotics research, favoring methods that require large amounts of data.
State-of-the-art approaches learn in simulation where data generation is fast as well as inexpensive.
We focus on 'domain randomization', a method for learning from randomized simulations.
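A generic domain-randomization sketch, for concreteness: physics parameters are re-sampled every episode so the learned policy must cover a distribution of simulators rather than a single instance. Parameter names and ranges are illustrative.

```python
import random

def randomized_episode(make_sim, policy):
    params = {
        "friction":   random.uniform(0.5, 1.5),   # illustrative ranges
        "mass_scale": random.uniform(0.8, 1.2),
        "latency_ms": random.uniform(0.0, 40.0),
    }
    sim = make_sim(**params)           # fresh simulator with perturbed physics
    obs, done = sim.reset(), False
    while not done:
        obs, reward, done = sim.step(policy.act(obs))  # placeholder interface
        policy.record(obs, reward)     # gather data across randomized instances
```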
arXiv Detail & Related papers (2021-11-01T13:55:41Z)
- Reactive Long Horizon Task Execution via Visual Skill and Precondition Models [59.76233967614774]
We describe an approach for sim-to-real training that can accomplish unseen robotic tasks using models learned in simulation to ground components of a simple task planner.
We show an increase in success rate from 91.6% to 98% in simulation, and from 10% to 80% in the real world, compared with naive baselines.
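A minimal sketch of grounding a simple planner with learned precondition models, as the summary suggests: before each skill executes, a model trained in simulation predicts from the current image whether the skill can succeed, and the plan is aborted (for replanning) otherwise. All names are illustrative.

```python
def execute_plan(plan, skills, preconditions, camera, threshold=0.5):
    """plan: skill names in order, e.g. ["open_drawer", "pick", "place"]."""
    for step in plan:
        image = camera.capture()
        p = preconditions[step].predict(image)  # P(skill succeeds | current image)
        if p < threshold:
            return False                        # precondition not met: replan
        skills[step].run()                      # execute the visually grounded skill
    return True
```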
arXiv Detail & Related papers (2020-11-17T15:24:01Z)
- Learning to Play Table Tennis From Scratch using Muscular Robots [34.34824536814943]
This work is the first to (a) learn a safety-critical dynamic task in a fail-safe manner using anthropomorphic robot arms, (b) learn a precision-demanding problem with a PAM-driven system, and (c) train robots to play table tennis without real balls.
Videos and datasets are available at muscularTT.embodied.ml.
arXiv Detail & Related papers (2020-06-10T16:43:27Z)
- Dynamic Experience Replay [6.062589413216726]
We build upon Ape-X DDPG and demonstrate our approach on robotic tight-fitting joint assembly tasks.
In particular, we run experiments on two different tasks: peg-in-hole and lap-joint.
Our ablation studies show that Dynamic Experience Replay is a crucial ingredient that largely shortens the training time in these challenging environments.
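A hedged sketch of the replay idea the title points to: the buffer mixes ordinary exploration data with demonstration episodes and the agent's own successes, so rare successful assemblies keep being replayed. The class below is a generic illustration; the sampling ratio and promotion rule are assumptions, not the paper's exact scheme.

```python
import random
from collections import deque

class DynamicReplayBuffer:
    def __init__(self, capacity=100_000, demo_fraction=0.2):
        self.agent = deque(maxlen=capacity)
        self.demo = deque(maxlen=capacity)
        self.demo_fraction = demo_fraction

    def add(self, transition, success=False):
        self.agent.append(transition)
        if success:
            self.demo.append(transition)   # promote the agent's own successes

    def add_demo(self, transition):
        self.demo.append(transition)       # human or scripted demonstrations

    def sample(self, batch_size):
        n_demo = min(int(batch_size * self.demo_fraction), len(self.demo))
        batch = random.sample(self.demo, n_demo) if n_demo else []
        batch += random.sample(self.agent, min(batch_size - n_demo, len(self.agent)))
        return batch
```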
arXiv Detail & Related papers (2020-03-04T23:46:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.