Related papers: COMPOSER: Scalable and Robust Modular Policies for Snake Robots

COMPOSER: Scalable and Robust Modular Policies for Snake Robots

URL: http://arxiv.org/abs/2310.00871v1
Date: Mon, 2 Oct 2023 03:20:31 GMT
Title: COMPOSER: Scalable and Robust Modular Policies for Snake Robots
Authors: Yuyou Zhang, Yaru Niu, Xingyu Liu, Ding Zhao
Abstract summary: Snake robots have remarkable compliance and adaptability in their interaction with environments. We seek to develop a control policy that effectively breaks down the high dimensionality of snake robots. A high-level imagination policy is proposed to provide additional rewards to guide the low-level control policy.
Score: 33.19973755461351
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Snake robots have showcased remarkable compliance and adaptability in their interaction with environments, mirroring the traits of their natural counterparts. While their hyper-redundant and high-dimensional characteristics add to this adaptability, they also pose great challenges to robot control. Instead of perceiving the hyper-redundancy and flexibility of snake robots as mere challenges, there lies an unexplored potential in leveraging these traits to enhance robustness and generalizability at the control policy level. We seek to develop a control policy that effectively breaks down the high dimensionality of snake robots while harnessing their redundancy. In this work, we consider the snake robot as a modular robot and formulate the control of the snake robot as a cooperative Multi-Agent Reinforcement Learning (MARL) problem. Each segment of the snake robot functions as an individual agent. Specifically, we incorporate a self-attention mechanism to enhance the cooperative behavior between agents. A high-level imagination policy is proposed to provide additional rewards to guide the low-level control policy. We validate the proposed method COMPOSER with five snake robot tasks, including goal reaching, wall climbing, shape formation, tube crossing, and block pushing. COMPOSER achieves the highest success rate across all tasks when compared to a centralized baseline and four modular policy baselines. Additionally, we show enhanced robustness against module corruption and significantly superior zero-shot generalizability in our proposed method. The videos of this work are available on our project page: https://sites.google.com/view/composer-snake/.

Related papers

Ctrl-World: A Controllable Generative World Model for Robot Manipulation [53.71061464925014]
Generalist robot policies can perform a wide range of manipulation skills.<n> evaluating and improving their ability with unfamiliar objects and instructions remains a significant challenge.<n>World models offer a promising, scalable alternative by enabling policies to rollout within imagination space.
arXiv Detail & Related papers (2025-10-11T09:13:10Z)
$π_0$: A Vision-Language-Action Flow Model for General Robot Control [77.32743739202543]
We propose a novel flow matching architecture built on top of a pre-trained vision-language model (VLM) to inherit Internet-scale semantic knowledge. We evaluate our model in terms of its ability to perform tasks in zero shot after pre-training, follow language instructions from people, and its ability to acquire new skills via fine-tuning.
arXiv Detail & Related papers (2024-10-31T17:22:30Z)
GRAPPA: Generalizing and Adapting Robot Policies via Online Agentic Guidance [15.774237279917594]
We propose an agentic framework for robot self-guidance and self-improvement. Our framework iteratively grounds a base robot policy to relevant objects in the environment. We demonstrate that our approach can effectively guide manipulation policies to achieve significantly higher success rates.
arXiv Detail & Related papers (2024-10-09T02:00:37Z)
Robotic Control via Embodied Chain-of-Thought Reasoning [86.6680905262442]
Key limitation of learned robot control policies is their inability to generalize outside their training data. Recent works on vision-language-action models (VLAs) have shown that the use of large, internet pre-trained vision-language models can substantially improve their robustness and generalization ability. We introduce Embodied Chain-of-Thought Reasoning (ECoT) for VLAs, in which we train VLAs to perform multiple steps of reasoning about plans, sub-tasks, motions, and visually grounded features before predicting the robot action.
arXiv Detail & Related papers (2024-07-11T17:31:01Z)
RoboScript: Code Generation for Free-Form Manipulation Tasks across Real and Simulation [77.41969287400977]
This paper presents textbfRobotScript, a platform for a deployable robot manipulation pipeline powered by code generation. We also present a benchmark for a code generation benchmark for robot manipulation tasks in free-form natural language. We demonstrate the adaptability of our code generation framework across multiple robot embodiments, including the Franka and UR5 robot arms.
arXiv Detail & Related papers (2024-02-22T15:12:00Z)
Learning Vision-based Pursuit-Evasion Robot Policies [54.52536214251999]
We develop a fully-observable robot policy that generates supervision for a partially-observable one. We deploy our policy on a physical quadruped robot with an RGB-D camera on pursuit-evasion interactions in the wild.
arXiv Detail & Related papers (2023-08-30T17:59:05Z)
Nonprehensile Planar Manipulation through Reinforcement Learning with Multimodal Categorical Exploration [8.343657309038285]
Reinforcement Learning is a powerful framework for developing such robot controllers. We propose a multimodal exploration approach through categorical distributions, which enables us to train planar pushing RL policies. We show that the learned policies are robust to external disturbances and observation noise, and scale to tasks with multiple pushers.
arXiv Detail & Related papers (2023-08-04T16:55:00Z)
Polybot: Training One Policy Across Robots While Embracing Variability [70.74462430582163]
We propose a set of key design decisions to train a single policy for deployment on multiple robotic platforms. Our framework first aligns the observation and action spaces of our policy across embodiments via utilizing wrist cameras. We evaluate our method on a dataset collected over 60 hours spanning 6 tasks and 3 robots with varying joint configurations and sizes.
arXiv Detail & Related papers (2023-07-07T17:21:16Z)
Bi-Manual Manipulation and Attachment via Sim-to-Real Reinforcement Learning [23.164743388342803]
We study how to solve bi-manual tasks using reinforcement learning trained in simulation. We also discuss modifications to our simulated environment which lead to effective training of RL policies. In this work, we design a Connect Task, where the aim is for two robot arms to pick up and attach two blocks with magnetic connection points.
arXiv Detail & Related papers (2022-03-15T21:49:20Z)
REvolveR: Continuous Evolutionary Models for Robot-to-robot Policy Transfer [57.045140028275036]
We consider the problem of transferring a policy across two different robots with significantly different parameters such as kinematics and morphology. Existing approaches that train a new policy by matching the action or state transition distribution, including imitation learning methods, fail due to optimal action and/or state distribution being mismatched in different robots. We propose a novel method named $REvolveR$ of using continuous evolutionary models for robotic policy transfer implemented in a physics simulator.
arXiv Detail & Related papers (2022-02-10T18:50:25Z)
Decentralized Global Connectivity Maintenance for Multi-Robot Navigation: A Reinforcement Learning Approach [12.649986200029717]
This work investigates how to navigate a multi-robot team in unknown environments while maintaining connectivity. We propose a reinforcement learning approach to develop a decentralized policy, which is shared among multiple robots. We validate the effectiveness of the proposed approach by comparing different combinations of connectivity constraints and behavior cloning.
arXiv Detail & Related papers (2021-09-17T13:20:19Z)
An Energy-Saving Snake Locomotion Gait Policy Using Deep Reinforcement Learning [0.0]
In this work, a snake locomotion gait policy is developed via deep reinforcement learning (DRL) for energy-efficient control. We apply proximal policy optimization (PPO) to each joint motor parameterized by angular velocity and the DRL agent learns the standard serpenoid curve at each timestep. Comparing to conventional control strategies, the snake robots controlled by the trained PPO agent can achieve faster movement and more energy-efficient locomotion gait.
arXiv Detail & Related papers (2021-03-08T02:06:44Z)

This list is automatically generated from the titles and abstracts of the papers in this site.