Related papers: Training and Simulation of Quadrupedal Robot in Adaptive Stair Climbing for Indoor Firefighting: An End-to-End Reinforcement Learning Approach

URL: http://arxiv.org/abs/2602.03087v1
Date: Tue, 03 Feb 2026 04:23:50 GMT
Title: Training and Simulation of Quadrupedal Robot in Adaptive Stair Climbing for Indoor Firefighting: An End-to-End Reinforcement Learning Approach
Authors: Baixiao Huang, Baiyu Huang, Yu Hou
Abstract summary: Quadruped robots are used for primary searches during the early stages of indoor fires. Situational awareness in complex indoor environments and rapid stair climbing across different staircases remain the main challenges. This project explores how to balance navigation and locomotion and how end-to-end RL methods can enable quadrupeds to adapt to different stair shapes.
Score: 4.901516178319544
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Quadruped robots are used for primary searches during the early stages of indoor fires. A typical primary search involves quickly and thoroughly looking for victims under hazardous conditions and monitoring flammable materials. However, situational awareness in complex indoor environments and rapid stair climbing across different staircases remain the main challenges for robot-assisted primary searches. In this project, we designed a two-stage end-to-end deep reinforcement learning (RL) approach to optimize both navigation and locomotion. In the first stage, the quadrupeds, Unitree Go2, were trained to climb stairs in Isaac Lab's pyramid-stair terrain. In the second stage, the quadrupeds were trained to climb various realistic indoor staircases in the Isaac Lab engine, with the learned policy transferred from the previous stage. These indoor staircases are straight, L-shaped, and spiral, to support climbing tasks in complex environments. This project explores how to balance navigation and locomotion and how end-to-end RL methods can enable quadrupeds to adapt to different stair shapes. Our main contributions are: (1) A two-stage end-to-end RL framework that transfers stair-climbing skills from abstract pyramid terrain to realistic indoor stair topologies. (2) A centerline-based navigation formulation that enables unified learning of navigation and locomotion without hierarchical planning. (3) Demonstration of policy generalization across diverse staircases using only local height-map perception. (4) An empirical analysis of success, efficiency, and failure modes under increasing stair difficulty.
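
Illustrative sketch (not from the paper): the abstract names a centerline-based navigation formulation but gives no equations. One plausible reading is that the robot is rewarded for progress along a staircase centerline and penalized for drifting away from it. The Python below sketches that idea for a 2D polyline centerline. The function names, the weights `w_progress` and `w_lateral`, and the reward form itself are assumptions made for illustration; they are not the authors' implementation and not an Isaac Lab API.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def project_onto_centerline(pos: Point, centerline: List[Point]) -> Tuple[float, float]:
    """Return (arc-length progress, lateral distance) of pos relative to a polyline.

    Hypothetical helper: the centerline is assumed to be a list of 2D waypoints.
    """
    if len(centerline) < 2:
        raise ValueError("centerline needs at least two waypoints")
    best_dist = math.inf
    best_progress = 0.0
    travelled = 0.0  # arc length of the segments already visited
    for (ax, ay), (bx, by) in zip(centerline, centerline[1:]):
        dx, dy = bx - ax, by - ay
        seg_len = math.hypot(dx, dy)
        if seg_len == 0.0:
            continue  # skip degenerate (zero-length) segments
        # Parameter of the closest point on this segment, clamped to [0, 1].
        t = ((pos[0] - ax) * dx + (pos[1] - ay) * dy) / (seg_len * seg_len)
        t = max(0.0, min(1.0, t))
        cx, cy = ax + t * dx, ay + t * dy
        dist = math.hypot(pos[0] - cx, pos[1] - cy)
        if dist < best_dist:
            best_dist = dist
            best_progress = travelled + t * seg_len
        travelled += seg_len
    return best_progress, best_dist


def centerline_reward(prev_pos: Point, pos: Point, centerline: List[Point],
                      w_progress: float = 1.0, w_lateral: float = 0.5) -> float:
    """Assumed reward: progress gained along the centerline minus a lateral penalty."""
    prev_s, _ = project_onto_centerline(prev_pos, centerline)
    s, d = project_onto_centerline(pos, centerline)
    return w_progress * (s - prev_s) - w_lateral * d


# Usage: an L-shaped centerline with one 90-degree turn (hypothetical waypoints).
l_shaped = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0)]
step_reward = centerline_reward((1.0, 0.5), (2.5, 1.0), l_shaped)
```

A single scalar of this kind combines "where to go" and "how to move" in one training signal, which is one way a policy could learn navigation and locomotion together without a separate planner. Whether the paper uses this exact form is not stated in the abstract.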

Related papers

Learning Transferability: A Two-Stage Reinforcement Learning Approach for Enhancing Quadruped Robots' Performance in U-Shaped Stair Climbing [4.901516178319544]
We employ a two-stage end-to-end deep reinforcement learning approach to optimize a robot's performance on U-shaped stairs. The robot dog used for training, a Unitree Go2, was first trained to climb stairs on Isaac Lab's pyramid-stair terrain. The results showed (1) that robot dogs successfully reached the goal when climbing U-shaped stairs with a stall penalty, and (2) that the policy trained on U-shaped stairs transferred to deployment on straight, L-shaped, and spiral stair terrains.
arXiv Detail & Related papers (2026-02-16T05:19:06Z)
From Seeing to Experiencing: Scaling Navigation Foundation Models with Reinforcement Learning [59.88543114325153]
We introduce the Seeing-to-Experiencing framework to scale the capability of navigation foundation models with reinforcement learning. S2E combines the strengths of pre-training on videos and post-training through RL. We establish a comprehensive end-to-end evaluation benchmark, NavBench-GS, built on photorealistic 3DGS reconstructions of real-world scenes.
arXiv Detail & Related papers (2025-07-29T17:26:10Z)
BeamDojo: Learning Agile Humanoid Locomotion on Sparse Footholds [35.62230804783507]
Existing learning-based approaches often struggle on complex terrains due to sparse foothold rewards and inefficient learning processes. We introduce BeamDojo, a reinforcement learning framework designed for enabling agile humanoid locomotion on sparse footholds. We show that BeamDojo achieves efficient learning in simulation and enables agile locomotion with precise foot placement on sparse footholds in the real world.
arXiv Detail & Related papers (2025-02-14T18:42:42Z)
Reinforcement Learning for Versatile, Dynamic, and Robust Bipedal Locomotion Control [106.32794844077534]
This paper presents a study on using deep reinforcement learning to create dynamic locomotion controllers for bipedal robots. We develop a general control solution that can be used for a range of dynamic bipedal skills, from periodic walking and running to aperiodic jumping and standing. This work pushes the limits of agility for bipedal robots through extensive real-world experiments.
arXiv Detail & Related papers (2024-01-30T10:48:43Z)
StairNetV3: Depth-aware Stair Modeling using Deep Learning [6.145334325463317]
Vision-based stair perception can help autonomous mobile robots deal with the challenge of climbing stairs. Current monocular vision methods are difficult to model stairs accurately without depth information. This paper proposes a depth-aware stair modeling method for monocular vision.
arXiv Detail & Related papers (2023-08-13T08:11:40Z)
Rethinking Closed-loop Training for Autonomous Driving [82.61418945804544]
We present the first empirical study which analyzes the effects of different training benchmark designs on the success of learning agents. We propose trajectory value learning (TRAVL), an RL-based driving agent that performs planning with multistep look-ahead. Our experiments show that TRAVL can learn much faster and produce safer maneuvers compared to all the baselines.
arXiv Detail & Related papers (2023-06-27T17:58:39Z)
Robust and Versatile Bipedal Jumping Control through Reinforcement Learning [141.56016556936865]
This work aims to push the limits of agility for bipedal robots by enabling a torque-controlled bipedal robot to perform robust and versatile dynamic jumps in the real world. We present a reinforcement learning framework for training a robot to accomplish a large variety of jumping tasks, such as jumping to different locations and directions. We develop a new policy structure that encodes the robot's long-term input/output (I/O) history while also providing direct access to a short-term I/O history.
arXiv Detail & Related papers (2023-02-19T01:06:09Z)
Deep Leaning-Based Ultra-Fast Stair Detection [6.362951673024623]
We propose an end-to-end method for stair line detection based on deep learning. In experiments, our method can achieve high performance in terms of both speed and accuracy. A lightweight version can even achieve 300+ frames per second with the same resolution.
arXiv Detail & Related papers (2022-01-14T02:05:01Z)
Learning Perceptual Locomotion on Uneven Terrains using Sparse Visual Observations [75.60524561611008]
This work aims to exploit the use of sparse visual observations to achieve perceptual locomotion over a range of commonly seen bumps, ramps, and stairs in human-centred environments. We first formulate the selection of minimal visual input that can represent the uneven surfaces of interest, and propose a learning framework that integrates such exteroceptive and proprioceptive data. We validate the learned policy in tasks that require omnidirectional walking over flat ground and forward locomotion over terrains with obstacles, showing a high success rate.
arXiv Detail & Related papers (2021-09-28T20:25:10Z)
Learning to Generalize Across Long-Horizon Tasks from Human Demonstrations [52.696205074092006]
Generalization Through Imitation (GTI) is a two-stage offline imitation learning algorithm. GTI exploits a structure where demonstrated trajectories for different tasks intersect at common regions of the state space. In the first stage of GTI, we train a policy that leverages intersections to have the capacity to compose behaviors from different demonstration trajectories together. In the second stage of GTI, we train a goal-directed agent to generalize to novel start and goal configurations.
arXiv Detail & Related papers (2020-03-13T02:25:28Z)
augKlimb: Interactive Data-Led Augmentation of Bouldering Training [0.0]
Climbing is a popular sport, especially indoors, where climbers can train on man-made routes using artificial holds. Various aspects of adding computer-interaction to climbing have been studied in recent years. There is a large space for research into lightweight tools to aid recreational intermediate climbers.
arXiv Detail & Related papers (2020-01-22T10:26:59Z)

This list is automatically generated from the titles and abstracts of the papers in this site.