ARMOR: Egocentric Perception for Humanoid Robot Collision Avoidance and Motion Planning
- URL: http://arxiv.org/abs/2412.00396v1
- Date: Sat, 30 Nov 2024 08:39:23 GMT
- Title: ARMOR: Egocentric Perception for Humanoid Robot Collision Avoidance and Motion Planning
- Authors: Daehwa Kim, Mario Srouji, Chen Chen, Jian Zhang
- Abstract summary: ARMOR is a novel egocentric perception system for humanoid robots. Our distributed perception approach enhances the robot's spatial awareness. We show that our ARMOR perception outperforms a setup with multiple dense head-mounted and externally mounted depth cameras.
- Score: 10.207814069339735
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Humanoid robots have significant gaps in their sensing and perception, making it hard to perform motion planning in dense environments. To address this, we introduce ARMOR, a novel egocentric perception system that integrates both hardware and software, specifically incorporating wearable-like depth sensors for humanoid robots. Our distributed perception approach enhances the robot's spatial awareness and facilitates more agile motion planning. We also train a transformer-based imitation learning (IL) policy in simulation to perform dynamic collision avoidance, by leveraging around 86 hours of realistic human motions from the AMASS dataset. We show that our ARMOR perception is superior to a setup with multiple dense head-mounted and externally mounted depth cameras, with a 63.7% reduction in collisions and a 78.7% improvement in success rate. We also compare our IL policy against a sampling-based motion planning expert, cuRobo, showing 31.6% fewer collisions, a 16.9% higher success rate, and a 26x reduction in computational latency. Lastly, we deploy our ARMOR perception on our real-world GR1 humanoid from Fourier Intelligence. We are going to update the link to the source code, HW description, and 3D CAD files in the arXiv version of this text.
Related papers
- MeshMimic: Geometry-Aware Humanoid Motion Learning through 3D Scene Reconstruction [54.36564144414704]
MeshMimic is an innovative framework that bridges 3D scene reconstruction and embodied intelligence to enable humanoid robots to learn coupled "motion-terrain" interactions directly from video. By leveraging state-of-the-art 3D vision models, our framework precisely segments and reconstructs both human trajectories and the underlying 3D geometry of terrains and objects.
arXiv Detail & Related papers (2026-02-17T17:09:45Z) - XR-DT: Extended Reality-Enhanced Digital Twin for Agentic Mobile Robots [10.083050242188422]
This paper presents XR-DT, an eXtended Reality-enhanced Digital Twin framework for agentic mobile robots. By embedding human intention, environmental dynamics, and robot cognition into the XR-DT framework, our system enables interpretable, trustworthy, and adaptive HRI.
arXiv Detail & Related papers (2025-12-04T21:49:14Z) - Dexterity from Smart Lenses: Multi-Fingered Robot Manipulation with In-the-Wild Human Demonstrations [52.29884993824894]
Learning multi-fingered robot policies from humans performing daily tasks in natural environments has long been a grand goal in the robotics community. AINA enables learning multi-fingered policies from data collected by anyone, anywhere, and in any environment using Aria Gen 2 glasses.
arXiv Detail & Related papers (2025-11-20T18:59:02Z) - GaussGym: An open-source real-to-sim framework for learning locomotion from pixels [78.05453137978132]
We present a novel approach for photorealistic robot simulation that integrates 3D Gaussian Splatting as a drop-in within vectorized physics simulators. This enables unprecedented speed -- exceeding 100,000 steps per second on consumer GPU. We additionally demonstrate its applicability in a sim-to-real robotics setting.
arXiv Detail & Related papers (2025-10-17T06:34:52Z) - Social-Pose: Enhancing Trajectory Prediction with Human Body Pose [70.59399670794171]
We study the benefits of predicting human trajectories using human body poses instead of solely their Cartesian space locations in time. We propose 'Social-pose', an attention-based pose encoder that effectively captures the poses of all humans in a scene and their social relations.
arXiv Detail & Related papers (2025-07-30T14:58:48Z) - GRUtopia: Dream General Robots in a City at Scale [65.08318324604116]
This paper introduces project GRUtopia, the first simulated interactive 3D society designed for various robots.
GRScenes includes 100k interactive, finely annotated scenes, which can be freely combined into city-scale environments.
GRResidents is a Large Language Model (LLM) driven Non-Player Character (NPC) system that is responsible for social interaction.
arXiv Detail & Related papers (2024-07-15T17:40:46Z) - Inverse Kinematics for Neuro-Robotic Grasping with Humanoid Embodied Agents [13.53738829631595]
This paper introduces a novel zero-shot motion planning method that allows users to quickly design smooth robot motions in Cartesian space.
A Bézier curve-based Cartesian plan is transformed into a joint space trajectory by our neuro-inspired inverse kinematics (IK) method CycleIK.
The motion planner is evaluated on the physical hardware of the two humanoid robots NICO and NICOL in a human-in-the-loop grasping scenario.
arXiv Detail & Related papers (2024-04-12T21:42:34Z) - Exploring 3D Human Pose Estimation and Forecasting from the Robot's Perspective: The HARPER Dataset [52.22758311559]
We introduce HARPER, a novel dataset for 3D body pose estimation and forecast in dyadic interactions between users and Spot.
The key-novelty is the focus on the robot's perspective, i.e., on the data captured by the robot's sensors.
The scenario underlying HARPER includes 15 actions, of which 10 involve physical contact between the robot and users.
arXiv Detail & Related papers (2024-03-21T14:53:50Z) - Robot Learning with Sensorimotor Pre-training [98.7755895548928]
We present a self-supervised sensorimotor pre-training approach for robotics.
Our model, called RPT, is a Transformer that operates on sequences of sensorimotor tokens.
We find that sensorimotor pre-training consistently outperforms training from scratch, has favorable scaling properties, and enables transfer across different tasks, environments, and robots.
arXiv Detail & Related papers (2023-06-16T17:58:10Z) - COPILOT: Human-Environment Collision Prediction and Localization from Egocentric Videos [62.34712951567793]
The ability to forecast human-environment collisions from egocentric observations is vital to enable collision avoidance in applications such as VR, AR, and wearable assistive robotics.
We introduce the challenging problem of predicting collisions in diverse environments from multi-view egocentric videos captured from body-mounted cameras.
We propose a transformer-based model called COPILOT to perform collision prediction and localization simultaneously.
arXiv Detail & Related papers (2022-10-04T17:49:23Z) - Regularized Deep Signed Distance Fields for Reactive Motion Generation [30.792481441975585]
Distance-based constraints are fundamental for enabling robots to plan their actions and act safely.
We propose Regularized Deep Signed Distance Fields (ReDSDF), a single neural implicit function that can compute smooth distance fields at any scale.
We demonstrate the effectiveness of our approach in representative simulated tasks for whole-body control (WBC) and safe Human-Robot Interaction (HRI) in shared workspaces.
arXiv Detail & Related papers (2022-03-09T14:21:32Z) - Towards Disturbance-Free Visual Mobile Manipulation [11.738161077441104]
We develop a new disturbance-avoidance methodology at the heart of which is the auxiliary task of disturbance prediction.
Our experiments on ManipulaTHOR show that, on testing scenes with novel objects, our method improves the success rate from 61.7% to 85.6%.
arXiv Detail & Related papers (2021-12-17T22:33:23Z) - Nonprehensile Riemannian Motion Predictive Control [57.295751294224765]
We introduce a novel Real-to-Sim reward analysis technique to reliably imagine and predict the outcome of taking possible actions for a real robotic platform.
We produce a closed-loop controller to reactively push objects in a continuous action space.
We observe that RMPC is robust in cluttered as well as occluded environments and outperforms the baselines.
arXiv Detail & Related papers (2021-11-15T18:50:04Z) - Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-localization in Large Scenes from Body-Mounted Sensors [71.29186299435423]
We introduce the Human POSEitioning System (HPS), a method to recover the full 3D pose of a human registered with a 3D scan of the surrounding environment.
We show that our optimization-based integration exploits the benefits of the two, resulting in pose accuracy free of drift.
HPS could be used for VR/AR applications where humans interact with the scene without requiring direct line of sight with an external camera.
arXiv Detail & Related papers (2021-03-31T17:58:31Z) - Deep Reinforcement learning for real autonomous mobile robot navigation in indoor environments [0.0]
We present our proof of concept for autonomous self-learning robot navigation in an unknown environment for a real robot without a map or planner.
The input for the robot is only the fused data from a 2D laser scanner and an RGB-D camera as well as the orientation to the goal.
The output actions of an Asynchronous Advantage Actor-Critic network (GA3C) are the linear and angular velocities for the robot.
arXiv Detail & Related papers (2020-05-28T09:15:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.