Ditto in the House: Building Articulation Models of Indoor Scenes
through Interactive Perception
- URL: http://arxiv.org/abs/2302.01295v1
- Date: Thu, 2 Feb 2023 18:22:00 GMT
- Title: Ditto in the House: Building Articulation Models of Indoor Scenes
through Interactive Perception
- Authors: Cheng-Chun Hsu and Zhenyu Jiang and Yuke Zhu
- Abstract summary: This work explores building articulation models of indoor scenes through a robot's purposeful interactions.
We introduce an interactive perception approach to this task.
We demonstrate the effectiveness of our approach in both simulation and real-world scenes.
- Score: 31.009703947432026
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Virtualizing the physical world into virtual models has been a critical
technique for robot navigation and planning in the real world. To foster
manipulation with articulated objects in everyday life, this work explores
building articulation models of indoor scenes through a robot's purposeful
interactions in these scenes. Prior work on articulation reasoning primarily
focuses on siloed objects of limited categories. To extend to room-scale
environments, the robot has to efficiently and effectively explore a
large-scale 3D space, locate articulated objects, and infer their
articulations. We introduce an interactive perception approach to this task.
Our approach, named Ditto in the House, discovers possible articulated objects
through affordance prediction, interacts with these objects to produce
articulated motions, and infers the articulation properties from the visual
observations before and after each interaction. It tightly couples affordance
prediction and articulation inference to improve both tasks. We demonstrate the
effectiveness of our approach in both simulation and real-world scenes. Code
and additional results are available at
https://ut-austin-rpl.github.io/HouseDitto/
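The abstract describes a three-step loop: predict affordances, interact, then infer articulation from the observations before and after each interaction. The sketch below is a minimal illustration of that loop's structure; the class names, signatures, and placeholder heuristics are assumptions for exposition, not the released implementation (see the project page above).
```python
import numpy as np

# Minimal sketch of the interactive-perception loop. All names and the
# placeholder heuristics are illustrative assumptions; the actual code is
# at https://ut-austin-rpl.github.io/HouseDitto/.

class AffordancePredictor:
    """Scores scene points by how likely an interaction there moves a part."""
    def predict(self, scene_points: np.ndarray) -> np.ndarray:
        return np.random.rand(len(scene_points))  # stand-in for a network

class ArticulationEstimator:
    """Infers articulation properties from before/after observations."""
    def infer(self, points_before: np.ndarray, points_after: np.ndarray) -> dict:
        displacement = points_after.mean(axis=0) - points_before.mean(axis=0)
        return {"joint_type": "prismatic", "axis": displacement}  # stand-in

def explore_scene(scene_points, robot, n_interactions=5):
    """Affordance-guided interaction loop over a room-scale point cloud."""
    affordance, articulation = AffordancePredictor(), ArticulationEstimator()
    models = []
    for _ in range(n_interactions):
        scores = affordance.predict(scene_points)
        target = scene_points[int(scores.argmax())]  # most promising point
        before = robot.observe()                     # point cloud before
        robot.interact(target)                       # e.g., a push or pull
        after = robot.observe()                      # point cloud after
        models.append(articulation.infer(before, after))
    return models
```
In the paper, affordance prediction and articulation inference are tightly coupled so that each improves the other; the placeholder classes above elide that feedback.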
Related papers
- Learning Manipulation by Predicting Interaction [85.57297574510507]
We propose a general pre-training pipeline that learns Manipulation by Predicting the Interaction (MPI).
Experimental results show that MPI improves over the previous state of the art by 10% to 64% on real-world robot platforms.
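As a rough illustration of what "predicting the interaction" can mean as a pre-training signal, the sketch below trains a model to predict the representation of a held-out interaction frame from the frames around it. The architecture and loss are assumptions for illustration, not MPI's actual design.
```python
import torch
import torch.nn as nn

# Hypothetical pre-training objective: predict the representation of an
# interaction frame from the frames surrounding it. Architecture and loss
# are assumptions, not MPI's actual design.

class InteractionPredictor(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.encode = nn.Sequential(nn.Flatten(), nn.LazyLinear(dim), nn.ReLU())
        self.predict = nn.Linear(2 * dim, dim)

    def forward(self, frame_a, frame_b):
        za, zb = self.encode(frame_a), self.encode(frame_b)
        return self.predict(torch.cat([za, zb], dim=-1))

model = InteractionPredictor()
f0 = torch.randn(4, 3, 64, 64)     # frames before the interaction
f1 = torch.randn(4, 3, 64, 64)     # frames after the interaction
f_mid = torch.randn(4, 3, 64, 64)  # held-out interaction frames
pred = model(f0, f1)
with torch.no_grad():
    target = model.encode(f_mid)   # target: encoding of the interaction frame
loss = nn.functional.mse_loss(pred, target)
loss.backward()
```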
arXiv Detail & Related papers (2024-06-01T13:28:31Z)
- RPMArt: Towards Robust Perception and Manipulation for Articulated Objects [56.73978941406907]
Robust perception and manipulation of articulated objects are essential for real-world robotic applications.
We propose a framework towards Robust Perception and Manipulation for Articulated Objects (RPMArt).
RPMArt learns to estimate the articulation parameters and manipulate the articulated part from noisy point clouds.
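For intuition about articulation parameter estimation, the sketch below recovers a revolute joint axis from corresponding points observed before and after a part moves, using a classical least-squares rigid fit. This baseline is only illustrative; RPMArt uses a learned, noise-robust model.
```python
import numpy as np

# Illustrative sketch (not RPMArt's method): recover a revolute joint axis
# from corresponding points on the moving part before/after an interaction,
# via a least-squares rigid fit (Kabsch) followed by axis extraction.

def fit_rigid(src: np.ndarray, dst: np.ndarray):
    """Least-squares rotation R and translation t with dst ~ src @ R.T + t."""
    sc, dc = src.mean(0), dst.mean(0)
    H = (src - sc).T @ (dst - dc)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, dc - R @ sc

def rotation_axis(R: np.ndarray) -> np.ndarray:
    """Rotation axis of R: the eigenvector with eigenvalue 1."""
    w, v = np.linalg.eig(R)
    axis = np.real(v[:, np.argmin(np.abs(w - 1.0))])
    return axis / np.linalg.norm(axis)

# Toy check: rotate points 30 degrees about z and recover the axis.
theta = np.deg2rad(30)
Rz = np.array([[np.cos(theta), -np.sin(theta), 0],
               [np.sin(theta),  np.cos(theta), 0],
               [0, 0, 1.0]])
pts = np.random.rand(100, 3)
R, t = fit_rigid(pts, pts @ Rz.T)
print(rotation_axis(R))   # ~ [0, 0, +/-1]
```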
arXiv Detail & Related papers (2024-03-24T05:55:39Z)
- ROAM: Robust and Object-Aware Motion Generation Using Neural Pose Descriptors [73.26004792375556]
This paper shows that robustness and generalisation to novel scene objects in 3D object-aware character synthesis can be achieved by training a motion model with as few as one reference object.
We leverage an implicit feature representation trained on object-only datasets, which encodes an SE(3)-equivariant descriptor field around the object.
We demonstrate substantial improvements in the quality of 3D virtual character motion and interaction, and in robustness to scenarios with unseen objects.
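The key property of an SE(3)-equivariant descriptor field is that descriptors transform consistently with the object, so a motion model conditioned on them generalizes across object poses. The toy below makes that property concrete with hand-crafted distance features; ROAM's field is learned, not hand-crafted.
```python
import numpy as np

# Toy stand-in for an SE(3)-equivariant descriptor field: the descriptor of
# a query point is its sorted distances to the k nearest object points.
# Distances are preserved under rigid transforms, so moving the object and
# the query together leaves the descriptor unchanged. (Illustrative toy,
# not ROAM's learned field.)

def descriptor(query: np.ndarray, obj_points: np.ndarray, k: int = 8):
    d = np.linalg.norm(obj_points - query, axis=1)
    return np.sort(d)[:k]

rng = np.random.default_rng(0)
obj = rng.normal(size=(200, 3))
q = np.array([0.5, 0.0, 0.2])

# A rigid transform applied to both the object and the query ...
theta = np.pi / 5
R = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0, 0, 1.0]])
t = np.array([1.0, -2.0, 0.3])

d1 = descriptor(q, obj)
d2 = descriptor(R @ q + t, obj @ R.T + t)
print(np.allclose(d1, d2))   # True: the descriptor is unchanged
```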
arXiv Detail & Related papers (2023-08-24T17:59:51Z)
- Synthesizing Diverse Human Motions in 3D Indoor Scenes [16.948649870341782]
We present a novel method for populating 3D indoor scenes with virtual humans that can navigate in the environment and interact with objects in a realistic manner.
Existing approaches rely on training sequences that pair captured human motions with the 3D scenes they interact with.
In contrast, we propose a reinforcement learning-based approach that enables virtual humans to navigate 3D scenes and interact with objects realistically and autonomously.
arXiv Detail & Related papers (2023-05-21T09:22:24Z)
- Affordances from Human Videos as a Versatile Representation for Robotics [31.248842798600606]
We train a visual affordance model that estimates where and how in the scene a human is likely to interact.
The structure of these behavioral affordances directly enables the robot to perform many complex tasks.
We show the efficacy of our approach, which we call VRB, across 4 real-world environments, over 10 different tasks, and on 2 robotic platforms operating in the wild.
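A visual affordance model of this kind typically answers "where" with a contact heatmap and "how" with a post-contact motion estimate. The sketch below shows one plausible two-headed layout; the layers and heads are assumptions, not VRB's published architecture.
```python
import torch
import torch.nn as nn

# Hedged sketch of a visual affordance model: from an image, predict
# (i) a heatmap of likely contact points ("where") and (ii) a post-contact
# motion direction ("how"). Sizes and heads are illustrative assumptions.

class AffordanceModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Conv2d(3, 16, 3, padding=1)
        self.where_head = nn.Conv2d(16, 1, 1)     # contact heatmap
        self.how_head = nn.Sequential(            # 3D motion direction
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 3))

    def forward(self, image):
        feats = torch.relu(self.backbone(image))
        heatmap = self.where_head(feats).squeeze(1)
        direction = nn.functional.normalize(self.how_head(feats), dim=-1)
        return heatmap, direction

model = AffordanceModel()
img = torch.randn(1, 3, 128, 128)
heatmap, direction = model(img)
# Pixel with the highest predicted contact likelihood:
iy, ix = divmod(int(heatmap.flatten(1).argmax()), heatmap.shape[-1])
print((iy, ix), direction)
```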
arXiv Detail & Related papers (2023-04-17T17:59:34Z)
- Full-Body Articulated Human-Object Interaction [61.01135739641217]
CHAIRS is a large-scale motion-captured dataset of full-body articulated human-object interaction (f-AHOI), consisting of 16.2 hours of versatile interactions.
CHAIRS provides 3D meshes of both humans and articulated objects during the entire interactive process.
By learning the geometrical relationships in HOI, we devise the first model that leverages human pose estimation.
arXiv Detail & Related papers (2022-12-20T19:50:54Z)
- INVIGORATE: Interactive Visual Grounding and Grasping in Clutter [56.00554240240515]
INVIGORATE is a robot system that interacts with humans through natural language and grasps a specified object in clutter.
We train separate neural networks for object detection, visual grounding, question generation, and OBR detection and grasping.
We build a partially observable Markov decision process (POMDP) that integrates the learned neural network modules.
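The heart of such a POMDP is a belief over which object the human means, updated from noisy module outputs and question answers. The sketch below shows a minimal Bayes update of that belief; the observation model here is a hand-set constant rather than the system's learned modules.
```python
import numpy as np

# Minimal belief update in the spirit of a POMDP over neural-module
# outputs: maintain P(target = i) over candidate objects and update it
# from (noisy) yes/no answers. p_correct is an illustrative constant.

def update_belief(belief, asked_idx, answer_yes, p_correct=0.9):
    """Bayes update of P(target = i) after asking about object asked_idx."""
    likelihood = np.full(len(belief), 1 - p_correct)
    likelihood[asked_idx] = p_correct
    if not answer_yes:
        likelihood = 1 - likelihood
    posterior = belief * likelihood
    return posterior / posterior.sum()

belief = np.array([0.4, 0.35, 0.25])   # e.g., from a visual grounding module
belief = update_belief(belief, asked_idx=0, answer_yes=False)
belief = update_belief(belief, asked_idx=1, answer_yes=True)
print(belief)   # probability mass shifts toward object 1
```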
arXiv Detail & Related papers (2021-08-25T07:35:21Z)
- iGibson, a Simulation Environment for Interactive Tasks in Large Realistic Scenes [54.04456391489063]
iGibson is a novel simulation environment to develop robotic solutions for interactive tasks in large-scale realistic scenes.
Our environment contains fifteen fully interactive home-sized scenes populated with rigid and articulated objects.
We show that iGibson's features enable the generalization of navigation agents, and that the human-iGibson interface and integrated motion planners facilitate efficient imitation learning of simple human-demonstrated behaviors.
arXiv Detail & Related papers (2020-12-05T02:14:17Z)
- Learning Object-Based State Estimators for Household Robots [11.055133590909097]
We build object-based memory systems that operate on high-dimensional observations and hypotheses.
We demonstrate the system's effectiveness at maintaining memory of dynamically changing objects in both simulated environments and real images.
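A minimal sketch of the object-based memory idea, assuming 3D position estimates as the object state: new detections are associated to the nearest hypothesis and folded in with a moving average, or spawn a new hypothesis. The threshold and update rule are illustrative; the paper operates on high-dimensional learned observations.
```python
import numpy as np

# Sketch of an object-based memory: keep one hypothesis per object, fold
# each new (noisy) detection into the nearest hypothesis, and spawn a new
# one when nothing is close. Parameters are illustrative assumptions.

class ObjectMemory:
    def __init__(self, match_dist=0.5, alpha=0.3):
        self.objects = []          # list of 3D position estimates
        self.match_dist = match_dist
        self.alpha = alpha         # weight given to the new observation

    def update(self, detections):
        for det in detections:
            if self.objects:
                dists = [np.linalg.norm(det - o) for o in self.objects]
                i = int(np.argmin(dists))
                if dists[i] < self.match_dist:
                    # Exponential moving average keeps the estimate stable.
                    self.objects[i] = ((1 - self.alpha) * self.objects[i]
                                       + self.alpha * det)
                    continue
            self.objects.append(det.astype(float))

mem = ObjectMemory()
mem.update([np.array([1.0, 0.0, 0.0]), np.array([3.0, 1.0, 0.0])])
mem.update([np.array([1.1, 0.1, 0.0])])   # matches the first object
print(len(mem.objects), mem.objects[0])   # still 2 objects, first refined
```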
arXiv Detail & Related papers (2020-11-06T04:18:52Z)
- Hindsight for Foresight: Unsupervised Structured Dynamics Models from Physical Interaction [24.72947291987545]
A key challenge for an agent learning to interact with the world is reasoning about the physical properties of objects.
We propose a novel approach for modeling the dynamics of a robot's interactions directly from unlabeled 3D point clouds and images.
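One plausible shape for a self-supervised dynamics model over point clouds is sketched below: encode the scan, condition on the action, and predict the latent of the next scan, using the later observation as a hindsight target. The architecture and the 7-DoF action are assumptions for illustration, not the paper's model.
```python
import torch
import torch.nn as nn

# Hedged sketch of a learned forward model over point-cloud features:
# encode the scene, condition on the robot action, and predict the next
# latent state, supervised by the encoding of the observed next scan.

class ForwardModel(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.encode = nn.Sequential(nn.Linear(3, dim), nn.ReLU())
        self.dynamics = nn.Linear(dim + 7, dim)   # 7-DoF action assumed

    def forward(self, points, action):
        z = self.encode(points).mean(dim=1)       # order-invariant pooling
        return self.dynamics(torch.cat([z, action], dim=-1))

model = ForwardModel()
pts_t = torch.randn(2, 512, 3)       # point clouds before the push
pts_t1 = torch.randn(2, 512, 3)      # observed clouds after the push
act = torch.randn(2, 7)
z_pred = model(pts_t, act)
z_next = model.encode(pts_t1).mean(dim=1).detach()  # hindsight target
loss = nn.functional.mse_loss(z_pred, z_next)
loss.backward()
```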
arXiv Detail & Related papers (2020-08-02T11:04:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.