Learning Continuous Environment Fields via Implicit Functions
- URL: http://arxiv.org/abs/2111.13997v1
- Date: Sat, 27 Nov 2021 22:36:58 GMT
- Title: Learning Continuous Environment Fields via Implicit Functions
- Authors: Xueting Li, Shalini De Mello, Xiaolong Wang, Ming-Hsuan Yang, Jan
Kautz, Sifei Liu
- Abstract summary: We propose a novel scene representation that encodes reaching distance -- the distance from any position in the scene to a goal along a feasible trajectory.
We demonstrate that this environment field representation can directly guide the dynamic behaviors of agents in 2D mazes or 3D indoor scenes.
- Score: 144.4913852552954
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a novel scene representation that encodes reaching distance -- the
distance from any position in the scene to a goal along a feasible
trajectory. We demonstrate that this environment field representation can
directly guide the dynamic behaviors of agents in 2D mazes or 3D indoor scenes.
Our environment field is a continuous representation and learned via a neural
implicit function using discretely sampled training data. We showcase its
application for agent navigation in 2D mazes, and human trajectory prediction
in 3D indoor environments. To produce physically plausible and natural
trajectories for humans, we additionally learn a generative model that predicts
regions where humans commonly appear, and enforce the environment field to be
defined within such regions. Extensive experiments demonstrate that the
proposed method can generate both feasible and plausible trajectories
efficiently and accurately.
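As a rough illustration of the idea in the abstract, the sketch below parameterizes a reaching-distance field with a small MLP and navigates by greedily stepping against the field's gradient. The layer sizes, the Softplus output, and the gradient-following loop are illustrative assumptions, not the paper's exact architecture, and the field is assumed to have been fit to discretely sampled reaching distances beforehand:

```python
import torch
import torch.nn as nn

class EnvironmentField(nn.Module):
    """Implicit function f(position, goal) -> reaching distance.

    A minimal stand-in for the paper's neural implicit function; the
    hidden sizes and absence of a coordinate encoding are assumptions.
    """
    def __init__(self, dim=2, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Softplus(),  # reaching distance >= 0
        )

    def forward(self, pos, goal):
        return self.net(torch.cat([pos, goal], dim=-1)).squeeze(-1)

def navigate(field, start, goal, step=0.05, iters=200, tol=0.1):
    """Greedy navigation: repeatedly move against the gradient of the
    (pre-trained) reaching-distance field until close to the goal."""
    pos = start.clone().requires_grad_(True)
    path = [pos.detach().clone()]
    for _ in range(iters):
        dist = field(pos, goal)
        if dist.item() < tol:
            break
        (grad,) = torch.autograd.grad(dist, pos)
        with torch.no_grad():
            pos -= step * grad / (grad.norm() + 1e-8)
        path.append(pos.detach().clone())
    return torch.stack(path)
```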
Related papers
- Volumetric Environment Representation for Vision-Language Navigation [66.04379819772764]
Vision-language navigation (VLN) requires an agent to navigate through a 3D environment based on visual observations and natural language instructions.
We introduce a Volumetric Environment Representation (VER), which voxelizes the physical world into structured 3D cells.
VER predicts 3D occupancy, 3D room layout, and 3D bounding boxes jointly.
arXiv Detail & Related papers (2024-03-21T06:14:46Z)
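As a rough illustration of the voxelization step described in the VER abstract above, the following sketch buckets a point cloud into structured 3D cells; the grid resolution and axis-aligned bounds are illustrative assumptions:

```python
import numpy as np

def voxelize(points, bounds_min, bounds_max, resolution=32):
    """Bucket an (N, 3) point cloud into a dense boolean occupancy grid --
    the kind of structured 3D cells the VER abstract describes.
    Resolution and bounds are assumptions, not VER's actual settings."""
    pts = np.asarray(points, dtype=np.float32)
    bmin = np.asarray(bounds_min, dtype=np.float32)
    bmax = np.asarray(bounds_max, dtype=np.float32)
    # Normalize into [0, 1) per axis, then scale to integer cell indices.
    idx = ((pts - bmin) / (bmax - bmin) * resolution).astype(np.int64)
    idx = np.clip(idx, 0, resolution - 1)
    grid = np.zeros((resolution,) * 3, dtype=bool)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    return grid
```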
- CARFF: Conditional Auto-encoded Radiance Field for 3D Scene Forecasting [15.392692128626809]
We propose CARFF, a method for predicting future 3D scenes given past observations.
We employ a two-stage training of Pose-Conditional-VAE and NeRF to learn 3D representations.
We demonstrate the utility of our method in scenarios using the CARLA driving simulator.
arXiv Detail & Related papers (2024-01-31T18:56:09Z)
- WildScenes: A Benchmark for 2D and 3D Semantic Segmentation in Large-scale Natural Environments [34.24004079703609]
We introduce WildScenes, a bi-modal benchmark dataset consisting of multiple large-scale traversals in natural environments.
The data is trajectory-centric with accurate localization and globally aligned point clouds.
We introduce benchmarks on 2D and 3D semantic segmentation and evaluate a variety of recent deep-learning techniques.
arXiv Detail & Related papers (2023-12-23T22:27:40Z)
- Visual Affordance Prediction for Guiding Robot Exploration [56.17795036091848]
We develop an approach for learning visual affordances for guiding robot exploration.
We use a Transformer-based model to learn a conditional distribution in the latent embedding space of a VQ-VAE.
We show how the trained affordance model can be used for guiding exploration by acting as a goal-sampling distribution, during visual goal-conditioned policy learning in robotic manipulation.
arXiv Detail & Related papers (2023-05-28T17:53:09Z)
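The affordance-prediction abstract above describes a Transformer that models a conditional distribution over VQ-VAE latents and is sampled as a goal distribution. The sketch below shows one way such an autoregressive prior over codebook indices could look; the vocabulary size, sequence length, encoder depth, and BOS-token scheme are all illustrative assumptions:

```python
import torch
import torch.nn as nn

class AffordancePrior(nn.Module):
    """Autoregressive prior over VQ-VAE codebook indices.

    A minimal stand-in for the Transformer-based conditional model in
    the abstract; all hyperparameters here are assumptions.
    """
    def __init__(self, codebook_size=512, seq_len=16, dim=128):
        super().__init__()
        self.embed = nn.Embedding(codebook_size + 1, dim)  # +1: BOS token
        self.pos = nn.Parameter(torch.zeros(seq_len + 1, dim))
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(dim, codebook_size)
        self.seq_len, self.bos = seq_len, codebook_size

    @torch.no_grad()
    def sample_goal_codes(self, batch=1):
        """Sample latent codes; decoding them with the VQ-VAE decoder
        would yield candidate goals for exploration."""
        tokens = torch.full((batch, 1), self.bos, dtype=torch.long)
        for _ in range(self.seq_len):
            n = tokens.size(1)
            x = self.embed(tokens) + self.pos[:n]
            # Causal mask so each position attends only to its past.
            mask = torch.triu(torch.full((n, n), float("-inf")), diagonal=1)
            h = self.encoder(x, mask=mask)
            nxt = torch.multinomial(self.head(h[:, -1]).softmax(-1), 1)
            tokens = torch.cat([tokens, nxt], dim=1)
        return tokens[:, 1:]  # drop BOS
```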
- Synthesizing Diverse Human Motions in 3D Indoor Scenes [16.948649870341782]
We present a novel method for populating 3D indoor scenes with virtual humans that can navigate in the environment and interact with objects in a realistic manner.
Existing approaches rely on training sequences that contain captured human motions and the 3D scenes they interact with.
We propose a reinforcement learning-based approach that enables virtual humans to navigate in 3D scenes and interact with objects realistically and autonomously.
arXiv Detail & Related papers (2023-05-21T09:22:24Z)
- Trace and Pace: Controllable Pedestrian Animation via Guided Trajectory Diffusion [83.88829943619656]
We introduce a method for generating realistic pedestrian trajectories and full-body animations that can be controlled to meet user-defined goals.
Our guided diffusion model allows users to constrain trajectories through target waypoints, speed, and specified social groups.
We propose utilizing the value function learned during RL training of the animation controller to guide diffusion to produce trajectories better suited for particular scenarios.
arXiv Detail & Related papers (2023-04-04T15:46:42Z)
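The Trace and Pace abstract above describes steering a trajectory diffusion model with a value function learned during RL training. A minimal sketch of one such guidance step, in the style of classifier guidance, is below; the denoiser/value-function interfaces and the additive update are assumptions, not the paper's exact formulation:

```python
import torch

def value_guided_step(denoiser, value_fn, x_t, t, guide_scale=1.0):
    """One reverse-diffusion step steered by a learned value function.

    After the denoiser proposes a cleaner trajectory, nudge it along the
    gradient of the value so sampled trajectories score higher. Both
    `denoiser` and `value_fn` are placeholder differentiable modules.
    """
    x_t = x_t.detach().requires_grad_(True)
    mean = denoiser(x_t, t)            # predicted mean of x_{t-1}
    value = value_fn(mean).sum()       # higher = better-suited trajectory
    (grad,) = torch.autograd.grad(value, x_t)
    return (mean + guide_scale * grad).detach()
```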
- Neural Poisson: Indicator Functions for Neural Fields [25.41908065938424]
Implicit neural fields that generate signed distance field (SDF) representations of 3D shapes have shown remarkable progress.
We introduce a new paradigm for neural field representations of 3D scenes.
We show that our approach demonstrates state-of-the-art reconstruction performance on both synthetic and real scanned 3D scene data.
arXiv Detail & Related papers (2022-11-25T17:28:22Z)
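To make the contrast in the Neural Poisson abstract above concrete: an indicator field classifies points as inside/outside rather than regressing a signed metric distance. The sketch below is an illustrative assumption about what such a field could look like, not the paper's architecture:

```python
import torch
import torch.nn as nn

class IndicatorField(nn.Module):
    """MLP mapping a 3D point to an indicator value in [0, 1]
    (inside vs. outside a shape), in contrast to an SDF, which
    regresses a signed distance. Layer sizes are assumptions."""
    def __init__(self, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),
        )

    def forward(self, xyz):
        return self.net(xyz).squeeze(-1)

# The surface is the 0.5 level set of the indicator, extractable e.g.
# by running marching cubes over a dense grid of queried points.
```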
- Pose2Room: Understanding 3D Scenes from Human Activities [35.702234343672565]
With wearable IMU sensors, one can estimate human poses without requiring visual input.
We show that P2R-Net can effectively learn multi-modal distributions of likely objects for human motions.
arXiv Detail & Related papers (2021-12-01T20:54:36Z)
- Environment Predictive Coding for Embodied Agents [92.31905063609082]
We introduce environment predictive coding, a self-supervised approach to learn environment-level representations for embodied agents.
Our experiments on the photorealistic 3D environments of Gibson and Matterport3D show that our method outperforms the state-of-the-art on challenging tasks with only a limited budget of experience.
arXiv Detail & Related papers (2021-02-03T23:43:16Z)
- Long-term Human Motion Prediction with Scene Context [60.096118270451974]
We propose a novel three-stage framework for predicting human motion.
Our method first samples multiple human motion goals, then plans 3D human paths towards each goal, and finally predicts 3D human pose sequences following each path.
arXiv Detail & Related papers (2020-07-07T17:59:53Z)
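The last abstract above lays out a three-stage pipeline (sample goals, plan paths, predict poses). A skeleton of that control flow is sketched below; all three callables are hypothetical placeholders for learned modules, and their interfaces are assumptions:

```python
def predict_motion(scene, history, goal_sampler, path_planner, pose_net, k=5):
    """Skeleton of the three-stage framework in the abstract:
    (1) sample multiple motion goals, (2) plan a 3D path toward each
    goal, (3) predict a 3D pose sequence following each path."""
    predictions = []
    for goal in goal_sampler(scene, history, k):    # stage 1
        path = path_planner(scene, history, goal)   # stage 2
        poses = pose_net(history, path)             # stage 3
        predictions.append(poses)
    return predictions
```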
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.