Related papers: Beyond the Frontier: Predicting Unseen Walls from Occupancy Grids by Learning from Floor Plans

Beyond the Frontier: Predicting Unseen Walls from Occupancy Grids by Learning from Floor Plans

URL: http://arxiv.org/abs/2406.09160v1
Date: Thu, 13 Jun 2024 14:22:59 GMT
Title: Beyond the Frontier: Predicting Unseen Walls from Occupancy Grids by Learning from Floor Plans
Authors: Ludvig Ericson, Patric Jensfelt,
Abstract summary: We tackle the challenge of predicting the unseen walls of a partially observed environment as a set of 2D line segments conditioned on occupancy grids integrated along the trajectory of a 360deg LIDAR sensor. A dataset of such occupancy grids and their corresponding target wall segments is collected by navigating a virtual robot between a set of randomly sampled waypoints in a collection of office-scale floor plans from a university campus. The line segment prediction task is formulated as an autoregressive sequence prediction task, and an attention-based deep network is trained on the dataset.
Score: 3.432284729311483
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In this paper, we tackle the challenge of predicting the unseen walls of a partially observed environment as a set of 2D line segments, conditioned on occupancy grids integrated along the trajectory of a 360{\deg} LIDAR sensor. A dataset of such occupancy grids and their corresponding target wall segments is collected by navigating a virtual robot between a set of randomly sampled waypoints in a collection of office-scale floor plans from a university campus. The line segment prediction task is formulated as an autoregressive sequence prediction task, and an attention-based deep network is trained on the dataset. The sequence-based autoregressive formulation is evaluated through predicted information gain, as in frontier-based autonomous exploration, demonstrating significant improvements over both non-predictive estimation and convolution-based image prediction found in the literature. Ablations on key components are evaluated, as well as sensor range and the occupancy grid's metric area. Finally, model generality is validated by predicting walls in a novel floor plan reconstructed on-the-fly in a real-world office environment.

Related papers

ForecastOcc: Vision-based Semantic Occupancy Forecasting [16.699381591572163]
We present ForecastOcc, the first framework for vision-based semantic occupancy forecasting that predicts future occupancy states and semantic categories.<n>Our framework yields semantic occupancy forecasts for multiple horizons directly from past camera images, without relying on externally estimated maps.
arXiv Detail & Related papers (2026-02-08T15:16:06Z)
Into the Unknown: Towards using Generative Models for Sampling Priors of Environment Uncertainty for Planning in Configuration Spaces [28.37021202108478]
Priors are vital for planning under partial observability, yet difficult to obtain in practice.<n>We present a probabilistic-based pipeline that leverages large-scale pretrained generative models to produce priors in a zero-shot manner.<n>We establish a Matterport3D benchmark of rooms partially visible through doorways, where a robot must navigate to an unobserved target object.
arXiv Detail & Related papers (2025-10-13T05:08:48Z)
TGP: Two-modal occupancy prediction with 3D Gaussian and sparse points for 3D Environment Awareness [13.68631587423815]
3D semantic occupancy has rapidly become a research focus in the fields of robotics and autonomous driving environment perception. Existing occupancy prediction tasks are modeled using voxel or point cloud-based approaches. We propose a dual-modal prediction method based on 3D Gaussian sets and sparse points, which balances both spatial location and volumetric structural information.
arXiv Detail & Related papers (2025-03-13T01:35:04Z)
Unified Human Localization and Trajectory Prediction with Monocular Vision [64.19384064365431]
MonoTransmotion is a Transformer-based framework that uses only a monocular camera to jointly solve localization and prediction tasks. We show that by jointly training both tasks with our unified framework, our method is more robust in real-world scenarios made of noisy inputs.
arXiv Detail & Related papers (2025-03-05T14:18:39Z)
Advancing ALS Applications with Large-Scale Pre-training: Dataset Development and Downstream Assessment [6.606615641354963]
The pre-training and fine-tuning paradigm has revolutionized satellite remote sensing applications. We construct a large-scale ALS point cloud dataset and evaluate its impact on downstream applications. Our results show that the pre-trained models significantly outperform their scratch counterparts across all downstream tasks.
arXiv Detail & Related papers (2025-01-09T09:21:09Z)
OPUS: Occupancy Prediction Using a Sparse Set [64.60854562502523]
We present a framework to simultaneously predict occupied locations and classes using a set of learnable queries. OPUS incorporates a suite of non-trivial strategies to enhance model performance. Our lightest model achieves superior RayIoU on the Occ3D-nuScenes dataset at near 2x FPS, while our heaviest model surpasses previous best results by 6.1 RayIoU.
arXiv Detail & Related papers (2024-09-14T07:44:22Z)
Towards Effective Next POI Prediction: Spatial and Semantic Augmentation with Remote Sensing Data [10.968721742000653]
We propose an effective deep-learning method within a two-step prediction framework. Our method first incorporates remote sensing data, capturing pivotal environmental context. We construct the QR-P graph for the user's historical trajectories to encapsulate historical travel knowledge.
arXiv Detail & Related papers (2024-03-22T04:22:36Z)
OccNeRF: Advancing 3D Occupancy Prediction in LiDAR-Free Environments [77.0399450848749]
We propose an OccNeRF method for training occupancy networks without 3D supervision. We parameterize the reconstructed occupancy fields and reorganize the sampling strategy to align with the cameras' infinite perceptive range. For semantic occupancy prediction, we design several strategies to polish the prompts and filter the outputs of a pretrained open-vocabulary 2D segmentation model.
arXiv Detail & Related papers (2023-12-14T18:58:52Z)
JRDB-Traj: A Dataset and Benchmark for Trajectory Forecasting in Crowds [79.00975648564483]
Trajectory forecasting models, employed in fields such as robotics, autonomous vehicles, and navigation, face challenges in real-world scenarios. This dataset provides comprehensive data, including the locations of all agents, scene images, and point clouds, all from the robot's perspective. The objective is to predict the future positions of agents relative to the robot using raw sensory input data.
arXiv Detail & Related papers (2023-11-05T18:59:31Z)
1st Place Solution for PSG competition with ECCV'22 SenseHuman Workshop [1.5362025549031049]
Panoptic Scene Graph (PSG) generation aims to generate scene graph representations based on panoptic segmentation instead of rigid bounding boxes. We propose GRNet, a Global Relation Network in two-stage paradigm, where the pre-extracted local object features and their corresponding masks are fed into a transformer with class embeddings. We conduct comprehensive experiments on OpenPSG dataset and achieve the state-of-art performance on the leadboard.
arXiv Detail & Related papers (2023-02-06T09:47:46Z)
Patch-level Gaze Distribution Prediction for Gaze Following [49.93340533068501]
We introduce the patch distribution prediction ( PDP) method for gaze following training. We show that our model regularizes the MSE loss by predicting better heatmap distributions on images with larger annotation variances. Experiments show that our model bridging the gap between the target prediction and in/out prediction subtasks, showing a significant improvement on both subtasks on public gaze following datasets.
arXiv Detail & Related papers (2022-11-20T19:25:15Z)
A Multi-stage Framework with Mean Subspace Computation and Recursive Feedback for Online Unsupervised Domain Adaptation [9.109788577327503]
We propose a novel framework to solve real-world situations when the target data are unlabeled and arriving online sequentially in batches. To project the data from the source and the target domains to a common subspace and manipulate the projected data in real-time, our proposed framework institutes a novel method. Experiments on six datasets were conducted to investigate in depth the effect and contribution of each stage in our proposed framework.
arXiv Detail & Related papers (2022-06-24T03:50:34Z)
FloorGenT: Generative Vector Graphic Model of Floor Plans for Robotics [5.71097144710995]
We show that by modelling floor plans as sequences of line segments seen from a particular point of view, recent advances in autoregressive sequence modelling can be leveraged to model and predict floor plans.
arXiv Detail & Related papers (2022-03-07T13:42:48Z)
TRiPOD: Human Trajectory and Pose Dynamics Forecasting in the Wild [77.59069361196404]
TRiPOD is a novel method for predicting body dynamics based on graph attentional networks. To incorporate a real-world challenge, we learn an indicator representing whether an estimated body joint is visible/invisible at each frame. Our evaluation shows that TRiPOD outperforms all prior work and state-of-the-art specifically designed for each of the trajectory and pose forecasting tasks.
arXiv Detail & Related papers (2021-04-08T20:01:00Z)
Risk-Averse MPC via Visual-Inertial Input and Recurrent Networks for Online Collision Avoidance [95.86944752753564]
We propose an online path planning architecture that extends the model predictive control (MPC) formulation to consider future location uncertainties. Our algorithm combines an object detection pipeline with a recurrent neural network (RNN) which infers the covariance of state estimates. The robustness of our methods is validated on complex quadruped robot dynamics and can be generally applied to most robotic platforms.
arXiv Detail & Related papers (2020-07-28T07:34:30Z)
STINet: Spatio-Temporal-Interactive Network for Pedestrian Detection and Trajectory Prediction [24.855059537779294]
We present a novel end-to-end two-stage network: Spatio--Interactive Network (STINet) In addition to 3D geometry of pedestrians, we model temporal information for each of the pedestrians. Our method predicts both current and past locations in the first stage, so that each pedestrian can be linked across frames.
arXiv Detail & Related papers (2020-05-08T18:43:01Z)

This list is automatically generated from the titles and abstracts of the papers in this site.