Related papers: OMEGA: Efficient Occlusion-Aware Navigation for Air-Ground Robot in Dynamic Environments via State Space Model

OMEGA: Efficient Occlusion-Aware Navigation for Air-Ground Robot in Dynamic Environments via State Space Model

URL: http://arxiv.org/abs/2408.10618v2
Date: Thu, 05 Dec 2024 06:54:29 GMT
Title: OMEGA: Efficient Occlusion-Aware Navigation for Air-Ground Robot in Dynamic Environments via State Space Model
Authors: Junming Wang, Xiuxian Guan, Zekai Sun, Tianxiang Shen, Dong Huang, Fangming Liu, Heming Cui,
Abstract summary: Air-ground robots (AGRs) are widely used in surveillance and disaster response.<n>Current AGR navigation systems perform well in static environments.<n>However, these systems face challenges in dynamic, severe occlusion scenes.<n>We propose OccMamba with an Efficient AGR-Planner to address these problems.
Score: 12.096387853748938
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Air-ground robots (AGRs) are widely used in surveillance and disaster response due to their exceptional mobility and versatility (i.e., flying and driving). Current AGR navigation systems perform well in static occlusion-prone environments (e.g., indoors) by using 3D semantic occupancy networks to predict occlusions for complete local mapping and then computing Euclidean Signed Distance Field (ESDF) for path planning. However, these systems face challenges in dynamic, severe occlusion scenes (e.g., crowds) due to limitations in perception networks' low prediction accuracy and path planners' high computation overhead. In this paper, we propose OMEGA, which contains OccMamba with an Efficient AGR-Planner to address the above-mentioned problems. OccMamba adopts a novel architecture that separates semantic and occupancy prediction into independent branches, incorporating two mamba blocks within these branches. These blocks efficiently extract semantic and geometric features in 3D environments with linear complexity, ensuring that the network can learn long-distance dependencies to improve prediction accuracy. Semantic and geometric features are combined within the Bird's Eye View (BEV) space to minimise computational overhead during feature fusion. The resulting semantic occupancy map is then seamlessly integrated into the local map, providing occlusion awareness of the dynamic environment. Our AGR-Planner utilizes this local map and employs kinodynamic A* search and gradient-based trajectory optimization to guarantee planning is ESDF-free and energy-efficient. Extensive experiments demonstrate that OccMamba outperforms the state-of-the-art 3D semantic occupancy network with 25.0% mIoU. End-to-end navigation experiments in dynamic scenes verify OMEGA's efficiency, achieving a 96% average planning success rate. Code and video are available at https://jmwang0117.github.io/OMEGA/.

Related papers

Unified Linear Parametric Map Modeling and Perception-aware Trajectory Planning for Mobile Robotics [1.7495208770207367]
We introduce a lightweight linear parametric map by first mapping data to a high-dimensional space, followed by a sparse random projection for dimensionality reduction.<n>For UAVs, our method grid and Euclidean Signed Distance Field (ESDF) maps.<n>For UGVs, the model characterizes terrain and provides closed-form gradients, enabling online planning to circumvent large holes.
arXiv Detail & Related papers (2025-07-12T16:39:19Z)
NOVA: Navigation via Object-Centric Visual Autonomy for High-Speed Target Tracking in Unstructured GPS-Denied Environments [56.35569661650558]
We introduce NOVA, a fully onboard, object-centric framework that enables robust target tracking and collision-aware navigation.<n>Rather than constructing a global map, NOVA formulates perception, estimation, and control entirely in the target's reference frame.<n>We validate NOVA across challenging real-world scenarios, including urban mazes, forest trails, and repeated transitions through buildings with intermittent GPS loss.
arXiv Detail & Related papers (2025-06-23T14:28:30Z)
Semantic Exploration and Dense Mapping of Complex Environments using Ground Robots Equipped with LiDAR and Panoramic Camera [7.330549613211134]
This paper presents a system for autonomous semantic exploration and dense semantic target mapping of a complex unknown environment using a ground robot equipped with a LiDAR-panoramic camera suite.<n>We first redefine the task as completing both geometric coverage and semantic viewpoint observation. We then manage semantic and geometric viewpoints separately and propose a novel Priority-driven Decoupled Local Sampler to generate local viewpoint sets.<n>In addition, we propose a Safe Aggressive Exploration State Machine, which allows aggressive exploration behavior while ensuring the robot's safety.
arXiv Detail & Related papers (2025-05-28T21:27:32Z)
EmbodiedOcc++: Boosting Embodied 3D Occupancy Prediction with Plane Regularization and Uncertainty Sampler [43.277357306520145]
This paper introduces EmbodiedOcc++, enhancing the original framework with two key innovations. A Geometry-guided Refinement Module (GRM) constrains Gaussian updates through plane regularization, along with a Semantic-aware Uncertainty Sampler (SUS) Experiments on the EmbodiedOcc-ScanNet benchmark demonstrate that EmbodiedOcc achieves state-of-the-art performance across different settings.
arXiv Detail & Related papers (2025-04-13T12:10:49Z)
Co-SemDepth: Fast Joint Semantic Segmentation and Depth Estimation on Aerial Images [0.9883261192383611]
In this paper, we leverage monocular cameras on aerial robots to predict depth and semantic maps in unstructured environments. We propose a joint deep-learning architecture that can perform the two tasks accurately and rapidly.
arXiv Detail & Related papers (2025-03-23T08:25:07Z)
Real-time Spatial-temporal Traversability Assessment via Feature-based Sparse Gaussian Process [14.428139979659395]
Terrain analysis is critical for the practical application of ground mobile robots in real-world tasks. We propose a novel spatial-temporal traversability assessment method, which aims to enable autonomous robots to navigate through complex terrains. We develop an autonomous navigation framework integrated with the traversability map and validate it with a differential driven vehicle in complex outdoor environments.
arXiv Detail & Related papers (2025-03-06T06:26:57Z)
ALOcc: Adaptive Lifting-based 3D Semantic Occupancy and Cost Volume-based Flow Prediction [89.89610257714006]
Existing methods prioritize higher accuracy to cater to the demands of these tasks. We introduce a series of targeted improvements for 3D semantic occupancy prediction and flow estimation. Our purelytemporalal architecture framework, named ALOcc, achieves an optimal tradeoff between speed and accuracy.
arXiv Detail & Related papers (2024-11-12T11:32:56Z)
NavTopo: Leveraging Topological Maps For Autonomous Navigation Of a Mobile Robot [1.0550841723235613]
We propose a full navigation pipeline based on topological map and two-level path planning. The pipeline localizes in the graph by matching neural network descriptors and 2D projections of the input point clouds. We test our approach in a large indoor photo-relaistic simulated environment and compare it to a metric map-based approach based on popular metric mapping method RTAB-MAP.
arXiv Detail & Related papers (2024-10-15T10:54:49Z)
OccRWKV: Rethinking Efficient 3D Semantic Occupancy Prediction with Linear Complexity [11.287721740276048]
3D semantic occupancy prediction networks have demonstrated remarkable capabilities in reconstructing the geometric and semantic structure of 3D scenes. We introduce OccRWKV, an efficient semantic occupancy network inspired by Receptance Weighted Key Value (RWKV) OccRWKV separates semantics, occupancy prediction, and feature fusion into distinct branches, each incorporating Sem-RWKV and Geo-RWKV blocks.
arXiv Detail & Related papers (2024-09-30T06:27:50Z)
OPUS: Occupancy Prediction Using a Sparse Set [64.60854562502523]
We present a framework to simultaneously predict occupied locations and classes using a set of learnable queries. OPUS incorporates a suite of non-trivial strategies to enhance model performance. Our lightest model achieves superior RayIoU on the Occ3D-nuScenes dataset at near 2x FPS, while our heaviest model surpasses previous best results by 6.1 RayIoU.
arXiv Detail & Related papers (2024-09-14T07:44:22Z)
Vision-aided UAV navigation and dynamic obstacle avoidance using gradient-based B-spline trajectory optimization [7.874708385247353]
This paper proposes a gradient-based B-spline trajectory optimization algorithm utilizing the robot's onboard vision. The proposed optimization first adopts the circle-based guide-point algorithm to approximate the costs and gradients for avoiding static obstacles. With the vision-detected moving objects, our receding-horizon distance field is simultaneously used to prevent dynamic collisions.
arXiv Detail & Related papers (2022-09-15T02:12:30Z)
Trajectory Prediction with Graph-based Dual-scale Context Fusion [43.51107329748957]
We present a graph-based trajectory prediction network named the Dual Scale Predictor. It encodes both the static and dynamical driving context in a hierarchical manner. Thanks to the proposed dual-scale context fusion network, our DSP is able to generate accurate and human-like multi-modal trajectories.
arXiv Detail & Related papers (2021-11-02T13:42:16Z)
Large-scale Autonomous Flight with Real-time Semantic SLAM under Dense Forest Canopy [48.51396198176273]
We propose an integrated system that can perform large-scale autonomous flights and real-time semantic mapping in challenging under-canopy environments. We detect and model tree trunks and ground planes from LiDAR data, which are associated across scans and used to constrain robot poses as well as tree trunk models. A drift-compensation mechanism is designed to minimize the odometry drift using semantic SLAM outputs in real time, while maintaining planner optimality and controller stability.
arXiv Detail & Related papers (2021-09-14T07:24:53Z)
Trajectory Design for UAV-Based Internet-of-Things Data Collection: A Deep Reinforcement Learning Approach [93.67588414950656]
In this paper, we investigate an unmanned aerial vehicle (UAV)-assisted Internet-of-Things (IoT) system in a 3D environment. We present a TD3-based trajectory design for completion time minimization (TD3-TDCTM) algorithm. Our simulation results show the superiority of the proposed TD3-TDCTM algorithm over three conventional non-learning based baseline methods.
arXiv Detail & Related papers (2021-07-23T03:33:29Z)
Occupancy Anticipation for Efficient Exploration and Navigation [97.17517060585875]
We propose occupancy anticipation, where the agent uses its egocentric RGB-D observations to infer the occupancy state beyond the visible regions. By exploiting context in both the egocentric views and top-down maps our model successfully anticipates a broader map of the environment. Our approach is the winning entry in the 2020 Habitat PointNav Challenge.
arXiv Detail & Related papers (2020-08-21T03:16:51Z)
Risk-Averse MPC via Visual-Inertial Input and Recurrent Networks for Online Collision Avoidance [95.86944752753564]
We propose an online path planning architecture that extends the model predictive control (MPC) formulation to consider future location uncertainties. Our algorithm combines an object detection pipeline with a recurrent neural network (RNN) which infers the covariance of state estimates. The robustness of our methods is validated on complex quadruped robot dynamics and can be generally applied to most robotic platforms.
arXiv Detail & Related papers (2020-07-28T07:34:30Z)

This list is automatically generated from the titles and abstracts of the papers in this site.