CARE: Enhancing Safety of Visual Navigation through Collision Avoidance via Repulsive Estimation
- URL: http://arxiv.org/abs/2506.03834v3
- Date: Thu, 07 Aug 2025 07:19:28 GMT
- Title: CARE: Enhancing Safety of Visual Navigation through Collision Avoidance via Repulsive Estimation
- Authors: Joonkyung Kim, Joonyeol Sim, Woojun Kim, Katia Sycara, Changjoo Nam
- Abstract summary: We propose CARE (Collision Avoidance via Repulsive Estimation) to improve the robustness of learning-based visual navigation methods. CARE can be integrated seamlessly into any RGB-based navigation model that generates local robot trajectories. We evaluate CARE by integrating it with state-of-the-art visual navigation models across diverse robot platforms.
- Score: 6.216878556851609
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose CARE (Collision Avoidance via Repulsive Estimation) to improve the robustness of learning-based visual navigation methods. Recently, visual navigation models, particularly foundation models, have demonstrated promising performance by generating viable trajectories using only RGB images. However, these policies can generalize poorly to environments containing out-of-distribution (OOD) scenes characterized by unseen objects or different camera setups (e.g., variations in field of view, camera pose, or focal length). Without fine-tuning, such models could produce trajectories that lead to collisions, necessitating substantial efforts in data collection and additional training. To address this limitation, we introduce CARE, an attachable module that enhances the safety of visual navigation without requiring additional range sensors or fine-tuning of pretrained models. CARE can be integrated seamlessly into any RGB-based navigation model that generates local robot trajectories. It dynamically adjusts trajectories produced by a pretrained model using repulsive force vectors computed from depth images estimated directly from RGB inputs. We evaluate CARE by integrating it with state-of-the-art visual navigation models across diverse robot platforms. Real-world experiments show that CARE significantly reduces collisions (up to 100%) without compromising navigation performance in goal-conditioned navigation, and further improves collision-free travel distance (up to 10.7x) in exploration tasks. Project page: https://airlab-sogang.github.io/CARE/
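The trajectory adjustment described above resembles a classical artificial potential field driven by monocular depth estimates. The abstract does not give the exact formulation, so the following NumPy sketch only illustrates the general mechanism; the per-column bearing model, the influence radius `d_max`, and the waypoint `gain` are assumptions rather than CARE's published parameters.

```python
import numpy as np

def repulsive_vector(depth, hfov_deg=90.0, d_max=2.0, eps=1e-6):
    """Aggregate a 2D repulsive force from an estimated metric depth map.

    Each image column is treated as a bearing; obstacles closer than the
    influence radius d_max push the robot away along that bearing.
    """
    h, w = depth.shape
    angles = np.deg2rad(np.linspace(-hfov_deg / 2, hfov_deg / 2, w))
    d = depth.min(axis=0)                      # closest obstacle per column
    mag = np.where(d < d_max, 1.0 / (d + eps) - 1.0 / d_max, 0.0)
    dirs = np.stack([np.cos(angles), -np.sin(angles)], axis=1)  # x fwd, y left
    return -(mag[:, None] * dirs).sum(axis=0)  # push away from obstacles

def adjust_trajectory(waypoints, depth, gain=0.05):
    """Shift each local waypoint of a pretrained policy by the scaled force."""
    fx, fy = repulsive_vector(depth)
    return [(x + gain * fx, y + gain * fy) for x, y in waypoints]
```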
Related papers
- NOVA: Navigation via Object-Centric Visual Autonomy for High-Speed Target Tracking in Unstructured GPS-Denied Environments [56.35569661650558]
We introduce NOVA, a fully onboard, object-centric framework that enables robust target tracking and collision-aware navigation. Rather than constructing a global map, NOVA formulates perception, estimation, and control entirely in the target's reference frame. We validate NOVA across challenging real-world scenarios, including urban mazes, forest trails, and repeated transitions through buildings with intermittent GPS loss.
arXiv Detail & Related papers (2025-06-23T14:28:30Z)
- RGBTrack: Fast, Robust Depth-Free 6D Pose Estimation and Tracking [24.866881488130407]
We introduce a robust framework, RGBTrack, for real-time 6D pose estimation and tracking. We devise a novel binary search strategy combined with a render-and-compare mechanism to efficiently infer depth. We show that RGBTrack's novel depth-free approach achieves competitive accuracy and real-time performance.
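The binary-search strategy exploits a pinhole-camera fact: the rendered silhouette of a known object model shrinks monotonically as its assumed depth grows, so depth can be bisected against the observed mask. A minimal sketch of that logic, with `render_area` as a hypothetical stand-in for RGBTrack's render-and-compare step:

```python
def infer_depth(render_area, observed_area, z_lo=0.1, z_hi=5.0, iters=30):
    """Bisect the object's depth until the rendered silhouette area
    matches the observed mask area.

    render_area(z): pixel area of the object model rendered at depth z;
    monotonically decreasing in z under a pinhole camera, which is what
    makes bisection valid.
    """
    for _ in range(iters):
        z_mid = 0.5 * (z_lo + z_hi)
        if render_area(z_mid) > observed_area:
            z_lo = z_mid   # rendered too large -> object is farther
        else:
            z_hi = z_mid   # rendered too small -> object is closer
    return 0.5 * (z_lo + z_hi)
```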
arXiv Detail & Related papers (2025-06-20T16:19:28Z)
- Human-Robot Navigation using Event-based Cameras and Reinforcement Learning [1.7614751781649955]
This work introduces a robot navigation controller that combines event cameras and other sensors with reinforcement learning to enable real-time human-centered navigation and obstacle avoidance. Unlike conventional image-based controllers, which operate at fixed rates and suffer from motion blur and latency, this approach leverages the asynchronous nature of event cameras to process visual information over flexible time intervals.
arXiv Detail & Related papers (2025-06-12T15:03:08Z)
- Improving Collision-Free Success Rate For Object Goal Visual Navigation Via Two-Stage Training With Collision Prediction [0.0]
Collision-free success is introduced to evaluate the ability of navigation models to find a collision-free path towards the target object. A two-stage training method with collision prediction is proposed to improve the collision-free success rate of existing navigation models.
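One common way to realize collision prediction as an auxiliary task is a second head on the policy network, supervised only in the second training stage. The PyTorch sketch below is a generic illustration; the backbone, head sizes, and loss weighting are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CollisionAwarePolicy(nn.Module):
    """Navigation policy with an auxiliary collision-prediction head."""
    def __init__(self, feat_dim=256, num_actions=6):
        super().__init__()
        self.backbone = nn.Sequential(nn.LazyLinear(feat_dim), nn.ReLU())
        self.policy_head = nn.Linear(feat_dim, num_actions)
        self.collision_head = nn.Linear(feat_dim, 1)   # logit of P(collision)

    def forward(self, obs):
        z = self.backbone(obs)
        return self.policy_head(z), self.collision_head(z)

def stage2_loss(action_logits, coll_logit, action, collided, w=0.5):
    """Stage 2: policy loss plus collision supervision (w is hypothetical)."""
    ce = F.cross_entropy(action_logits, action)
    bce = F.binary_cross_entropy_with_logits(coll_logit.squeeze(-1),
                                             collided.float())
    return ce + w * bce
```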
arXiv Detail & Related papers (2025-02-19T07:33:10Z)
- Monocular Obstacle Avoidance Based on Inverse PPO for Fixed-wing UAVs [29.207513994002202]
Fixed-wing Unmanned Aerial Vehicles (UAVs) are one of the most commonly used platforms for the Low-altitude Economy (LAE) and Urban Air Mobility (UAM). Classical obstacle avoidance systems, which rely on prior maps or sophisticated sensors, face limitations in unknown low-altitude environments and small UAV platforms. This paper proposes a lightweight deep reinforcement learning (DRL) based UAV collision avoidance system.
arXiv Detail & Related papers (2024-11-27T03:03:37Z)
- DECADE: Towards Designing Efficient-yet-Accurate Distance Estimation Modules for Collision Avoidance in Mobile Advanced Driver Assistance Systems [5.383130566626935]
We present a distance estimation model, DECADE, that processes each detector output instead of constructing pixel-wise depth/disparity maps.
We demonstrate that these modules can be attached to any detector to extend object detection with fast distance estimation.
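A per-detection distance module of this kind can be as small as an MLP over box geometry, which is what makes it cheap to attach to any detector. The sketch below is a generic stand-in, not DECADE's published configuration; the input features and layer sizes are assumptions.

```python
import torch
import torch.nn as nn

class DistanceHead(nn.Module):
    """Regress metric distance from one detector output.

    Assumed inputs per detection: normalized box center, size, and
    class confidence (cx, cy, w, h, conf).
    """
    def __init__(self, in_dim=5, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Softplus(),   # distances are positive
        )

    def forward(self, det):                # det: (N, 5) detection features
        return self.mlp(det).squeeze(-1)   # (N,) distances in meters
```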
arXiv Detail & Related papers (2024-10-25T06:40:42Z)
- Unsupervised Domain Adaptation for Self-Driving from Past Traversal Features [69.47588461101925]
We propose a method to adapt 3D object detectors to new driving environments.
Our approach enhances LiDAR-based detection models using spatially quantized historical features.
Experiments on real-world datasets demonstrate significant improvements.
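One plausible reading of "spatially quantized historical features" is a bird's-eye-view grid of point statistics accumulated over past traversals of the same route, fed to the detector as extra channels. The sketch below assumes that reading; the cell size, extent, and log compression are hypothetical choices.

```python
import numpy as np

def quantize_history(points, cell=0.5, extent=50.0):
    """Bin LiDAR points from past traversals into a BEV count grid.

    points: (N, 3) array in the ego frame; only x, y are used.
    Returns an (M, M) grid of log point counts.
    """
    n = int(2 * extent / cell)
    grid = np.zeros((n, n), dtype=np.float32)
    ij = ((points[:, :2] + extent) / cell).astype(int)
    keep = (ij >= 0).all(axis=1) & (ij < n).all(axis=1)
    np.add.at(grid, (ij[keep, 0], ij[keep, 1]), 1.0)   # accumulate counts
    return np.log1p(grid)                              # compress dynamic range
```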
arXiv Detail & Related papers (2023-09-21T15:00:31Z)
- CabiNet: Scaling Neural Collision Detection for Object Rearrangement with Procedural Scene Generation [54.68738348071891]
We first generate over 650K cluttered scenes - orders of magnitude more than prior work - in diverse everyday environments.
We render synthetic partial point clouds from this data and use it to train our CabiNet model architecture.
CabiNet is a collision model that accepts object and scene point clouds, captured from a single-view depth observation.
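A collision model over raw point clouds is typically a pair of permutation-invariant encoders feeding a classifier. The sketch below is a generic PointNet-style stand-in for CabiNet's model, with all layer sizes assumed.

```python
import torch
import torch.nn as nn

class PointEncoder(nn.Module):
    """PointNet-style encoder: shared per-point MLP, then max pooling."""
    def __init__(self, out_dim=128):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, 64), nn.ReLU(),
                                 nn.Linear(64, out_dim))

    def forward(self, pts):                      # pts: (B, N, 3)
        return self.mlp(pts).max(dim=1).values   # (B, out_dim)

class SceneCollisionNet(nn.Module):
    """Classify whether an object placement collides with the scene."""
    def __init__(self):
        super().__init__()
        self.obj_enc, self.scene_enc = PointEncoder(), PointEncoder()
        self.cls = nn.Sequential(nn.Linear(256, 64), nn.ReLU(),
                                 nn.Linear(64, 1))

    def forward(self, obj_pts, scene_pts):
        z = torch.cat([self.obj_enc(obj_pts),
                       self.scene_enc(scene_pts)], dim=-1)
        return self.cls(z).squeeze(-1)           # collision logit
```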
arXiv Detail & Related papers (2023-04-18T21:09:55Z)
- COPILOT: Human-Environment Collision Prediction and Localization from Egocentric Videos [62.34712951567793]
The ability to forecast human-environment collisions from egocentric observations is vital to enable collision avoidance in applications such as VR, AR, and wearable assistive robotics.
We introduce the challenging problem of predicting collisions in diverse environments from multi-view egocentric videos captured from body-mounted cameras.
We propose a transformer-based model called COPILOT to perform collision prediction and localization simultaneously.
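Predicting and localizing simultaneously suggests a shared encoder with two heads: a sequence-level score for whether a collision is imminent and a token-level map for where. The sketch below is schematic, assumes already tokenized video features, and is not COPILOT's published architecture.

```python
import torch
import torch.nn as nn

class CollisionForecaster(nn.Module):
    """Joint collision prediction and coarse localization from video tokens."""
    def __init__(self, dim=256, heads=4, layers=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, layers)
        self.collide = nn.Linear(dim, 1)    # sequence level: will we collide?
        self.localize = nn.Linear(dim, 1)   # token level: where?

    def forward(self, tokens):              # tokens: (B, T, dim)
        z = self.encoder(tokens)
        return self.collide(z.mean(dim=1)), self.localize(z).squeeze(-1)
```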
arXiv Detail & Related papers (2022-10-04T17:49:23Z)
- Simple and Effective Synthesis of Indoor 3D Scenes [78.95697556834536]
We study the problem of synthesizing immersive 3D indoor scenes from one or more images.
Our aim is to generate high-resolution images and videos from novel viewpoints.
We propose an image-to-image GAN that maps directly from reprojections of incomplete point clouds to full high-resolution RGB-D images.
arXiv Detail & Related papers (2022-04-06T17:54:46Z)
- Space Non-cooperative Object Active Tracking with Deep Reinforcement Learning [1.212848031108815]
We propose an end-to-end active visual tracking method based on the DQN algorithm, named DRLAVT.
It can guide a chaser spacecraft to approach an arbitrary non-cooperative space target relying solely on color or RGB-D images.
It significantly outperforms a position-based visual servoing baseline that adopts the state-of-the-art 2D monocular tracker SiamRPN.
arXiv Detail & Related papers (2021-12-18T06:12:24Z)
- Memory-Augmented Reinforcement Learning for Image-Goal Navigation [67.3963444878746]
We present a novel method that leverages a cross-episode memory to learn to navigate.
In order to avoid overfitting, we propose to use data augmentation on the RGB input during training.
We obtain this competitive performance from RGB input only, without access to additional sensors such as position or depth.
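The abstract does not say which augmentation is used; random shifts (pad then re-crop) are a standard regularizer for image-based RL, so the sketch below assumes that choice.

```python
import torch

def augment_rgb(batch, pad=4):
    """Random-shift augmentation: replicate-pad, then crop at a random offset.

    batch: (B, C, H, W) float tensor of RGB observations.
    """
    b, _, h, w = batch.shape
    padded = torch.nn.functional.pad(batch, (pad,) * 4, mode="replicate")
    out = torch.empty_like(batch)
    for i in range(b):
        dx = int(torch.randint(0, 2 * pad + 1, (1,)))
        dy = int(torch.randint(0, 2 * pad + 1, (1,)))
        out[i] = padded[i, :, dy:dy + h, dx:dx + w]
    return out
```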
arXiv Detail & Related papers (2021-01-13T16:30:20Z)
- Object Rearrangement Using Learned Implicit Collision Functions [61.90305371998561]
We propose a learned collision model that accepts scene and query object point clouds and predicts collisions for 6DOF object poses within the scene.
We leverage the learned collision model as part of a model predictive path integral (MPPI) policy in a tabletop rearrangement task.
The learned model outperforms both traditional pipelines and learned ablations by 9.8% in accuracy on a dataset of simulated collision queries.
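Plugging a learned collision model into MPPI amounts to adding its predicted collision probability to each sampled rollout's cost before the usual importance-weighted update. A minimal NumPy sketch, with `rollout` and `collision_prob` as stand-ins for the dynamics and the learned model, and all weights hypothetical:

```python
import numpy as np

def mppi_step(x0, rollout, collision_prob, horizon=10, samples=64,
              sigma=0.1, lam=1.0, coll_weight=100.0):
    """One MPPI update whose cost includes a learned collision term.

    rollout(x0, u_seq) -> (states, task_cost); collision_prob(states)
    is the learned model's probability that the rollout collides.
    """
    u_nom = np.zeros((horizon, 2))
    noise = sigma * np.random.randn(samples, horizon, 2)
    costs = np.empty(samples)
    for k in range(samples):
        states, task_cost = rollout(x0, u_nom + noise[k])
        costs[k] = task_cost + coll_weight * collision_prob(states)
    w = np.exp(-(costs - costs.min()) / lam)   # softmin over rollouts
    w /= w.sum()
    return u_nom + (w[:, None, None] * noise).sum(axis=0)
```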
arXiv Detail & Related papers (2020-11-21T05:36:06Z)
- Domain Adaptation for Outdoor Robot Traversability Estimation from RGB data with Safety-Preserving Loss [12.697106921197701]
We present an approach based on deep learning to estimate and anticipate the traversability score of different routes in the field of view of an on-board RGB camera.
We then enhance the model's capabilities by addressing domain shifts through gradient-reversal unsupervised adaptation.
Experimental results show that our approach is able to satisfactorily identify traversable areas and to generalize to unseen locations.
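Gradient-reversal adaptation trains a domain classifier whose gradient is negated before it reaches the feature encoder, pushing features toward domain invariance. The layer itself is small and standard; a PyTorch implementation:

```python
import torch

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; scaled, negated gradient backward."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_out):
        return -ctx.lam * grad_out, None   # no gradient for lam

def grad_reverse(x, lam=1.0):
    return GradReverse.apply(x, lam)

# Usage: the domain head sees features unchanged, but its gradient
# trains the encoder to make source and target indistinguishable:
#   domain_logits = domain_head(grad_reverse(features, lam=0.3))
```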
arXiv Detail & Related papers (2020-09-16T09:19:33Z)
- Risk-Averse MPC via Visual-Inertial Input and Recurrent Networks for Online Collision Avoidance [95.86944752753564]
We propose an online path planning architecture that extends the model predictive control (MPC) formulation to consider future location uncertainties.
Our algorithm combines an object detection pipeline with a recurrent neural network (RNN) which infers the covariance of state estimates.
The robustness of our methods is validated on complex quadruped robot dynamics and can be generally applied to most robotic platforms.
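A simple way to make a collision cost risk-averse given an inferred covariance is to inflate the obstacle's keep-out radius by a few standard deviations along the approach direction. The sketch below shows that idea; it is one possible choice, not the paper's exact formulation, and `radius` and `kappa` are hypothetical.

```python
import numpy as np

def risk_cost(robot_xy, obs_mean, obs_cov, radius=0.5, kappa=2.0):
    """Collision cost inflated by predicted obstacle-position uncertainty.

    obs_cov: 2x2 covariance of the obstacle position (e.g., inferred by
    an RNN from detection history, as in the paper).
    """
    d = robot_xy - obs_mean
    dist = np.linalg.norm(d)
    if dist < 1e-9:
        return np.inf
    u = d / dist
    sigma = np.sqrt(u @ obs_cov @ u)       # std dev along approach direction
    margin = radius + kappa * sigma        # uncertainty-inflated keep-out
    return max(0.0, margin - dist) ** 2    # penalize intrusion into margin
```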
arXiv Detail & Related papers (2020-07-28T07:34:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.