ROD: RGB-Only Fast and Efficient Off-road Freespace Detection
- URL: http://arxiv.org/abs/2508.08697v1
- Date: Tue, 12 Aug 2025 07:41:20 GMT
- Title: ROD: RGB-Only Fast and Efficient Off-road Freespace Detection
- Authors: Tong Sun, Hongliang Ye, Jilin Mei, Liang Chen, Fangzhou Zhao, Leiqiang Zong, Yu Hu
- Abstract summary: Off-road freespace detection is more challenging than on-road scenarios because of the blurred boundaries of traversable areas. Previous state-of-the-art (SOTA) methods employ multi-modal fusion of RGB images and LiDAR data. This paper presents a novel RGB-only approach for off-road freespace detection, named ROD, eliminating the reliance on LiDAR data.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Off-road freespace detection is more challenging than on-road scenarios because of the blurred boundaries of traversable areas. Previous state-of-the-art (SOTA) methods employ multi-modal fusion of RGB images and LiDAR data. However, because computing surface normal maps from LiDAR data significantly increases inference time, multi-modal methods are unsuitable for real-time applications, particularly in real-world scenarios that demand higher frame rates than slow navigation allows. This paper presents a novel RGB-only approach for off-road freespace detection, named ROD, eliminating the reliance on LiDAR data and its computational demands. Specifically, we utilize a pre-trained Vision Transformer (ViT) to extract rich features from RGB images. Additionally, we design a lightweight yet efficient decoder; together these improve both precision and inference speed. ROD establishes a new SOTA on the ORFD and RELLIS-3D datasets with an inference speed of 50 FPS, significantly outperforming prior models.
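The abstract describes a two-stage pipeline: a ViT encoder turns the RGB image into patch features, and a lightweight decoder maps those features to a per-pixel freespace mask. The sketch below illustrates only the data flow and tensor shapes of such a design; the patch size, embedding dimension, random stand-in weights, and the trivial decoder head are illustrative assumptions, not details from the paper.

```python
import numpy as np

def vit_encode(img, patch=16, dim=64, rng=None):
    """ViT-style patchify: flatten non-overlapping patches, project to `dim` features.
    Weights are random stand-ins; a real model would use pre-trained parameters."""
    rng = rng or np.random.default_rng(0)
    H, W, C = img.shape
    gh, gw = H // patch, W // patch
    patches = img[:gh * patch, :gw * patch].reshape(gh, patch, gw, patch, C)
    patches = patches.transpose(0, 2, 1, 3, 4).reshape(gh * gw, patch * patch * C)
    proj = rng.standard_normal((patch * patch * C, dim)) * 0.01
    return patches @ proj, (gh, gw)

def light_decode(tokens, grid, out_hw):
    """Lightweight decoder stand-in: one logit per token, then
    nearest-neighbour upsampling back to full image resolution."""
    gh, gw = grid
    logits = tokens.mean(axis=1).reshape(gh, gw)  # stand-in for a 1x1 conv head
    ry, rx = out_hw[0] // gh, out_hw[1] // gw
    return np.repeat(np.repeat(logits, ry, axis=0), rx, axis=1)

img = np.zeros((256, 512, 3), dtype=np.float32)   # dummy RGB frame
tokens, grid = vit_encode(img)
mask_logits = light_decode(tokens, grid, img.shape[:2])
print(tokens.shape, mask_logits.shape)            # (512, 64) (256, 512)
```

The point of the shape trace is that the decoder only has to upsample a coarse token grid, which is why a small decoder can keep inference fast once LiDAR preprocessing is removed.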
Related papers
- Collaborating Vision, Depth, and Thermal Signals for Multi-Modal Tracking: Dataset and Algorithm [103.36490810025752]
Existing multi-modal object tracking approaches primarily focus on dual-modal paradigms, such as RGB-Depth or RGB-Thermal. This work introduces a novel multi-modal tracking task that leverages three complementary modalities: visible RGB, Depth (D), and Thermal Infrared (TIR). We propose a novel multi-modal tracker, dubbed RDTTrack, which integrates tri-modal information for robust tracking by leveraging a pretrained RGB-only tracking model.
arXiv Detail & Related papers (2025-09-29T13:05:15Z) - Adaptive LiDAR Scanning: Harnessing Temporal Cues for Efficient 3D Object Detection via Multi-Modal Fusion [11.351728925952193]
Conventional LiDAR sensors perform dense, stateless scans, ignoring the strong temporal continuity in real-world scenes. We propose a predictive, history-aware adaptive scanning framework that anticipates informative regions of interest based on past observations. Our method significantly reduces unnecessary data acquisition by concentrating dense LiDAR scanning only within these ROIs and sparsely sampling elsewhere.
arXiv Detail & Related papers (2025-08-03T03:20:36Z) - RASMD: RGB And SWIR Multispectral Driving Dataset for Robust Perception in Adverse Conditions [0.3141085922386211]
Short-wave infrared (SWIR) imaging offers several advantages over NIR and LWIR. Current autonomous driving algorithms heavily rely on the visible spectrum, which is prone to performance degradation in adverse conditions. We introduce the RGB and SWIR Multispectral Driving dataset, which comprises 100,000 synchronized and spatially aligned RGB-SWIR image pairs.
arXiv Detail & Related papers (2025-04-10T09:54:57Z) - Bringing RGB and IR Together: Hierarchical Multi-Modal Enhancement for Robust Transmission Line Detection [67.02804741856512]
We propose a novel Hierarchical Multi-Modal Enhancement Network (HMMEN) that integrates RGB and IR data for robust and accurate TL detection. Our method introduces two key components: (1) a Mutual Multi-Modal Enhanced Block (MMEB), which fuses and enhances hierarchical RGB and IR feature maps in a coarse-to-fine manner, and (2) a Feature Alignment Block (FAB) that corrects misalignments between decoder outputs and IR feature maps by leveraging deformable convolutions.
arXiv Detail & Related papers (2025-01-25T06:21:06Z) - IRisPath: Enhancing Costmap for Off-Road Navigation with Robust IR-RGB Fusion for Improved Day and Night Traversability [2.21687743334279]
Traditional on-road autonomous methods struggle with dynamic terrains, leading to poor vehicle control in off-road conditions. Recent deep-learning models have used perception sensors along with kinesthetic feedback for navigation on such terrains. We propose a multi-modal fusion network, "IRisPath", capable of using Thermal and RGB images to provide robustness against dynamic weather and light conditions.
arXiv Detail & Related papers (2024-12-04T09:53:09Z) - Blurred LiDAR for Sharper 3D: Robust Handheld 3D Scanning with Diffuse LiDAR and RGB [12.38882701862349]
3D surface reconstruction is essential across applications of virtual reality, robotics, and mobile scanning. RGB-based reconstruction often fails in low-texture, low-light, and low-albedo scenes. We propose using an alternative class of "blurred" LiDAR that emits a diffuse flash.
arXiv Detail & Related papers (2024-11-29T05:01:23Z) - LiDAR-GS:Real-time LiDAR Re-Simulation using Gaussian Splatting [50.808933338389686]
We present LiDAR-GS, a real-time, high-fidelity re-simulation of LiDAR scans in public urban road scenes. The method achieves state-of-the-art results in both rendering frame rate and quality on publicly available large scene datasets.
arXiv Detail & Related papers (2024-10-07T15:07:56Z) - Self-Aligning Depth-regularized Radiance Fields for Asynchronous RGB-D Sequences [12.799443250845224]
We propose a novel time-pose function, which is an implicit network that maps timestamps to $\mathrm{SE}(3)$ elements.
Our algorithm consists of three steps: (1) time-pose function fitting, (2) radiance field bootstrapping, (3) joint pose error compensation and radiance field refinement.
We also show qualitatively improved results on a real-world asynchronous RGB-D sequence captured by a drone.
arXiv Detail & Related papers (2022-11-14T15:37:27Z) - LiDARCap: Long-range Marker-less 3D Human Motion Capture with LiDAR Point Clouds [58.402752909624716]
Existing motion capture datasets are largely short-range and cannot yet fit the need of long-range applications.
We propose LiDARHuman26M, a new human motion capture dataset captured by LiDAR at a much longer range to overcome this limitation.
Our dataset also includes the ground truth human motions acquired by the IMU system and the synchronous RGB images.
arXiv Detail & Related papers (2022-03-28T12:52:45Z) - Pseudo-LiDAR Based Road Detection [5.9106199000537645]
We propose a novel road detection approach with RGB being the only input during inference.
We exploit pseudo-LiDAR using depth estimation, and propose a feature fusion network where RGB and learned depth information are fused.
The proposed method achieves state-of-the-art performance on two challenging benchmarks, KITTI and R2D.
arXiv Detail & Related papers (2021-07-28T11:21:42Z) - DUT-LFSaliency: Versatile Dataset and Light Field-to-RGB Saliency Detection [104.50425501764806]
We introduce a large-scale dataset to enable versatile applications for light field saliency detection.
We present an asymmetrical two-stream model consisting of the Focal stream and RGB stream.
Experiments demonstrate that our Focal stream achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-12-30T11:53:27Z) - Synergistic saliency and depth prediction for RGB-D saliency detection [76.27406945671379]
Existing RGB-D saliency datasets are small, which may lead to overfitting and limited generalization for diverse scenarios.
We propose a semi-supervised system for RGB-D saliency detection that can be trained on smaller RGB-D saliency datasets without saliency ground truth.
arXiv Detail & Related papers (2020-07-03T14:24:41Z) - Drone-based RGB-Infrared Cross-Modality Vehicle Detection via Uncertainty-Aware Learning [59.19469551774703]
Drone-based vehicle detection aims at finding the vehicle locations and categories in an aerial image.
We construct a large-scale drone-based RGB-Infrared vehicle detection dataset, termed DroneVehicle.
Our DroneVehicle dataset collects 28,439 RGB-Infrared image pairs, covering urban roads, residential areas, parking lots, and other scenarios from day to night.
arXiv Detail & Related papers (2020-03-05T05:29:44Z)