ROD: RGB-Only Fast and Efficient Off-road Freespace Detection
- URL: http://arxiv.org/abs/2508.08697v1
- Date: Tue, 12 Aug 2025 07:41:20 GMT
- Title: ROD: RGB-Only Fast and Efficient Off-road Freespace Detection
- Authors: Tong Sun, Hongliang Ye, Jilin Mei, Liang Chen, Fangzhou Zhao, Leiqiang Zong, Yu Hu
- Abstract summary: Off-road freespace detection is more challenging than on-road scenarios because of the blurred boundaries of traversable areas. Previous state-of-the-art (SOTA) methods employ multi-modal fusion of RGB images and LiDAR data. This paper presents a novel RGB-only approach for off-road freespace detection, named ROD, eliminating the reliance on LiDAR data.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Off-road freespace detection is more challenging than on-road scenarios because of the blurred boundaries of traversable areas. Previous state-of-the-art (SOTA) methods employ multi-modal fusion of RGB images and LiDAR data. However, because computing surface normal maps from LiDAR data significantly increases inference time, multi-modal methods are unsuitable for real-time applications, particularly in real-world scenarios that demand higher frame rates than slow navigation allows. This paper presents a novel RGB-only approach for off-road freespace detection, named ROD, eliminating the reliance on LiDAR data and its computational demands. Specifically, we utilize a pre-trained Vision Transformer (ViT) to extract rich features from RGB images. Additionally, we design a lightweight yet efficient decoder; together these improve both precision and inference speed. ROD establishes a new SOTA on the ORFD and RELLIS-3D datasets with an inference speed of 50 FPS, significantly outperforming prior models.
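The abstract describes a two-stage pipeline: a ViT encoder turns the RGB image into patch features, and a lightweight decoder maps those features to a per-pixel freespace mask. The sketch below illustrates only the data flow and tensor shapes of such a design; the patch size, embedding dimension, random stand-in weights, and the trivial decoder head are illustrative assumptions, not details from the paper.

```python
import numpy as np

def vit_encode(img, patch=16, dim=64, rng=None):
    """ViT-style patchify: flatten non-overlapping patches, project to `dim` features.
    Weights are random stand-ins; a real model would use pre-trained parameters."""
    rng = rng or np.random.default_rng(0)
    H, W, C = img.shape
    gh, gw = H // patch, W // patch
    patches = img[:gh * patch, :gw * patch].reshape(gh, patch, gw, patch, C)
    patches = patches.transpose(0, 2, 1, 3, 4).reshape(gh * gw, patch * patch * C)
    proj = rng.standard_normal((patch * patch * C, dim)) * 0.01
    return patches @ proj, (gh, gw)

def light_decode(tokens, grid, out_hw):
    """Lightweight decoder stand-in: one logit per token, then
    nearest-neighbour upsampling back to full image resolution."""
    gh, gw = grid
    logits = tokens.mean(axis=1).reshape(gh, gw)  # stand-in for a 1x1 conv head
    ry, rx = out_hw[0] // gh, out_hw[1] // gw
    return np.repeat(np.repeat(logits, ry, axis=0), rx, axis=1)

img = np.zeros((256, 512, 3), dtype=np.float32)   # dummy RGB frame
tokens, grid = vit_encode(img)
mask_logits = light_decode(tokens, grid, img.shape[:2])
print(tokens.shape, mask_logits.shape)            # (512, 64) (256, 512)
```

The point of the shape trace is that the decoder only has to upsample a coarse token grid, which is why a small decoder can keep inference fast once LiDAR preprocessing is removed.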
Related papers
- Collaborating Vision, Depth, and Thermal Signals for Multi-Modal Tracking: Dataset and Algorithm [103.36490810025752]
Existing multi-modal object tracking approaches primarily focus on dual-modal paradigms, such as RGB-Depth or RGB-Thermal. This work introduces a novel multi-modal tracking task that leverages three complementary modalities: visible RGB, Depth (D), and Thermal Infrared (TIR). We propose a novel multi-modal tracker, dubbed RDTTrack, which integrates tri-modal information for robust tracking by leveraging a pretrained RGB-only tracking model.
arXiv Detail & Related papers (2025-09-29T13:05:15Z) - Adaptive LiDAR Scanning: Harnessing Temporal Cues for Efficient 3D Object Detection via Multi-Modal Fusion [11.351728925952193]
Conventional LiDAR sensors perform dense, stateless scans, ignoring the strong temporal continuity in real-world scenes. We propose a predictive, history-aware adaptive scanning framework that anticipates informative regions of interest based on past observations. Our method significantly reduces unnecessary data acquisition by concentrating dense LiDAR scanning only within these ROIs and sparsely sampling elsewhere.
arXiv Detail & Related papers (2025-08-03T03:20:36Z) - RASMD: RGB And SWIR Multispectral Driving Dataset for Robust Perception in Adverse Conditions [0.3141085922386211]
Short-wave infrared (SWIR) imaging offers several advantages over NIR and LWIR. Current autonomous driving algorithms heavily rely on the visible spectrum, which is prone to performance degradation in adverse conditions. We introduce the RGB and SWIR Multispectral Driving dataset, which comprises 100,000 synchronized and spatially aligned RGB-SWIR image pairs.
arXiv Detail & Related papers (2025-04-10T09:54:57Z) - Bringing RGB and IR Together: Hierarchical Multi-Modal Enhancement for Robust Transmission Line Detection [67.02804741856512]
We propose a novel Hierarchical Multi-Modal Enhancement Network (HMMEN) that integrates RGB and IR data for robust and accurate TL detection. Our method introduces two key components: (1) a Mutual Multi-Modal Enhanced Block (MMEB), which fuses and enhances hierarchical RGB and IR feature maps in a coarse-to-fine manner, and (2) a Feature Alignment Block (FAB) that corrects misalignments between decoder outputs and IR feature maps by leveraging deformable convolutions.
arXiv Detail & Related papers (2025-01-25T06:21:06Z) - IRisPath: Enhancing Costmap for Off-Road Navigation with Robust IR-RGB Fusion for Improved Day and Night Traversability [2.21687743334279]
Traditional on-road autonomous methods struggle with dynamic terrains, leading to poor vehicle control in off-road conditions. Recent deep-learning models have used perception sensors along with kinesthetic feedback for navigation on such terrains. We propose a multi-modal fusion network, "IRisPath", capable of using Thermal and RGB images to provide robustness against dynamic weather and light conditions.
arXiv Detail & Related papers (2024-12-04T09:53:09Z) - Blurred LiDAR for Sharper 3D: Robust Handheld 3D Scanning with Diffuse LiDAR and RGB [12.38882701862349]
3D surface reconstruction is essential across applications of virtual reality, robotics, and mobile scanning. RGB-based reconstruction often fails in low-texture, low-light, and low-albedo scenes. We propose using an alternative class of "blurred" LiDAR that emits a diffuse flash.
arXiv Detail & Related papers (2024-11-29T05:01:23Z) - LiDAR-GS:Real-time LiDAR Re-Simulation using Gaussian Splatting [50.808933338389686]
We present LiDAR-GS, a real-time, high-fidelity re-simulation of LiDAR scans in public urban road scenes. The method achieves state-of-the-art results in both rendering frame rate and quality on publicly available large scene datasets.
arXiv Detail & Related papers (2024-10-07T15:07:56Z) - Self-Aligning Depth-regularized Radiance Fields for Asynchronous RGB-D Sequences [12.799443250845224]
We propose a novel time-pose function, which is an implicit network that maps timestamps to $\mathrm{SE}(3)$ elements.
Our algorithm consists of three steps: (1) time-pose function fitting, (2) radiance field bootstrapping, (3) joint pose error compensation and radiance field refinement.
We also show qualitatively improved results on a real-world asynchronous RGB-D sequence captured by a drone.
arXiv Detail & Related papers (2022-11-14T15:37:27Z) - LiDARCap: Long-range Marker-less 3D Human Motion Capture with LiDAR Point Clouds [58.402752909624716]
Existing motion capture datasets are largely short-range and cannot yet fit the need of long-range applications.
We propose LiDARHuman26M, a new human motion capture dataset captured by LiDAR at a much longer range to overcome this limitation.
Our dataset also includes the ground truth human motions acquired by the IMU system and the synchronous RGB images.
arXiv Detail & Related papers (2022-03-28T12:52:45Z) - Pseudo-LiDAR Based Road Detection [5.9106199000537645]
We propose a novel road detection approach with RGB being the only input during inference.
We exploit pseudo-LiDAR using depth estimation, and propose a feature fusion network where RGB and learned depth information are fused.
The proposed method achieves state-of-the-art performance on two challenging benchmarks, KITTI and R2D.
arXiv Detail & Related papers (2021-07-28T11:21:42Z) - DUT-LFSaliency: Versatile Dataset and Light Field-to-RGB Saliency Detection [104.50425501764806]
We introduce a large-scale dataset to enable versatile applications for light field saliency detection.
We present an asymmetrical two-stream model consisting of the Focal stream and RGB stream.
Experiments demonstrate that our Focal stream achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-12-30T11:53:27Z) - Synergistic saliency and depth prediction for RGB-D saliency detection [76.27406945671379]
Existing RGB-D saliency datasets are small, which may lead to overfitting and limited generalization for diverse scenarios.
We propose a semi-supervised system for RGB-D saliency detection that can be trained on smaller RGB-D saliency datasets without saliency ground truth.
arXiv Detail & Related papers (2020-07-03T14:24:41Z) - Drone-based RGB-Infrared Cross-Modality Vehicle Detection via Uncertainty-Aware Learning [59.19469551774703]
Drone-based vehicle detection aims at finding the vehicle locations and categories in an aerial image.
We construct a large-scale drone-based RGB-Infrared vehicle detection dataset, termed DroneVehicle.
Our DroneVehicle dataset collects 28,439 RGB-Infrared image pairs, covering urban roads, residential areas, parking lots, and other scenarios from day to night.
arXiv Detail & Related papers (2020-03-05T05:29:44Z)