TOMD: A Trail-based Off-road Multimodal Dataset for Traversable Pathway Segmentation under Challenging Illumination Conditions
- URL: http://arxiv.org/abs/2506.21630v1
- Date: Tue, 24 Jun 2025 23:58:44 GMT
- Title: TOMD: A Trail-based Off-road Multimodal Dataset for Traversable Pathway Segmentation under Challenging Illumination Conditions
- Authors: Yixin Sun, Li Li, Wenke E, Amir Atapour-Abarghouei, Toby P. Breckon
- Abstract summary: We introduce the Trail-based Off-road Multimodal dataset (TOMD), a comprehensive dataset specifically designed for such environments. TOMD features high-fidelity multimodal sensor data, including 128-channel LiDAR, stereo imagery, IMU, and illumination measurements. We also propose a dynamic multiscale data fusion model for accurate traversable pathway prediction.
- Score: 17.019567722324666
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Detecting traversable pathways in unstructured outdoor environments remains a significant challenge for autonomous robots, especially in critical applications such as wide-area search and rescue, as well as incident management scenarios like forest fires. Existing datasets and models primarily target urban settings or wide, vehicle-traversable off-road tracks, leaving a substantial gap in addressing the complexity of narrow, trail-like off-road scenarios. To address this, we introduce the Trail-based Off-road Multimodal Dataset (TOMD), a comprehensive dataset specifically designed for such environments. TOMD features high-fidelity multimodal sensor data -- including 128-channel LiDAR, stereo imagery, GNSS, IMU, and illumination measurements -- collected through repeated traversals under diverse conditions. We also propose a dynamic multiscale data fusion model for accurate traversable pathway prediction. The study analyzes the performance of early, cross, and mixed fusion strategies under varying illumination levels. Results demonstrate the effectiveness of our approach and the influence of illumination on segmentation performance. We publicly release TOMD at https://github.com/yyyxs1125/TMOD to support future research in trail-based off-road navigation.
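The abstract contrasts early, cross, and mixed fusion strategies but gives no implementation detail on this page. The sketch below is a minimal, hypothetical PyTorch illustration of early versus cross fusion for an RGB image paired with a LiDAR channel projected into the image plane; all module names, channel widths, and the sigmoid gating are illustrative assumptions, not the paper's actual model (a mixed strategy would combine both ideas).

```python
# Hypothetical sketch of early vs. cross fusion for RGB + projected LiDAR;
# module names and shapes are illustrative assumptions, not the authors' model.
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    """3x3 conv -> BN -> ReLU, a minimal stand-in for an encoder stage."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class EarlyFusion(nn.Module):
    """Concatenate modalities at the input, then run one shared encoder."""
    def __init__(self, rgb_ch=3, lidar_ch=1, n_classes=2):
        super().__init__()
        self.encoder = conv_block(rgb_ch + lidar_ch, 32)
        self.head = nn.Conv2d(32, n_classes, 1)

    def forward(self, rgb, lidar):
        x = torch.cat([rgb, lidar], dim=1)  # fuse before any features exist
        return self.head(self.encoder(x))

class CrossFusion(nn.Module):
    """Separate encoders per modality; exchange features mid-network."""
    def __init__(self, rgb_ch=3, lidar_ch=1, n_classes=2):
        super().__init__()
        self.rgb_enc = conv_block(rgb_ch, 32)
        self.lidar_enc = conv_block(lidar_ch, 32)
        # Cross-modal exchange: each stream is re-weighted by the other.
        self.gate_rgb = nn.Conv2d(32, 32, 1)
        self.gate_lidar = nn.Conv2d(32, 32, 1)
        self.head = nn.Conv2d(64, n_classes, 1)

    def forward(self, rgb, lidar):
        fr, fl = self.rgb_enc(rgb), self.lidar_enc(lidar)
        fr = fr * torch.sigmoid(self.gate_lidar(fl))  # LiDAR features gate RGB
        fl = fl * torch.sigmoid(self.gate_rgb(fr))    # RGB features gate LiDAR
        return self.head(torch.cat([fr, fl], dim=1))

if __name__ == "__main__":
    rgb = torch.randn(1, 3, 128, 128)        # e.g. stereo-left image
    lidar = torch.randn(1, 1, 128, 128)      # e.g. LiDAR range image
    print(EarlyFusion()(rgb, lidar).shape)   # -> (1, 2, 128, 128)
    print(CrossFusion()(rgb, lidar).shape)   # -> (1, 2, 128, 128)
```

In this toy setup the only difference between the two strategies is where the modalities meet: at the raw input (early) or between per-modality feature streams (cross).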
Related papers
- PFSD: A Multi-Modal Pedestrian-Focus Scene Dataset for Rich Tasks in Semi-Structured Environments [73.80718037070773]
We present the multi-modal Pedestrian-Focused Scene dataset (PFSD), rigorously annotated in semi-structured scenes following the nuScenes format.
We also propose a novel Hybrid Multi-Scale Fusion Network (HMFN) to detect pedestrians in densely populated and occluded scenarios.
arXiv Detail & Related papers (2025-02-21T09:57:53Z)
- ROAD-Waymo: Action Awareness at Scale for Autonomous Driving [17.531603453254434]
ROAD-Waymo is an extensive dataset for the development and benchmarking of techniques for agent, action, location and event detection in road scenes.
Considerably larger and more challenging than any existing dataset (and encompassing multiple cities), it comes with 198k annotated video frames, 54k agent tubes, 3.9M bounding boxes and a total of 12.4M labels.
arXiv Detail & Related papers (2024-11-03T20:46:50Z)
- UdeerLID+: Integrating LiDAR, Image, and Relative Depth with Semi-Supervised [12.440461420762265]
Road segmentation is a critical task for autonomous driving systems, where one of the primary challenges is the scarcity of large-scale, accurately labeled datasets.
Our work introduces an approach that integrates LiDAR point cloud data, visual images, and relative depth maps.
arXiv Detail & Related papers (2024-09-10T03:57:30Z)
- Leveraging Driver Field-of-View for Multimodal Ego-Trajectory Prediction [69.29802752614677]
RouteFormer is a novel ego-trajectory prediction network combining GPS data, environmental context, and the driver's field-of-view.
To tackle data scarcity and enhance diversity, we introduce GEM, a dataset of urban driving scenarios enriched with synchronized driver field-of-view and gaze data.
arXiv Detail & Related papers (2023-12-13T23:06:30Z)
- RSRD: A Road Surface Reconstruction Dataset and Benchmark for Safe and Comfortable Autonomous Driving [67.09546127265034]
Road surface reconstruction helps to enhance the analysis and prediction of vehicle responses for motion planning and control systems.
We introduce the Road Surface Reconstruction dataset, a real-world, high-resolution, and high-precision dataset collected with a specialized platform in diverse driving conditions.
It covers common road types and contains approximately 16,000 pairs of stereo images, original point clouds, and ground-truth depth/disparity maps.
arXiv Detail & Related papers (2023-10-03T17:59:32Z)
- Towards Multimodal Multitask Scene Understanding Models for Indoor Mobile Agents [49.904531485843464]
In this paper, we discuss the main challenge: insufficient, or even no, labeled data for real-world indoor environments.
We describe MMISM (Multi-modality input Multi-task output Indoor Scene understanding Model) to tackle this challenge.
MMISM takes RGB images and sparse LiDAR points as inputs, with 3D object detection, depth completion, human pose estimation, and semantic segmentation as output tasks; a minimal sketch of this shared-backbone, multi-head layout follows below.
We show that MMISM performs on par with or even better than single-task models.
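The exact MMISM architecture is not reproduced on this page; the following is a minimal, hypothetical PyTorch sketch of the shared-backbone, multi-head layout the summary describes. The backbone, head designs, and class counts are illustrative assumptions (a real 3D object detection head is omitted for brevity).

```python
# Toy shared-encoder, multi-head layout in the spirit of MMISM; placeholders
# only, not the paper's architecture.
import torch
import torch.nn as nn

class MultiTaskScene(nn.Module):
    def __init__(self, in_ch=4, feat=64):  # 3 RGB channels + 1 sparse depth
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(in_ch, feat, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(inplace=True),
        )
        # One lightweight head per task; real heads would be far larger,
        # and a 3D object detection head is omitted here for brevity.
        self.seg_head = nn.Conv2d(feat, 21, 1)    # semantic segmentation
        self.depth_head = nn.Conv2d(feat, 1, 1)   # depth completion
        self.pose_head = nn.Conv2d(feat, 17, 1)   # keypoint heatmaps

    def forward(self, rgb, sparse_depth):
        f = self.backbone(torch.cat([rgb, sparse_depth], dim=1))
        return {
            "segmentation": self.seg_head(f),
            "depth": self.depth_head(f),
            "pose": self.pose_head(f),
        }

if __name__ == "__main__":
    out = MultiTaskScene()(torch.randn(1, 3, 64, 64), torch.randn(1, 1, 64, 64))
    print({k: tuple(v.shape) for k, v in out.items()})
```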
arXiv Detail & Related papers (2022-09-27T04:49:19Z)
- Aerial Images Meet Crowdsourced Trajectories: A New Approach to Robust Road Extraction [110.61383502442598]
We introduce a novel neural network framework termed the Cross-Modal Message Propagation Network (CMMPNet).
CMMPNet is composed of two deep Auto-Encoders for modality-specific representation learning and a tailor-designed Dual Enhancement Module for cross-modal representation refinement; a toy sketch of this dual-branch pattern appears below.
Experiments on three real-world benchmarks demonstrate the effectiveness of our CMMPNet for robust road extraction.
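CMMPNet's components are only named above, so the following toy PyTorch sketch shows just the general pattern: one auto-encoder per modality with a cross-modal enhancement step between encoder and decoder. The shapes and the additive message passing are assumptions for illustration, not the published design.

```python
# Hypothetical dual-branch sketch: per-modality auto-encoders plus a
# cross-modal enhancement step; not the published CMMPNet design.
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    """Tiny conv auto-encoder standing in for a modality-specific branch."""
    def __init__(self, in_ch, feat=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(in_ch, feat, 3, padding=1), nn.ReLU(True))
        self.dec = nn.Sequential(nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(True))

class DualEnhancement(nn.Module):
    """Each modality's features are refined with a message from the other."""
    def __init__(self, feat=32):
        super().__init__()
        self.msg_a = nn.Conv2d(feat, feat, 1)
        self.msg_b = nn.Conv2d(feat, feat, 1)

    def forward(self, fa, fb):
        return fa + self.msg_b(fb), fb + self.msg_a(fa)

class DualBranchRoadNet(nn.Module):
    def __init__(self, img_ch=3, traj_ch=1):
        super().__init__()
        self.ae_img, self.ae_traj = AutoEncoder(img_ch), AutoEncoder(traj_ch)
        self.enhance = DualEnhancement()
        self.head = nn.Conv2d(64, 1, 1)  # binary road mask

    def forward(self, aerial, trajectory_heatmap):
        fi = self.ae_img.enc(aerial)
        ft = self.ae_traj.enc(trajectory_heatmap)
        fi, ft = self.enhance(fi, ft)            # cross-modal refinement
        fi, ft = self.ae_img.dec(fi), self.ae_traj.dec(ft)
        return torch.sigmoid(self.head(torch.cat([fi, ft], dim=1)))

if __name__ == "__main__":
    net = DualBranchRoadNet()
    mask = net(torch.randn(1, 3, 64, 64), torch.randn(1, 1, 64, 64))
    print(mask.shape)  # -> (1, 1, 64, 64)
```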
arXiv Detail & Related papers (2021-11-30T04:30:10Z)
- Detecting 32 Pedestrian Attributes for Autonomous Vehicles [103.87351701138554]
In this paper, we address the problem of jointly detecting pedestrians and recognizing 32 pedestrian attributes.
We introduce a Multi-Task Learning (MTL) model relying on a composite field framework, which achieves both goals in an efficient way.
We show competitive detection and attribute recognition results, as well as more stable MTL training.
arXiv Detail & Related papers (2020-12-04T15:10:12Z)
- RELLIS-3D Dataset: Data, Benchmarks and Analysis [16.803548871633957]
RELLIS-3D is a multimodal dataset collected in an off-road environment.
The data was collected on the Rellis Campus of Texas A&M University.
arXiv Detail & Related papers (2020-11-17T18:28:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.