Dynamic Fusion Module Evolves Drivable Area and Road Anomaly Detection: A Benchmark and Algorithms
- URL: http://arxiv.org/abs/2103.02433v2
- Date: Thu, 4 Mar 2021 06:01:08 GMT
- Title: Dynamic Fusion Module Evolves Drivable Area and Road Anomaly Detection: A Benchmark and Algorithms
- Authors: Hengli Wang, Rui Fan, Yuxiang Sun, Ming Liu
- Abstract summary: Joint detection of drivable areas and road anomalies is very important for mobile robots.
In this paper, we first build a drivable area and road anomaly detection benchmark for ground mobile robots.
We propose a novel module, referred to as the dynamic fusion module (DFM), which can be easily deployed in existing data-fusion networks.
- Score: 16.417299198546168
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Joint detection of drivable areas and road anomalies is very important for
mobile robots. Recently, many semantic segmentation approaches based on
convolutional neural networks (CNNs) have been proposed for pixel-wise drivable
area and road anomaly detection. In addition, some benchmark datasets, such as
KITTI and Cityscapes, have been widely used. However, the existing benchmarks
are mostly designed for self-driving cars; a comparable benchmark for ground
mobile robots, such as robotic wheelchairs, is still lacking. Therefore, in
this paper, we first
build a drivable area and road anomaly detection benchmark for ground mobile
robots, evaluating the existing state-of-the-art single-modal and data-fusion
semantic segmentation CNNs using six modalities of visual features.
Furthermore, we propose a novel module, referred to as the dynamic fusion
module (DFM), which can be easily deployed in existing data-fusion networks to
fuse different types of visual features effectively and efficiently. The
experimental results show that the transformed disparity image is the most
informative visual feature and that the proposed DFM-RTFNet outperforms the
state-of-the-art networks. Additionally, our DFM-RTFNet achieves competitive
performance on the KITTI road benchmark. Our benchmark is publicly available at
https://sites.google.com/view/gmrb.
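The abstract describes the DFM only at a high level, so the following PyTorch sketch is one plausible reading rather than the paper's implementation: a fusion layer whose per-channel weights are predicted dynamically from the two incoming modality features. The names rgb_feat and disp_feat and the gating design are assumptions for illustration.

```python
import torch
import torch.nn as nn

class DynamicFusionSketch(nn.Module):
    """Hypothetical dynamic fusion layer; NOT the paper's exact DFM.

    Fuses two modality-specific feature maps (e.g., RGB and transformed
    disparity) with per-channel weights predicted from the inputs.
    """

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Small MLP mapping pooled statistics of both inputs to
        # per-channel fusion weights in [0, 1].
        self.gate = nn.Sequential(
            nn.Linear(2 * channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, rgb_feat: torch.Tensor, disp_feat: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = rgb_feat.shape
        stats = torch.cat(
            [self.pool(rgb_feat).flatten(1), self.pool(disp_feat).flatten(1)], dim=1
        )
        w = self.gate(stats).view(b, c, 1, 1)  # dynamic per-channel weight
        # Convex combination: w * RGB branch + (1 - w) * disparity branch.
        return w * rgb_feat + (1.0 - w) * disp_feat

# Usage: fuse two 64-channel feature maps from a two-branch encoder.
fuse = DynamicFusionSketch(channels=64)
fused = fuse(torch.randn(2, 64, 60, 80), torch.randn(2, 64, 60, 80))
print(fused.shape)  # torch.Size([2, 64, 60, 80])
```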
Related papers
- G-MEMP: Gaze-Enhanced Multimodal Ego-Motion Prediction in Driving [71.9040410238973]
We focus on inferring the ego trajectory of a driver's vehicle using their gaze data.
Next, we develop G-MEMP, a novel multimodal ego-trajectory prediction network that combines GPS and video input with gaze data.
The results show that G-MEMP significantly outperforms state-of-the-art methods in both benchmarks.
arXiv Detail & Related papers (2023-12-13T23:06:30Z)
- Detection-segmentation convolutional neural network for autonomous vehicle perception [0.0]
Object detection and segmentation are two core modules of an autonomous vehicle perception system.
Currently, the most commonly used algorithms are based on deep neural networks, which achieve high accuracy but require high-performance computing platforms.
A reduction in the complexity of the network can be achieved by using an appropriate architecture, representation, and computing platform.
arXiv Detail & Related papers (2023-06-30T08:54:52Z)
- FollowNet: A Comprehensive Benchmark for Car-Following Behavior Modeling [20.784555362703294]
We establish a public benchmark dataset for car-following behavior modeling.
The benchmark consists of more than 80K car-following events extracted from five public driving datasets.
Results show that the deep deterministic policy gradient (DDPG) based model performs competitively, achieving a lower MSE for spacing (a generic actor-critic sketch follows this entry).
arXiv Detail & Related papers (2023-05-25T08:59:26Z)
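FollowNet benchmarks several car-following models; its DDPG result refers to a standard actor-critic setup. As a generic illustration only (not FollowNet's code), such an agent can map a small state vector to an acceleration command via an actor network, with a critic scoring state-action pairs. The state layout and sizes below are assumptions.

```python
import torch
import torch.nn as nn

# Generic DDPG-style networks for car-following (illustrative only).
STATE_DIM = 3    # e.g., (spacing, relative speed, ego speed) -- assumed
ACTION_DIM = 1   # longitudinal acceleration
MAX_ACCEL = 3.0  # m/s^2, assumed action bound

class Actor(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, ACTION_DIM), nn.Tanh(),  # output in [-1, 1]
        )

    def forward(self, state):
        return MAX_ACCEL * self.net(state)  # scale to physical bounds

class Critic(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))  # Q(s, a)

actor, critic = Actor(), Critic()
s = torch.tensor([[20.0, -1.5, 15.0]])  # 20 m gap, closing at 1.5 m/s
a = actor(s)
print(a, critic(s, a))
```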
- Joint Spatial-Temporal and Appearance Modeling with Transformer for Multiple Object Tracking [59.79252390626194]
We propose a novel solution named TransSTAM, which leverages Transformer to model both the appearance features of each object and the spatial-temporal relationships among objects.
The proposed method is evaluated on multiple public benchmarks including MOT16, MOT17, and MOT20, and it achieves a clear performance improvement in both IDF1 and HOTA.
arXiv Detail & Related papers (2022-05-31T01:19:18Z)
- Monocular Vision-based Prediction of Cut-in Maneuvers with LSTM Networks [0.0]
This study proposes a method to predict potentially dangerous cut-in maneuvers happening in the ego lane.
We follow a computer vision-based approach that only employs a single in-vehicle RGB camera.
Our algorithm consists of a CNN-based vehicle detection and tracking step and an LSTM-based maneuver classification step (a minimal sketch of the LSTM stage follows this entry).
arXiv Detail & Related papers (2022-03-21T02:30:36Z)
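The cut-in entry describes a two-stage pipeline: per-frame vehicle detection and tracking, then sequence classification with an LSTM. A minimal sketch of the second stage might look like the following, where the per-frame input is assumed to be a tracked bounding box (x, y, w, h); this feature layout is a guess, not the paper's.

```python
import torch
import torch.nn as nn

class CutInLSTM(nn.Module):
    """Minimal maneuver classifier over tracked-vehicle sequences.

    Input: (batch, time, 4) tensors of per-frame bounding boxes
    (x, y, w, h) from an upstream CNN detector+tracker; the feature
    layout is an assumption for illustration.
    """

    def __init__(self, feat_dim: int = 4, hidden: int = 32, n_classes: int = 2):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)  # e.g., {no cut-in, cut-in}

    def forward(self, seq: torch.Tensor) -> torch.Tensor:
        _, (h_n, _) = self.lstm(seq)  # h_n: (1, batch, hidden)
        return self.head(h_n[-1])     # one logit vector per sequence

model = CutInLSTM()
boxes = torch.randn(8, 30, 4)  # 8 tracks, 30 frames each
print(model(boxes).shape)      # torch.Size([8, 2])
```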
- Aerial Images Meet Crowdsourced Trajectories: A New Approach to Robust Road Extraction [110.61383502442598]
We introduce a novel neural network framework termed Cross-Modal Message Propagation Network (CMMPNet).
CMMPNet is composed of two deep Auto-Encoders for modality-specific representation learning and a tailor-designed Dual Enhancement Module for cross-modal representation refinement.
Experiments on three real-world benchmarks demonstrate the effectiveness of our CMMPNet for robust road extraction.
arXiv Detail & Related papers (2021-11-30T04:30:10Z)
- Learnable Online Graph Representations for 3D Multi-Object Tracking [156.58876381318402]
We propose a unified, learning-based approach to the 3D MOT problem.
We employ a Neural Message Passing network for data association that is fully trainable (a generic message-passing sketch follows this entry).
We show the merit of the proposed approach on the publicly available nuScenes dataset by achieving state-of-the-art performance of 65.6% AMOTA and 58% fewer ID-switches.
arXiv Detail & Related papers (2021-04-23T17:59:28Z)
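The 3D MOT entry frames data association as learning on a graph of tracks and detections. A generic single message-passing step (again an illustration, not the paper's network) updates each candidate (track, detection) edge from its endpoint node features and scores it as an association logit:

```python
import torch
import torch.nn as nn

class EdgeScorer(nn.Module):
    """One generic message-passing step for data association (illustrative).

    Nodes are track/detection embeddings; each candidate (track, detection)
    edge is updated from its endpoints and scored as an association logit.
    """

    def __init__(self, node_dim: int = 16, edge_dim: int = 16):
        super().__init__()
        self.edge_mlp = nn.Sequential(
            nn.Linear(2 * node_dim + edge_dim, edge_dim), nn.ReLU(),
            nn.Linear(edge_dim, edge_dim), nn.ReLU(),
        )
        self.score = nn.Linear(edge_dim, 1)

    def forward(self, tracks, dets, edge_feats, edge_index):
        # edge_index: (E, 2) pairs of (track_idx, detection_idx)
        src = tracks[edge_index[:, 0]]
        dst = dets[edge_index[:, 1]]
        e = self.edge_mlp(torch.cat([src, dst, edge_feats], dim=-1))
        return self.score(e).squeeze(-1)  # higher = more likely same object

scorer = EdgeScorer()
tracks, dets = torch.randn(5, 16), torch.randn(7, 16)
idx = torch.cartesian_prod(torch.arange(5), torch.arange(7))  # all pairs
logits = scorer(tracks, dets, torch.randn(len(idx), 16), idx)
print(logits.shape)  # torch.Size([35])
```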
- Fine-Grained Vehicle Perception via 3D Part-Guided Visual Data Augmentation [77.60050239225086]
We propose an effective training data generation process by fitting a 3D car model with dynamic parts to vehicles in real images.
Our approach is fully automatic without any human interaction.
We present a multi-task network for VUS parsing and a multi-stream network for VHI parsing.
arXiv Detail & Related papers (2020-12-15T03:03:38Z)
- Applying Surface Normal Information in Drivable Area and Road Anomaly Detection for Ground Mobile Robots [29.285200656398562]
We develop a novel module named the Normal Inference Module (NIM), which can generate surface normal information from dense depth images with high accuracy and efficiency (a standard normals-from-depth sketch follows this entry).
Our NIM can be deployed in existing convolutional neural networks (CNNs) to refine the segmentation performance.
Our proposed NIM-RTFNet ranks 8th on the KITTI road benchmark and exhibits a real-time inference speed.
arXiv Detail & Related papers (2020-08-26T05:44:07Z)
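The NIM entry turns dense depth into surface normals. One standard way to do this (not necessarily the paper's NIM formulation) is to back-project depth into a 3D point map using the camera intrinsics and take the cross product of the horizontal and vertical tangent vectors:

```python
import numpy as np

def normals_from_depth(depth, fx, fy, cx, cy):
    """Estimate per-pixel surface normals from a dense depth image.

    A standard finite-difference approach, not necessarily the paper's NIM:
    back-project to a 3D point map, then cross the image-axis tangents.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Back-project each pixel to camera coordinates.
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    points = np.stack([x, y, depth], axis=-1)  # (h, w, 3)
    # Tangent vectors along the image axes.
    du = np.gradient(points, axis=1)
    dv = np.gradient(points, axis=0)
    n = np.cross(dv, du)                       # (h, w, 3)
    n /= np.linalg.norm(n, axis=-1, keepdims=True) + 1e-8
    return n

# Toy check: a fronto-parallel plane at 2 m yields normals ~ (0, 0, -1),
# i.e., pointing back toward the camera.
depth = np.full((480, 640), 2.0)
n = normals_from_depth(depth, fx=525.0, fy=525.0, cx=320.0, cy=240.0)
print(n[240, 320])
```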
- Anchor-free Small-scale Multispectral Pedestrian Detection [88.7497134369344]
We propose a method for effective and efficient multispectral fusion of the two modalities in an adapted single-stage anchor-free base architecture.
We aim to learn pedestrian representations based on object center and scale rather than direct bounding box predictions.
Results show our method's effectiveness in detecting small-scale pedestrians.
arXiv Detail & Related papers (2020-08-19T13:13:01Z)
- SUPER: A Novel Lane Detection System [26.417172945374364]
We propose a real-time lane detection system, called Scene Understanding Physics-Enhanced Real-time (SUPER) algorithm.
We train the proposed system using heterogeneous data from Cityscapes, Vistas and Apollo, and evaluate the performance on four completely separate datasets.
Preliminary test results show promising real-time lane-detection performance compared with Mobileye.
arXiv Detail & Related papers (2020-05-14T21:40:39Z)
This list is automatically generated from the titles and abstracts of the papers indexed on this site.
The site does not guarantee the accuracy of this information and is not responsible for any consequences arising from its use.