Aerial Images Meet Crowdsourced Trajectories: A New Approach to Robust
Road Extraction
- URL: http://arxiv.org/abs/2111.15119v1
- Date: Tue, 30 Nov 2021 04:30:10 GMT
- Authors: Lingbo Liu and Zewei Yang and Guanbin Li and Kuo Wang and Tianshui
Chen and Liang Lin
- Abstract summary: We introduce a novel neural network framework termed Cross-Modal Message Propagation Network (CMMPNet).
CMMPNet is composed of two deep Auto-Encoders for modality-specific representation learning and a tailor-designed Dual Enhancement Module for cross-modal representation refinement.
Experiments on three real-world benchmarks demonstrate the effectiveness of our CMMPNet for robust road extraction.
- Score: 110.61383502442598
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Land remote sensing analysis is a crucial research area in earth science. In this
work, we focus on a challenging land analysis task, i.e., automatic
extraction of traffic roads from remote sensing data, which has widespread
applications in urban development and expansion estimation. Nevertheless,
conventional methods either utilize only the limited information in aerial
images or simply fuse multimodal information (e.g., vehicle trajectories),
and thus cannot reliably recognize unconstrained roads. To address this problem, we
introduce a novel neural network framework termed Cross-Modal Message
Propagation Network (CMMPNet), which fully exploits the complementarity of different
modal data (i.e., aerial images and crowdsourced trajectories). Specifically,
CMMPNet is composed of two deep Auto-Encoders for modality-specific
representation learning and a tailor-designed Dual Enhancement Module for
cross-modal representation refinement. In particular, the complementary
information of each modality is comprehensively extracted and dynamically
propagated to enhance the representation of the other modality. Extensive
experiments on three real-world benchmarks demonstrate the effectiveness of
CMMPNet for robust road extraction, which benefits from blending different modal
data, using either image and trajectory data or image and Lidar data. From the
experimental results, we observe that the proposed approach outperforms current
state-of-the-art methods by large margins.
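The abstract describes the complementary information of each modality being propagated to enhance the other. The snippet below is a minimal sketch of that general idea (gated, two-way message passing with a residual update). It is not the authors' Dual Enhancement Module: the function names `gated_message` and `dual_enhance`, the scalar `weight` and `bias` parameters (stand-ins for learned layers), and the use of flat feature lists instead of feature maps are all assumptions made for illustration.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_message(source, target, weight=1.0, bias=0.0):
    """Build a message from `source` features for the `target` modality.

    Hypothetical gate: each source element is scaled by a sigmoid of the
    combined source and target activations. `weight` and `bias` stand in
    for parameters that a real network would learn.
    """
    return [sigmoid(weight * (s + t) + bias) * s for s, t in zip(source, target)]


def dual_enhance(image_feat, traj_feat):
    """Enhance each modality with a gated message from the other one."""
    msg_to_image = gated_message(traj_feat, image_feat)
    msg_to_traj = gated_message(image_feat, traj_feat)
    # Residual update: original features plus the incoming message.
    enhanced_image = [f + m for f, m in zip(image_feat, msg_to_image)]
    enhanced_traj = [f + m for f, m in zip(traj_feat, msg_to_traj)]
    return enhanced_image, enhanced_traj


if __name__ == "__main__":
    # Toy features: the image sees a road at position 1, the trajectories at position 0.
    image_out, traj_out = dual_enhance([0.0, 2.0], [1.0, 0.0])
    print(image_out, traj_out)
```

In this toy example each modality gains a response at the position where only the other modality had evidence, while positions where the other modality is silent are left unchanged.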
Related papers
- Context-Enhanced Multi-View Trajectory Representation Learning: Bridging the Gap through Self-Supervised Models [27.316692263196277]
MVTraj is a novel multi-view modeling method for trajectory representation learning.
It integrates diverse contextual knowledge, from GPS to road network and points-of-interest to provide a more comprehensive understanding of trajectory data.
Extensive experiments on real-world datasets demonstrate that MVTraj significantly outperforms existing baselines in tasks associated with various spatial views.
arXiv Detail & Related papers (2024-10-17T03:56:12Z)
- Think Twice Before Recognizing: Large Multimodal Models for General Fine-grained Traffic Sign Recognition [49.20086587208214]
We propose a new strategy called think twice before recognizing to improve fine-grained traffic sign recognition (TSR).
Our strategy achieves effective fine-grained TSR by stimulating the multiple-thinking capability of large multimodal models (LMM).
arXiv Detail & Related papers (2024-09-03T02:08:47Z)
- More Than Routing: Joint GPS and Route Modeling for Refine Trajectory Representation Learning [26.630640299709114]
We propose Joint GPS and Route Modelling based on self-supervised technology, namely JGRM.
We develop two encoders, each tailored to capture representations of route and GPS trajectories respectively.
The representations from the two modalities are fed into a shared transformer for inter-modal information interaction.
arXiv Detail & Related papers (2024-02-25T18:27:25Z)
- Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that combines computer vision and natural language processing.
Traditional Referring Image Segmentation (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery.
We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z)
- Road detection via a dual-task network based on cross-layer graph fusion modules [2.8197257696982287]
We propose a dual-task network (DTnet) for road detection and a cross-layer graph fusion module (CGM).
CGM improves cross-layer fusion through a complex feature stream graph, and four graph patterns are evaluated.
arXiv Detail & Related papers (2022-08-17T07:16:55Z)
- DouFu: A Double Fusion Joint Learning Method For Driving Trajectory Representation [13.321587117066166]
We propose a novel multimodal fusion model, DouFu, for trajectory representation joint learning.
We first design movement, route, and global features generated from the trajectory data and urban functional zones.
With the global semantic feature, DouFu produces a comprehensive embedding for each trajectory.
arXiv Detail & Related papers (2022-05-05T07:43:35Z)
- Road Network Guided Fine-Grained Urban Traffic Flow Inference [108.64631590347352]
Accurate inference of fine-grained traffic flow from coarse-grained data is an emerging yet crucial problem.
We propose a novel Road-Aware Traffic Flow Magnifier (RATFM) that exploits the prior knowledge of road networks.
Our method can generate high-quality fine-grained traffic flow maps.
arXiv Detail & Related papers (2021-09-29T07:51:49Z)
- Scribble-based Weakly Supervised Deep Learning for Road Surface Extraction from Remote Sensing Images [7.1577508803778045]
We propose a scribble-based weakly supervised road surface extraction method named ScRoadExtractor.
To propagate semantic information from sparse scribbles to unlabeled pixels, we introduce a road label propagation algorithm.
The proposal masks generated from the road label propagation algorithm are utilized to train a dual-branch encoder-decoder network.
arXiv Detail & Related papers (2020-10-25T12:40:30Z)
- Adaptive Context-Aware Multi-Modal Network for Depth Completion [107.15344488719322]
We propose to adopt the graph propagation to capture the observed spatial contexts.
We then apply the attention mechanism on the propagation, which encourages the network to model the contextual information adaptively.
Finally, we introduce the symmetric gated fusion strategy to exploit the extracted multi-modal features effectively.
Our model, named Adaptive Context-Aware Multi-Modal Network (ACMNet), achieves the state-of-the-art performance on two benchmarks.
arXiv Detail & Related papers (2020-08-25T06:00:06Z)
- X-ModalNet: A Semi-Supervised Deep Cross-Modal Network for Classification of Remote Sensing Data [69.37597254841052]
We propose a novel cross-modal deep-learning framework called X-ModalNet.
X-ModalNet generalizes well, owing to propagating labels on an updatable graph constructed by high-level features on the top of the network.
We evaluate X-ModalNet on two multi-modal remote sensing datasets (HSI-MSI and HSI-SAR) and achieve a significant improvement in comparison with several state-of-the-art methods.
arXiv Detail & Related papers (2020-06-24T15:29:41Z)