Jointly Learning Representations for Map Entities via Heterogeneous
Graph Contrastive Learning
- URL: http://arxiv.org/abs/2402.06135v1
- Date: Fri, 9 Feb 2024 01:47:18 GMT
- Title: Jointly Learning Representations for Map Entities via Heterogeneous
Graph Contrastive Learning
- Authors: Jiawei Jiang, Yifan Yang, Jingyuan Wang, Junjie Wu
- Abstract summary: We propose a novel method named HOME-GCL for learning representations of multiple categories of map entities.
Our approach utilizes a heterogeneous map entity graph (HOME graph) that integrates both road segments and land parcels into a unified framework.
To the best of our knowledge, HOME-GCL is the first attempt to jointly learn representations for road segments and land parcels using a unified model.
- Score: 38.415692986360995
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The electronic map plays a crucial role in geographic information systems,
serving various urban managerial scenarios and daily life services. Developing
effective Map Entity Representation Learning (MERL) methods is crucial to
extracting embedding information from electronic maps and converting map
entities into representation vectors for downstream applications. However,
existing MERL methods typically focus on one specific category of map entities,
such as POIs, road segments, or land parcels, which is insufficient for
real-world diverse map-based applications and might lose latent structural and
semantic information interacting between entities of different types. Moreover,
using representations generated by separate models for different map entities
can introduce inconsistencies. Motivated by this, we propose a novel method
named HOME-GCL for learning representations of multiple categories of map
entities. Our approach utilizes a heterogeneous map entity graph (HOME graph)
that integrates both road segments and land parcels into a unified framework. A
HOME encoder with parcel-segment joint feature encoding and heterogeneous graph
transformer is then deliberately designed to convert segments and parcels into
representation vectors. Moreover, we introduce two types of contrastive
learning tasks, namely intra-entity and inter-entity tasks, to train the
encoder in a self-supervised manner. Extensive experiments on three large-scale
datasets covering road segment-based, land parcel-based, and trajectory-based
tasks demonstrate the superiority of our approach. To the best of our
knowledge, HOME-GCL is the first attempt to jointly learn representations for
road segments and land parcels using a unified model.
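The intra-entity and inter-entity contrastive tasks described above can be illustrated with a minimal sketch. This is an assumption-laden toy, not the authors' implementation: a plain NumPy InfoNCE loss applied to random stand-in embeddings, where intra-entity contrast pairs two augmented views of the same entity type and inter-entity contrast pairs parcels with segments (a random index pairing here stands in for the real parcel-segment relation in the HOME graph).

```python
# Illustrative sketch only: names, pairings, and the loss form are
# assumptions, not the paper's actual HOME-GCL implementation.
import numpy as np

def info_nce(anchors, positives, temperature=0.2):
    """Standard InfoNCE loss: for each anchor, the same-index row of
    `positives` is the positive; all other rows act as negatives."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                # (N, N) cosine similarities
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))           # -log p(correct pair)

rng = np.random.default_rng(0)
seg = rng.normal(size=(8, 16))                        # road-segment embeddings
seg_aug = seg + 0.05 * rng.normal(size=seg.shape)     # augmented view
par = rng.normal(size=(8, 16))                        # land-parcel embeddings
par_aug = par + 0.05 * rng.normal(size=par.shape)

# Intra-entity tasks: contrast two views of the same entity type.
loss_intra = info_nce(seg, seg_aug) + info_nce(par, par_aug)
# Inter-entity task: contrast parcels against their related segments
# (here an arbitrary index pairing stands in for the real relation).
loss_inter = info_nce(par, seg)
total = loss_intra + loss_inter
```

With random data, each inter-entity term sits near the chance level of log N (about 2.08 for N = 8), while the intra-entity terms are much smaller because the augmented views stay close to their anchors; training the encoder would push all terms down by aligning related entities.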
Related papers
- Context-Enhanced Multi-View Trajectory Representation Learning: Bridging the Gap through Self-Supervised Models [27.316692263196277]
MVTraj is a novel multi-view modeling method for trajectory representation learning.
It integrates diverse contextual knowledge, from GPS to road network and points-of-interest to provide a more comprehensive understanding of trajectory data.
Extensive experiments on real-world datasets demonstrate that MVTraj significantly outperforms existing baselines in tasks associated with various spatial views.
arXiv Detail & Related papers (2024-10-17T03:56:12Z)
- MGMapNet: Multi-Granularity Representation Learning for End-to-End Vectorized HD Map Construction [75.93907511203317]
We propose MGMapNet (Multi-Granularity Map Network) to model map elements with a multi-granularity representation.
The proposed MGMapNet achieves state-of-the-art performance, surpassing MapTRv2 by 5.3 mAP on nuScenes and by 4.4 mAP on Argoverse2.
arXiv Detail & Related papers (2024-10-10T09:05:23Z)
- HPix: Generating Vector Maps from Satellite Images [0.0]
We propose a novel method called HPix, which utilizes modified Generative Adversarial Networks (GANs) to generate vector tile maps from satellite images.
Through empirical evaluations, our proposed approach showcases its effectiveness in producing highly accurate and visually captivating vector tile maps.
We further extend our study's application to include mapping of road intersections and building footprint clusters based on their area.
arXiv Detail & Related papers (2024-07-18T16:54:02Z)
- LISNeRF Mapping: LiDAR-based Implicit Mapping via Semantic Neural Fields for Large-Scale 3D Scenes [2.822816116516042]
Large-scale semantic mapping is crucial for outdoor autonomous agents to fulfill high-level tasks such as planning and navigation.
This paper proposes a novel method for large-scale 3D semantic reconstruction through implicit representations from posed LiDAR measurements alone.
arXiv Detail & Related papers (2023-11-04T03:55:38Z)
- Multi-label affordance mapping from egocentric vision [3.683202928838613]
We present a new approach to affordance perception which enables accurate multi-label segmentation.
Our approach can be used to automatically extract grounded affordances from first person videos.
We show how our metric representation can be exploited to build a map of interaction hotspots.
arXiv Detail & Related papers (2023-09-05T10:56:23Z)
- Learning Implicit Feature Alignment Function for Semantic Segmentation [51.36809814890326]
Implicit Feature Alignment function (IFA) is inspired by the rapidly expanding topic of implicit neural representations.
We show that IFA implicitly aligns the feature maps at different levels and is capable of producing segmentation maps in arbitrary resolutions.
Our method can be combined with various architectures, and it achieves a state-of-the-art accuracy trade-off on common benchmarks.
arXiv Detail & Related papers (2022-06-17T09:40:14Z)
- Improving Lidar-Based Semantic Segmentation of Top-View Grid Maps by Learning Features in Complementary Representations [3.0413873719021995]
We introduce a novel way to predict semantic information from sparse, single-shot LiDAR measurements in the context of autonomous driving.
The approach is aimed specifically at improving the semantic segmentation of top-view grid maps.
For each representation a tailored deep learning architecture is developed to effectively extract semantic information.
arXiv Detail & Related papers (2022-03-02T14:49:51Z)
- Learning Lane Graph Representations for Motion Forecasting [92.88572392790623]
We construct a lane graph from raw map data to preserve the map structure.
We exploit a fusion network consisting of four types of interactions: actor-to-lane, lane-to-lane, lane-to-actor, and actor-to-actor.
Our approach significantly outperforms the state-of-the-art on the large scale Argoverse motion forecasting benchmark.
arXiv Detail & Related papers (2020-07-27T17:59:49Z)
- Bidirectional Graph Reasoning Network for Panoptic Segmentation [126.06251745669107]
We introduce a Bidirectional Graph Reasoning Network (BGRNet) to mine the intra-modular and inter-modular relations within and between foreground things and background stuff classes.
BGRNet first constructs image-specific graphs in both instance and semantic segmentation branches that enable flexible reasoning at the proposal level and class level.
arXiv Detail & Related papers (2020-04-14T02:32:10Z)
- Universal-RCNN: Universal Object Detector via Transferable Graph R-CNN [117.80737222754306]
We present a novel universal object detector called Universal-RCNN.
We first generate a global semantic pool by integrating the high-level semantic representations of all categories.
An Intra-Domain Reasoning Module learns and propagates the sparse graph representation within one dataset guided by a spatial-aware GCN.
arXiv Detail & Related papers (2020-02-18T07:57:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.