MGMap: Mask-Guided Learning for Online Vectorized HD Map Construction
- URL: http://arxiv.org/abs/2404.00876v1
- Date: Mon, 1 Apr 2024 03:13:32 GMT
- Title: MGMap: Mask-Guided Learning for Online Vectorized HD Map Construction
- Authors: Xiaolu Liu, Song Wang, Wentong Li, Ruizi Yang, Junbo Chen, Jianke Zhu
- Abstract summary: We propose MGMap, a mask-guided approach that effectively highlights the informative regions and achieves precise map element localization.
Specifically, MGMap employs learned masks based on the enhanced multi-scale BEV features from two perspectives.
Compared to the baselines, our proposed MGMap achieves a notable improvement of around 10 mAP for different input modalities.
- Score: 17.16231247910372
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Currently, high-definition (HD) map construction is trending towards lightweight, online generation, which aims to preserve timely and reliable road scene information. However, map elements contain strong shape priors, and subtle, sparse annotations make current detection-based frameworks ambiguous in locating the relevant feature scopes, causing the loss of detailed structures in prediction. To alleviate these problems, we propose MGMap, a mask-guided approach that effectively highlights the informative regions and achieves precise map element localization by introducing learned masks. Specifically, MGMap employs learned masks based on the enhanced multi-scale BEV features from two perspectives. At the instance level, we propose the Mask-activated instance (MAI) decoder, which incorporates global instance and structural information into instance queries through the activation of instance masks. At the point level, a novel position-guided mask patch refinement (PG-MPR) module is designed to refine point locations from a finer-grained perspective, enabling the extraction of point-specific patch information. Compared to the baselines, our proposed MGMap achieves a notable improvement of around 10 mAP for different input modalities. Extensive experiments also demonstrate that our approach showcases strong robustness and generalization capabilities. Our code can be found at https://github.com/xiaolul2/MGMap.
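The mask-activated idea described in the abstract can be illustrated with a minimal sketch: a per-query instance mask highlights the informative BEV regions, and masked average pooling aggregates those features back into the instance query. This is a hypothetical NumPy illustration of the general mechanism, not the paper's actual MAI decoder; all shapes and the pooling scheme are assumptions.

```python
import numpy as np

def mask_activated_query_update(bev_feats, queries, mask_logits):
    """Hypothetical sketch of mask-activated query refinement.

    bev_feats:   (C, H, W) fused multi-scale BEV feature map
    queries:     (N, C) instance queries
    mask_logits: (N, H, W) per-query instance mask logits
    Returns refined queries of shape (N, C).
    """
    C, H, W = bev_feats.shape
    # Soft instance masks highlight the informative BEV regions.
    masks = 1.0 / (1.0 + np.exp(-mask_logits))      # sigmoid, (N, H, W)
    flat_feats = bev_feats.reshape(C, H * W)        # (C, HW)
    flat_masks = masks.reshape(-1, H * W)           # (N, HW)
    # Masked average pooling: aggregate features inside each instance mask.
    pooled = flat_masks @ flat_feats.T              # (N, C)
    pooled /= flat_masks.sum(axis=1, keepdims=True) + 1e-6
    # Inject the pooled instance context back into the queries.
    return queries + pooled
```

In the full method this activation would be applied per decoder layer with learned mask heads; the sketch only shows the pooling step.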
Related papers
- MaskInversion: Localized Embeddings via Optimization of Explainability Maps [49.50785637749757]
MaskInversion generates a context-aware embedding for a query image region specified by a mask at test time.
It can be used for a broad range of tasks, including open-vocabulary class retrieval, referring expression comprehension, as well as for localized captioning and image generation.
arXiv Detail & Related papers (2024-07-29T14:21:07Z)
- Mask2Map: Vectorized HD Map Construction Using Bird's Eye View Segmentation Masks [9.113769643415868]
We introduce Mask2Map, a novel end-to-end online HD map construction method designed for autonomous driving applications.
Our approach focuses on predicting the class and ordered point set of map instances within a scene.
Mask2Map achieves remarkable performance improvements over previous state-of-the-art methods.
arXiv Detail & Related papers (2024-07-18T13:48:52Z)
- Downstream Task Guided Masking Learning in Masked Autoencoders Using Multi-Level Optimization [42.82742477950748]
Masked Autoencoder (MAE) is a notable method for self-supervised pretraining in visual representation learning.
We introduce the Multi-level Optimized Mask Autoencoder (MLO-MAE), a novel framework that learns an optimal masking strategy during pretraining.
Our experimental findings highlight MLO-MAE's significant advancements in visual representation learning.
arXiv Detail & Related papers (2024-02-28T07:37:26Z)
- Background Activation Suppression for Weakly Supervised Object Localization and Semantic Segmentation [84.62067728093358]
Weakly supervised object localization and semantic segmentation aim to localize objects using only image-level labels.
A new paradigm has emerged that generates a foreground prediction map to achieve pixel-level localization.
This paper presents two astonishing experimental observations on the object localization learning process.
arXiv Detail & Related papers (2023-09-22T15:44:10Z)
- Hierarchical Dynamic Masks for Visual Explanation of Neural Networks [5.333582981327497]
Saliency methods, which generate visual explanatory maps representing the importance of image pixels for model classification, are a popular technique for explaining neural network decisions.
We propose hierarchical dynamic masks (HDM), a novel explanatory maps generation method, to enhance the granularity and comprehensiveness of saliency maps.
The proposed method outperformed previous approaches significantly in terms of recognition and localization capabilities when tested on natural and medical datasets.
arXiv Detail & Related papers (2023-01-12T12:24:49Z)
- GD-MAE: Generative Decoder for MAE Pre-training on LiDAR Point Clouds [72.60362979456035]
Masked Autoencoders (MAE) are challenging to explore in large-scale 3D point clouds.
We propose a Generative Decoder for MAE (GD-MAE) that automatically merges the surrounding context.
We demonstrate the efficacy of the proposed method on the large-scale KITTI and ONCE benchmarks.
arXiv Detail & Related papers (2022-12-06T14:32:55Z)
- Masked Surfel Prediction for Self-Supervised Point Cloud Learning [40.16043026141161]
We make the first attempt to explicitly incorporate local geometry information into masked auto-encoding, and propose a novel Masked Surfel Prediction (MaskSurf) method.
Specifically, given the input point cloud masked at a high ratio, we learn a transformer-based encoder-decoder network to estimate the underlying masked surfels.
MaskSurf is validated on six downstream tasks under three fine-tuning strategies.
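The "masked at a high ratio" setup above starts from a simple split of the input point cloud into visible and masked subsets; the encoder sees only the visible points and the decoder reconstructs the masked ones. A minimal sketch of that masking step (random masking, assumed ratio; MaskSurf's actual patch-wise masking may differ):

```python
import numpy as np

def mask_point_cloud(points, mask_ratio=0.75, seed=0):
    """Split a point cloud into visible and masked subsets.

    points:     (N, 3) array of xyz coordinates
    mask_ratio: fraction of points hidden from the encoder;
                the decoder must reconstruct their geometry.
    Returns (visible_points, masked_points).
    """
    rng = np.random.default_rng(seed)
    n = points.shape[0]
    n_masked = int(n * mask_ratio)
    perm = rng.permutation(n)
    masked_idx, visible_idx = perm[:n_masked], perm[n_masked:]
    return points[visible_idx], points[masked_idx]
```

In practice masking is usually applied to local patches (groups of neighboring points) rather than individual points, but the visible/masked split is the same.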
arXiv Detail & Related papers (2022-07-07T06:47:26Z)
- Layered Depth Refinement with Mask Guidance [61.10654666344419]
We formulate a novel problem of mask-guided depth refinement that utilizes a generic mask to refine the depth prediction of SIDE models.
Our framework performs layered refinement and inpainting/outpainting, decomposing the depth map into two separate layers signified by the mask and the inverse mask.
We empirically show that our method is robust to different types of masks and initial depth predictions, accurately refining depth values in inner and outer mask boundary regions.
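The layered decomposition described above amounts to splitting the depth map into a layer signified by the mask and one signified by its inverse, refining each, and recombining them. A toy sketch of just the decompose/recompose step (the actual refinement and inpainting/outpainting networks are omitted; function names are illustrative):

```python
import numpy as np

def decompose_depth(depth, mask):
    """Split a depth map into two layers signified by a binary mask
    and its inverse. NaN marks the holes each layer would need to
    inpaint/outpaint before refinement."""
    fg = np.where(mask, depth, np.nan)   # layer inside the mask
    bg = np.where(mask, np.nan, depth)   # layer outside the mask
    return fg, bg

def recompose(fg, bg, mask):
    """Recombine the (refined) layers into a single depth map."""
    return np.where(mask, fg, bg)
```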
arXiv Detail & Related papers (2022-06-07T06:42:44Z)
- Image Inpainting by End-to-End Cascaded Refinement with Mask Awareness [66.55719330810547]
Inpainting arbitrary missing regions is challenging because learning valid features for various masked regions is nontrivial.
We propose a novel mask-aware inpainting solution that learns multi-scale features for missing regions in the encoding phase.
Our framework is validated both quantitatively and qualitatively via extensive experiments on three public datasets.
arXiv Detail & Related papers (2021-04-28T13:17:47Z)
- Spatiotemporal Graph Neural Network based Mask Reconstruction for Video Object Segmentation [70.97625552643493]
This paper addresses the task of segmenting class-agnostic objects in the semi-supervised setting.
We propose a novel graph neural network (TG-Net), which captures local contexts by utilizing all proposals.
arXiv Detail & Related papers (2020-12-10T07:57:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.