Related papers: DiffMap: Enhancing Map Segmentation with Map Prior Using Diffusion Model

URL: http://arxiv.org/abs/2405.02008v1
Date: Fri, 3 May 2024 11:16:27 GMT
Title: DiffMap: Enhancing Map Segmentation with Map Prior Using Diffusion Model
Authors: Peijin Jia, Tuopu Wen, Ziang Luo, Mengmeng Yang, Kun Jiang, Zhiquan Lei, Xuewei Tang, Ziyuan Liu, Le Cui, Kehua Sheng, Bo Zhang, Diange Yang
Abstract summary: We propose DiffMap, a novel approach specifically designed to model the structured priors of map segmentation masks. By incorporating this technique, the performance of existing semantic segmentation methods can be significantly enhanced. Our model demonstrates superior proficiency in generating results that more accurately reflect real-world map layouts.
Score: 13.359878206781044
License: http://creativecommons.org/publicdomain/zero/1.0/
Abstract: Constructing high-definition (HD) maps is a crucial requirement for enabling autonomous driving. In recent years, several map segmentation algorithms have been developed to address this need, leveraging advancements in Bird's-Eye View (BEV) perception. However, existing models still encounter challenges in producing realistic and consistent semantic map layouts. One prominent issue is the limited utilization of structured priors inherent in map segmentation masks. In light of this, we propose DiffMap, a novel approach specifically designed to model the structured priors of map segmentation masks using a latent diffusion model. By incorporating this technique, the performance of existing semantic segmentation methods can be significantly enhanced and certain structural errors present in the segmentation outputs can be effectively rectified. Notably, the proposed module can be seamlessly integrated into any map segmentation model, thereby augmenting its capability to accurately delineate semantic information. Furthermore, through extensive visualization analysis, our model demonstrates superior proficiency in generating results that more accurately reflect real-world map layouts, further validating its efficacy in improving the quality of the generated maps.

Related papers

SMOL-MapSeg: Show Me One Label [0.4499833362998489]
We show that SMOL-MapSeg can accurately segment classes defined by OND knowledge. It can also adapt to unseen classes through few-shot fine-tuning. It outperforms a UNet-based baseline in average segmentation performance.
arXiv Detail & Related papers (2025-08-07T15:36:17Z)
DiffuMatch: Category-Agnostic Spectral Diffusion Priors for Robust Non-rigid Shape Matching [53.39693288324375]
We show that both in-network regularization and functional map training can be replaced with data-driven methods. We first train a generative model of functional maps in the spectral domain using score-based generative modeling. We then exploit the resulting model to promote the structural properties of ground truth functional maps on new shape collections.
arXiv Detail & Related papers (2025-07-31T16:44:54Z)
Bridging Scales in Map Generation: A scale-aware cascaded generative mapping framework for seamless and consistent multi-scale cartographic representation [2.414525855161937]
Multi-scale tile maps are essential for geographic information services, serving as fundamental outcomes of surveying and cartographic. Current approaches face two fundamental challenges: inadequate integration of cartographic generalization principles with dynamic multi-scale generation and spatial discontinuities arising from tile-wise generation. We propose a scale-aware cartographic generation framework (SCGM) that leverages conditional guided diffusion and a multi-scale cascade architecture.
arXiv Detail & Related papers (2025-02-07T15:11:31Z)
TopoSD: Topology-Enhanced Lane Segment Perception with SDMap Prior [70.84644266024571]
We propose to train a perception model to "see" standard definition maps (SDMaps). We encode SDMap elements into neural spatial map representations and instance tokens, and then incorporate such complementary features as prior information. Based on the lane segment representation framework, the model simultaneously predicts lanes, centrelines and their topology.
arXiv Detail & Related papers (2024-11-22T06:13:42Z)
MapSAM: Adapting Segment Anything Model for Automated Feature Detection in Historical Maps [6.414068793245697]
We introduce MapSAM, a parameter-efficient fine-tuning strategy that adapts SAM into a prompt-free and versatile solution for historical map segmentation tasks. Specifically, we employ Weight-Decomposed Low-Rank Adaptation (DoRA) to integrate domain-specific knowledge into the image encoder. We develop an automatic prompt generation process, eliminating the need for manual input.
arXiv Detail & Related papers (2024-11-11T13:18:45Z)
MaskInversion: Localized Embeddings via Optimization of Explainability Maps [49.50785637749757]
MaskInversion generates a context-aware embedding for a query image region specified by a mask at test time. It can be used for a broad range of tasks, including open-vocabulary class retrieval, referring expression comprehension, as well as for localized captioning and image generation.
arXiv Detail & Related papers (2024-07-29T14:21:07Z)
ADMap: Anti-disturbance framework for reconstructing online vectorized HD map [9.218463154577616]
This paper proposes the Anti-disturbance Map reconstruction framework (ADMap). To mitigate point-order jitter, the framework consists of three modules: Multi-Scale Perception Neck, Instance Interactive Attention (IIA), and Vector Direction Difference Loss (VDDL).
arXiv Detail & Related papers (2024-01-24T01:37:27Z)
EmerDiff: Emerging Pixel-level Semantic Knowledge in Diffusion Models [52.3015009878545]
We develop an image segmentor capable of generating fine-grained segmentation maps without any additional training. Our framework identifies semantic correspondences between image pixels and spatial locations of low-dimensional feature maps. In extensive experiments, the produced segmentation maps are demonstrated to be well delineated and capture detailed parts of the images.
arXiv Detail & Related papers (2024-01-22T07:34:06Z)
Stochastic Segmentation with Conditional Categorical Diffusion Models [3.8168879948759953]
We propose a conditional categorical diffusion model (CCDM) for semantic segmentation based on Denoising Diffusion Probabilistic Models. Our results show that CCDM achieves state-of-the-art performance on LIDC, and outperforms established baselines on the classical segmentation dataset Cityscapes.
arXiv Detail & Related papers (2023-03-15T19:16:47Z)
BEVBert: Multimodal Map Pre-training for Language-guided Navigation [75.23388288113817]
We propose a new map-based pre-training paradigm that is spatial-aware for use in vision-and-language navigation (VLN). We build a local metric map to explicitly aggregate incomplete observations and remove duplicates, while modeling navigation dependency in a global topological map. Based on the hybrid map, we devise a pre-training framework to learn a multimodal map representation, which enhances spatial-aware cross-modal reasoning, thereby facilitating the language-guided navigation goal.
arXiv Detail & Related papers (2022-12-08T16:27:54Z)
SegDiff: Image Segmentation with Diffusion Probabilistic Models [81.16986859755038]
Diffusion Probabilistic Methods are employed for state-of-the-art image generation. We present a method for extending such models for performing image segmentation. The method learns end-to-end, without relying on a pre-trained backbone.
arXiv Detail & Related papers (2021-12-01T10:17:25Z)
Transformer-based Map Matching Model with Limited Ground-Truth Data using Transfer-Learning Approach [6.510061176722248]
In many trajectory-based applications, it is necessary to map raw GPS trajectories onto road networks in digital maps. In this paper, we consider the map-matching task from the data perspective, proposing a deep learning-based map-matching model. We generate synthetic trajectory data to pre-train the Transformer model and then fine-tune the model with a limited number of ground-truth data.
arXiv Detail & Related papers (2021-08-01T11:51:11Z)
CAMERAS: Enhanced Resolution And Sanity preserving Class Activation Mapping for image saliency [61.40511574314069]
Backpropagation image saliency aims at explaining model predictions by estimating model-centric importance of individual pixels in the input. We propose CAMERAS, a technique to compute high-fidelity backpropagation saliency maps without requiring any external priors.
arXiv Detail & Related papers (2021-06-20T08:20:56Z)

This list is automatically generated from the titles and abstracts of the papers on this site.