PseudoMapTrainer: Learning Online Mapping without HD Maps
- URL: http://arxiv.org/abs/2508.18788v1
- Date: Tue, 26 Aug 2025 08:13:30 GMT
- Title: PseudoMapTrainer: Learning Online Mapping without HD Maps
- Authors: Christian Löwens, Thorben Funke, Jingchao Xie, Alexandru Paul Condurache
- Abstract summary: PseudoMapTrainer is a novel approach to online mapping that uses pseudo-labels generated from unlabeled sensor data. We derive those pseudo-labels by reconstructing the road surface from multi-camera imagery. Our pseudo-labels can be effectively used to pre-train an online model in a semi-supervised manner.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Online mapping models show remarkable results in predicting vectorized maps from multi-view camera images only. However, all existing approaches still rely on ground-truth high-definition maps during training, which are expensive to obtain and often not geographically diverse enough for reliable generalization. In this work, we propose PseudoMapTrainer, a novel approach to online mapping that uses pseudo-labels generated from unlabeled sensor data. We derive those pseudo-labels by reconstructing the road surface from multi-camera imagery using Gaussian splatting and semantics of a pre-trained 2D segmentation network. In addition, we introduce a mask-aware assignment algorithm and loss function to handle partially masked pseudo-labels, allowing for the first time the training of online mapping models without any ground-truth maps. Furthermore, our pseudo-labels can be effectively used to pre-train an online model in a semi-supervised manner to leverage large-scale unlabeled crowdsourced data. The code is available at github.com/boschresearch/PseudoMapTrainer.
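The mask-aware loss is what makes training on partial pseudo-labels possible: points whose road-surface reconstruction is occluded or unreliable must not contribute supervision. The abstract does not give the formulation, so the following is only a minimal sketch of the idea; the function name, the array shapes, and the plain L1 distance are assumptions, not the paper's actual loss.

```python
import numpy as np

def masked_point_loss(pred, target, valid_mask):
    """L1 loss over map-element points, ignoring masked (unreliable) points.

    pred, target: (N, 2) arrays of 2D polyline points.
    valid_mask:   (N,) boolean array; False marks points whose pseudo-label
                  is occluded or unreliable and must not drive the gradient.
    """
    diff = np.abs(pred - target).sum(axis=1)           # per-point L1 distance
    n_valid = valid_mask.sum()
    if n_valid == 0:
        return 0.0                                     # nothing supervisable
    return float((diff * valid_mask).sum() / n_valid)  # mean over valid points
```

Normalizing by the number of valid points (rather than N) keeps the loss scale comparable between fully and partially observed map elements.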
Related papers
- MapRF: Weakly Supervised Online HD Map Construction via NeRF-Guided Self-Training
MapRF is a weakly supervised framework that learns to construct 3D maps using only 2D image labels. To mitigate error accumulation during self-training, we propose a Map-to-Ray Matching strategy.
arXiv Detail & Related papers (2025-11-24T07:23:10Z)
- Semantic Segmentation for Sequential Historical Maps by Learning from Only One Map
We propose an automated approach to digitization using deep-learning-based semantic segmentation. A key challenge in this process is the lack of ground-truth annotations required for training deep neural networks. We introduce a weakly-supervised age-tracing strategy for model fine-tuning.
arXiv Detail & Related papers (2025-01-03T14:55:22Z)
- Neural Semantic Surface Maps
We present an automated technique for computing a map between two genus-zero shapes, which matches semantically corresponding regions to one another.
Our approach can generate semantic surface-to-surface maps, eliminating manual annotations or any 3D training data requirement.
arXiv Detail & Related papers (2023-09-09T16:21:56Z)
- SNAP: Self-Supervised Neural Maps for Visual Positioning and Semantic Understanding
We introduce SNAP, a deep network that learns rich neural 2D maps from ground-level and overhead images.
We train our model to align neural maps estimated from different inputs, supervised only with camera poses over tens of millions of StreetView images.
SNAP can resolve the location of challenging image queries beyond the reach of traditional methods.
arXiv Detail & Related papers (2023-06-08T17:54:47Z)
- Is Cross-modal Information Retrieval Possible without Training?
We take a simple mapping, computed via least squares and the singular value decomposition (SVD), as a solution to the Procrustes problem.
That is, given information in one modality, such as text, the mapping helps us locate a semantically equivalent data item in another modality, such as image.
Using off-the-shelf pretrained deep learning models, we have experimented with this simple cross-modal mapping in text-to-image and image-to-text retrieval tasks.
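For concreteness, the classic closed-form solution to the orthogonal Procrustes problem that this summary alludes to fits in a few lines of NumPy; variable names and the retrieval setup are illustrative, not taken from the paper.

```python
import numpy as np

def procrustes_map(X, Y):
    """Orthogonal map W minimizing ||X @ W - Y||_F (Procrustes problem).

    X: (n, d) source-modality embeddings (e.g. text)
    Y: (n, d) target-modality embeddings (e.g. image)
    Returns the (d, d) orthogonal matrix W = U @ Vt, where U, Vt come
    from the SVD of the cross-covariance X.T @ Y.
    """
    U, _, Vt = np.linalg.svd(X.T @ Y)  # SVD of the cross-covariance
    return U @ Vt                      # closed-form optimum over orthogonal W
```

Retrieval then reduces to mapping a query embedding through W and taking nearest neighbors in the other modality's embedding space.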
arXiv Detail & Related papers (2023-04-20T02:36:18Z)
- Sketch-Guided Text-to-Image Diffusion Models
We introduce a universal approach to guide a pretrained text-to-image diffusion model.
Our method does not require training a dedicated model or a specialized encoder for the task.
We take a particular focus on the sketch-to-image translation task, revealing a robust and expressive way to generate images.
arXiv Detail & Related papers (2022-11-24T18:45:32Z)
- 3SD: Self-Supervised Saliency Detection With No Labels
We present a conceptually simple self-supervised method for saliency detection.
Our method generates and uses pseudo-ground truth labels for training.
arXiv Detail & Related papers (2022-03-09T01:40:28Z) - One Thing One Click: A Self-Training Approach for Weakly Supervised 3D
Semantic Segmentation [78.36781565047656]
We propose "One Thing One Click," meaning that the annotator only needs to label one point per object.
We iteratively conduct the training and label propagation, facilitated by a graph propagation module.
Our results are also comparable to those of the fully supervised counterparts.
arXiv Detail & Related papers (2021-04-06T02:27:25Z) - Improving Semantic Segmentation via Self-Training [75.07114899941095]
We show that we can obtain state-of-the-art results using a semi-supervised approach, specifically a self-training paradigm.
We first train a teacher model on labeled data, and then generate pseudo labels on a large set of unlabeled data.
Our robust training framework can digest human-annotated and pseudo labels jointly and achieve top performances on Cityscapes, CamVid and KITTI datasets.
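The pseudo-labeling step in such a teacher–student pipeline is commonly implemented as confidence-thresholded teacher predictions. The sketch below illustrates that pattern only; the threshold value and the ignore-index convention are assumptions, not the paper's settings.

```python
import numpy as np

def pseudo_labels(teacher_logits, threshold=0.9, ignore_index=-1):
    """Convert per-pixel teacher logits of shape (..., C) into hard
    pseudo-labels, keeping only predictions above the confidence threshold."""
    # numerically stable softmax over the class axis
    z = teacher_logits - teacher_logits.max(axis=-1, keepdims=True)
    probs = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    labels = probs.argmax(axis=-1)
    # low-confidence pixels are marked so the student loss skips them
    labels[probs.max(axis=-1) < threshold] = ignore_index
    return labels
```

The student is then trained on the union of human-annotated labels and these pseudo-labels, with the ignore index excluded from the loss.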
arXiv Detail & Related papers (2020-04-30T17:09:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this listing (including all information) and is not responsible for any consequences of its use.