StreetSurfaceVis: a dataset of crowdsourced street-level imagery annotated by road surface type and quality
- URL: http://arxiv.org/abs/2407.21454v3
- Date: Wed, 25 Sep 2024 12:24:35 GMT
- Title: StreetSurfaceVis: a dataset of crowdsourced street-level imagery annotated by road surface type and quality
- Authors: Alexandra Kapp, Edith Hoffmann, Esther Weigmann, Helena Mihaljević
- Abstract summary: We introduce StreetSurfaceVis, a novel dataset comprising 9,122 street-level images mostly from Germany.
We aim to enable robust models that maintain high accuracy across diverse image sources.
- Score: 41.94295877935867
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Road unevenness significantly impacts the safety and comfort of traffic participants, especially vulnerable groups such as cyclists and wheelchair users. To train models for comprehensive road surface assessments, we introduce StreetSurfaceVis, a novel dataset comprising 9,122 street-level images mostly from Germany collected from a crowdsourcing platform and manually annotated by road surface type and quality. By crafting a heterogeneous dataset, we aim to enable robust models that maintain high accuracy across diverse image sources. As the frequency distribution of road surface types and qualities is highly imbalanced, we propose a sampling strategy incorporating various external label prediction resources to ensure sufficient images per class while reducing manual annotation. More precisely, we estimate the impact of (1) enriching the image data with OpenStreetMap tags, (2) iterative training and application of a custom surface type classification model, (3) amplifying underrepresented classes through prompt-based classification with GPT-4o and (4) similarity search using image embeddings. Combining these strategies effectively reduces manual annotation workload while ensuring sufficient class representation.
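Strategy (4), similarity search over image embeddings, amounts to a nearest-neighbour lookup: given the embedding of a labeled image from an underrepresented class, retrieve the most similar images from the unlabeled pool. A minimal sketch using cosine similarity over precomputed embedding vectors (e.g. from a CLIP-style encoder); the function and variable names are illustrative, not from the paper:

```python
import numpy as np

def cosine_similarity_search(query: np.ndarray, embeddings: np.ndarray, k: int = 5) -> np.ndarray:
    """Return indices of the k pool embeddings most similar to the query.

    query:      (d,) embedding of a labeled image from an underrepresented class
    embeddings: (n, d) embeddings of the unlabeled image pool
    """
    # Normalize so dot products become cosine similarities.
    q = query / np.linalg.norm(query)
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    scores = e @ q
    # Highest-scoring candidates first.
    return np.argsort(scores)[::-1][:k]

# Toy example: retrieve pool images closest to a query embedding.
rng = np.random.default_rng(0)
pool = rng.normal(size=(100, 8))
query = pool[42] + 0.01 * rng.normal(size=8)  # near-duplicate of pool item 42
top = cosine_similarity_search(query, pool, k=3)
```

The retrieved candidates would then be sent to manual annotation, concentrating labeling effort on images likely to belong to the rare class.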
Related papers
- AIDOVECL: AI-generated Dataset of Outpainted Vehicles for Eye-level Classification and Localization [0.0]
This work introduces a novel approach that leverages outpainting to address the problem of annotated data scarcity.
We apply this technique to a particularly acute challenge in autonomous driving, urban planning, and environmental monitoring.
Augmentation with outpainted vehicles improves overall performance metrics by up to 8% and enhances prediction of underrepresented classes by up to 20%.
arXiv Detail & Related papers (2024-10-31T16:46:23Z) - SurfaceAI: Automated creation of cohesive road surface quality datasets based on open street-level imagery [41.94295877935867]
SurfaceAI generates comprehensive georeferenced datasets on road surface type and quality from openly available street-level imagery.
The motivation stems from the significant impact of road unevenness on the safety and comfort of traffic participants.
arXiv Detail & Related papers (2024-09-27T17:13:25Z) - ImPoster: Text and Frequency Guidance for Subject Driven Action Personalization using Diffusion Models [55.43801602995778]
We present ImPoster, a novel algorithm for generating a target image of a 'source' subject performing a 'driving' action.
Our approach is completely unsupervised and does not require any access to additional annotations like keypoints or pose.
arXiv Detail & Related papers (2024-09-24T01:25:19Z) - Towards Zero-Shot Annotation of the Built Environment with Vision-Language Models (Vision Paper) [8.071443524030302]
Equitable urban transportation applications require high-fidelity digital representations of the built environment.
We consider vision language models as a mechanism for annotating diverse urban features from satellite images.
We demonstrate proof-of-concept combining a state-of-the-art vision language model and variants of a prompting strategy.
arXiv Detail & Related papers (2024-08-01T21:50:23Z) - Improving classification of road surface conditions via road area extraction and contrastive learning [2.9109581496560044]
We introduce a segmentation model that focuses the downstream classification model only on the road surface in the image.
Our experiments on the public RTK dataset demonstrate that our proposed method achieves a significant improvement.
arXiv Detail & Related papers (2024-07-19T15:43:16Z) - NeRO: Neural Road Surface Reconstruction [15.99050337416157]
This paper introduces a positional-encoding Multi-Layer Perceptron (MLP) framework to reconstruct road surfaces, taking world coordinates x and y as input and outputting height, color, and semantic information.
The method's effectiveness is demonstrated by its compatibility with a variety of road height sources such as vehicle camera poses, LiDAR point clouds, and SfM point clouds, its robustness to semantic noise in images such as sparse labels and noisy semantic predictions, and its fast training speed.
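The positional encoding in such MLP frameworks typically maps the low-dimensional (x, y) coordinates to a higher-dimensional feature vector via sinusoids at multiple frequencies, which lets a small network represent high-frequency surface variation. A minimal sketch; the function name and frequency count are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def positional_encoding(xy: np.ndarray, num_freqs: int = 6) -> np.ndarray:
    """Map (n, 2) world coordinates to (n, 4 * num_freqs) Fourier features.

    Each coordinate is expanded with sin/cos at frequencies 2^0 .. 2^(num_freqs-1).
    """
    freqs = 2.0 ** np.arange(num_freqs)      # (num_freqs,)
    angles = xy[:, :, None] * freqs          # (n, 2, num_freqs)
    feats = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return feats.reshape(xy.shape[0], -1)    # (n, 4 * num_freqs)

pts = np.array([[0.0, 0.0], [1.5, -2.0]])
enc = positional_encoding(pts)
```

The encoded features, rather than the raw coordinates, are fed to the MLP that predicts height, color, and semantics.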
arXiv Detail & Related papers (2024-05-17T05:41:45Z) - Homography Guided Temporal Fusion for Road Line and Marking Segmentation [73.47092021519245]
Road lines and markings are frequently occluded in the presence of moving vehicles, shadow, and glare.
We propose a Homography Guided Fusion (HomoFusion) module to exploit temporally-adjacent video frames for complementary cues.
We show that exploiting available camera intrinsic data and ground plane assumption for cross-frame correspondence can lead to a light-weight network with significantly improved performances in speed and accuracy.
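Under a ground-plane assumption, cross-frame correspondence reduces to warping pixel coordinates by a 3x3 homography. A minimal sketch of such a warp; names are illustrative and this is not the paper's implementation:

```python
import numpy as np

def warp_points(H: np.ndarray, pts: np.ndarray) -> np.ndarray:
    """Apply a 3x3 homography H to (n, 2) pixel coordinates."""
    homo = np.hstack([pts, np.ones((pts.shape[0], 1))])  # lift to homogeneous coords
    mapped = homo @ H.T
    return mapped[:, :2] / mapped[:, 2:3]                # back to Cartesian

# Sanity check: the identity homography leaves points unchanged.
pts = np.array([[10.0, 20.0], [30.0, 40.0]])
out = warp_points(np.eye(3), pts)
```

In a temporal-fusion setting, warping a previous frame's road-plane pixels into the current frame in this way lets the network aggregate complementary cues where lines and markings are occluded.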
arXiv Detail & Related papers (2024-04-11T10:26:40Z) - CarPatch: A Synthetic Benchmark for Radiance Field Evaluation on Vehicle Components [77.33782775860028]
We introduce CarPatch, a novel synthetic benchmark of vehicles.
In addition to a set of images annotated with their intrinsic and extrinsic camera parameters, the corresponding depth maps and semantic segmentation masks have been generated for each view.
Global and part-based metrics have been defined and used to evaluate, compare, and better characterize some state-of-the-art techniques.
arXiv Detail & Related papers (2023-07-24T11:59:07Z) - Few-Shot Learning with Part Discovery and Augmentation from Unlabeled Images [79.34600869202373]
We show that inductive bias can be learned from a flat collection of unlabeled images, and instantiated as transferable representations among seen and unseen classes.
Specifically, we propose a novel part-based self-supervised representation learning scheme to learn transferable representations.
Our method yields impressive results, outperforming the previous best unsupervised methods by 7.74% and 9.24%.
arXiv Detail & Related papers (2021-05-25T12:22:11Z) - Data-driven Meta-set Based Fine-Grained Visual Classification [61.083706396575295]
We propose a data-driven meta-set based approach to deal with noisy web images for fine-grained recognition.
Specifically, guided by a small amount of clean meta-set, we train a selection net in a meta-learning manner to distinguish in- and out-of-distribution noisy images.
arXiv Detail & Related papers (2020-08-06T03:04:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.