Synthetic Map Generation to Provide Unlimited Training Data for
Historical Map Text Detection
- URL: http://arxiv.org/abs/2112.06104v1
- Date: Sun, 12 Dec 2021 00:27:03 GMT
- Title: Synthetic Map Generation to Provide Unlimited Training Data for
Historical Map Text Detection
- Authors: Zekun Li, Runyu Guan, Qianmu Yu, Yao-Yi Chiang and Craig A. Knoblock
- Abstract summary: We propose a method to automatically generate an unlimited amount of annotated historical map images for training text detection models.
We show that the state-of-the-art text detection models can benefit from the synthetic historical maps.
- Score: 5.872532529455414
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many historical map sheets are publicly available for studies that require
long-term historical geographic data. The cartographic design of these maps
includes a combination of map symbols and text labels. Automatically reading
text labels from map images could greatly speed up map interpretation and
help generate rich metadata describing the map content. Many text detection
algorithms have been proposed to locate text regions in map images
automatically, but most of the algorithms are trained on out-of-domain datasets
(e.g., scenic images). Training data determines the quality of machine learning
models, and manually annotating text regions in map images is labor-intensive
and time-consuming. On the other hand, existing geographic data sources, such
as OpenStreetMap (OSM), contain machine-readable map layers, which allow us
to separate out the text layer and obtain text label annotations easily.
However, the cartographic styles between OSM map tiles and historical maps are
significantly different. This paper proposes a method to automatically generate
an unlimited amount of annotated historical map images for training text
detection models. We use a style transfer model to convert contemporary map
images into historical style and place text labels upon them. We show that the
state-of-the-art text detection models (e.g., PSENet) can benefit from the
synthetic historical maps and achieve significant improvement for historical
map text detection.
Related papers
- An Efficient System for Automatic Map Storytelling -- A Case Study on Historical Maps [11.037615422309296]
Historical maps provide valuable information and knowledge about the past.
As they often feature non-standard projections, hand-drawn styles, and artistic elements, it is challenging for non-experts to identify and interpret them.
While existing image captioning methods have achieved remarkable success on natural images, their performance on maps is suboptimal, as maps are underrepresented in their pre-training data.
Despite recent advances of GPT-4 in text recognition and map captioning, it still has a limited understanding of maps, as its performance wanes when texts in maps are missing or inaccurate.
We propose a novel decision tree architecture to generate only relevant captions.
arXiv Detail & Related papers (2024-10-21T08:45:26Z) - Efficiently Leveraging Linguistic Priors for Scene Text Spotting [63.22351047545888]
This paper proposes a method that leverages linguistic knowledge from a large text corpus to replace the traditional one-hot encoding used in auto-regressive scene text spotting and recognition models.
We generate text distributions that align well with scene text datasets, removing the need for in-domain fine-tuning.
Experimental results show that our method not only improves recognition accuracy but also enables more accurate localization of words.
arXiv Detail & Related papers (2024-02-27T01:57:09Z) - CLIM: Contrastive Language-Image Mosaic for Region Representation [58.05870131126816]
Contrastive Language-Image Mosaic (CLIM) is a novel approach for aligning region and text representations.
CLIM consistently improves different open-vocabulary object detection methods.
It can effectively enhance the region representation of vision-language models.
arXiv Detail & Related papers (2023-12-18T17:39:47Z) - CartoMark: a benchmark dataset for map pattern recognition and 1 map
content retrieval with machine intelligence [9.652629004863364]
We develop a large-scale benchmark dataset for map text annotation recognition, map scene classification, map super-resolution reconstruction, and map style transferring.
These well-labelled datasets would facilitate the state-of-the-art machine intelligence technologies to conduct map feature detection, map pattern recognition and map content retrieval.
arXiv Detail & Related papers (2023-12-14T01:54:38Z) - The mapKurator System: A Complete Pipeline for Extracting and Linking
Text from Historical Maps [7.209761597734092]
mapKurator is an end-to-end system integrating machine learning models with a comprehensive data processing pipeline.
We deployed the mapKurator system and enabled the processing of over 60,000 maps and over 100 million text/place names in the David Rumsey Historical Map collection.
arXiv Detail & Related papers (2023-06-29T16:05:40Z) - SNAP: Self-Supervised Neural Maps for Visual Positioning and Semantic
Understanding [57.108301842535894]
We introduce SNAP, a deep network that learns rich neural 2D maps from ground-level and overhead images.
We train our model to align neural maps estimated from different inputs, supervised only with camera poses over tens of millions of StreetView images.
SNAP can resolve the location of challenging image queries beyond the reach of traditional methods.
arXiv Detail & Related papers (2023-06-08T17:54:47Z) - SpaText: Spatio-Textual Representation for Controllable Image Generation [61.89548017729586]
SpaText is a new method for text-to-image generation using open-vocabulary scene control.
In addition to a global text prompt that describes the entire scene, the user provides a segmentation map.
We show its effectiveness on two state-of-the-art diffusion models: pixel-based and latent-conditional-based.
arXiv Detail & Related papers (2022-11-25T18:59:10Z) - An Automatic Approach for Generating Rich, Linked Geo-Metadata from
Historical Map Images [6.962949867017594]
This paper presents an end-to-end approach to address the real-world problem of finding and indexing historical map images.
We have implemented the approach in a system called mapKurator.
arXiv Detail & Related papers (2021-12-03T01:44:38Z) - MapReader: A Computer Vision Pipeline for the Semantic Exploration of
Maps at Scale [1.5894241142512051]
We present MapReader, a free, open-source software library written in Python for analyzing large map collections (scanned or born-digital).
MapReader allows users with little or no computer vision expertise to retrieve maps via web-servers.
We show how the outputs from the MapReader pipeline can be linked to other, external datasets.
arXiv Detail & Related papers (2021-11-30T17:37:01Z) - Semantic Image Alignment for Vehicle Localization [111.59616433224662]
We present a novel approach to vehicle localization in dense semantic maps using semantic segmentation from a monocular camera.
In contrast to existing visual localization approaches, the system does not require additional keypoint features, handcrafted localization landmark extractors or expensive LiDAR sensors.
arXiv Detail & Related papers (2021-10-08T14:40:15Z) - Cross-Descriptor Visual Localization and Mapping [81.16435356103133]
Visual localization and mapping is the key technology underlying the majority of Mixed Reality and robotics systems.
We present three novel scenarios for localization and mapping which require the continuous update of feature representations.
Our data-driven approach is agnostic to the feature descriptor type, has low computational requirements, and scales linearly with the number of description algorithms.
arXiv Detail & Related papers (2020-12-02T18:19:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.