Related papers: An Automatic Approach for Generating Rich, Linked Geo-Metadata from Historical Map Images

URL: http://arxiv.org/abs/2112.01671v1
Date: Fri, 3 Dec 2021 01:44:38 GMT
Title: An Automatic Approach for Generating Rich, Linked Geo-Metadata from Historical Map Images
Authors: Zekun Li, Yao-Yi Chiang, Sasan Tavakkol, Basel Shbita, Johannes H. Uhl, Stefan Leyk, and Craig A. Knoblock
Abstract summary: This paper presents an end-to-end approach to address the real-world problem of finding and indexing historical map images. We have implemented the approach in a system called mapKurator.
Score: 6.962949867017594
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Historical maps contain detailed geographic information that is difficult to find elsewhere and cover long periods of time (e.g., 125 years for the historical topographic maps in the US). However, these maps typically exist as scanned images without searchable metadata. Existing approaches to making historical maps searchable rely on tedious manual work (including crowd-sourcing) to generate the metadata (e.g., geolocations and keywords). Optical character recognition (OCR) software could alleviate the required manual work, but the recognition results are individual words instead of location phrases (e.g., "Black" and "Mountain" vs. "Black Mountain"). This paper presents an end-to-end approach to address the real-world problem of finding and indexing historical map images. This approach automatically processes historical map images to extract their text content and generates a set of metadata that is linked to large external geospatial knowledge bases. The linked metadata in the RDF (Resource Description Framework) format support complex queries for finding and indexing historical maps, such as retrieving all historical maps covering mountain peaks higher than 1,000 meters in California. We have implemented the approach in a system called mapKurator. We have evaluated mapKurator using historical maps from several sources with various map styles, scales, and coverage. Our results show significant improvement over the state-of-the-art methods. The code has been made publicly available as modules of the Kartta Labs project at https://github.com/kartta-labs/Project.
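The example query in the abstract (maps covering mountain peaks higher than 1,000 meters in California) can be illustrated with a minimal sketch over toy subject-predicate-object triples. This is not mapKurator's code or schema: the identifiers (`map:001`, `peak:A`), the predicate names (`covers`, `type`, `elevation_m`, `state`), and all values are hypothetical, and plain Python tuples stand in for a real RDF store and query language.

```python
# Toy linked metadata as (subject, predicate, object) triples.
# Every identifier, predicate name, and value here is made up for illustration.
TRIPLES = [
    ("map:001", "covers", "peak:A"),
    ("map:002", "covers", "peak:B"),
    ("map:003", "covers", "peak:C"),
    ("peak:A", "type", "MountainPeak"),
    ("peak:A", "elevation_m", 1450),
    ("peak:A", "state", "California"),
    ("peak:B", "type", "MountainPeak"),
    ("peak:B", "elevation_m", 820),
    ("peak:B", "state", "California"),
    ("peak:C", "type", "MountainPeak"),
    ("peak:C", "elevation_m", 2100),
    ("peak:C", "state", "Colorado"),
]


def objects(triples, subject, predicate):
    """Return all objects attached to `subject` via `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def maps_covering_high_peaks(triples, state, min_elevation_m):
    """Find maps that cover a mountain peak in `state` above `min_elevation_m`."""
    result = set()
    for subject, predicate, feature in triples:
        if predicate != "covers":
            continue
        is_peak = "MountainPeak" in objects(triples, feature, "type")
        in_state = state in objects(triples, feature, "state")
        is_high = any(
            elevation > min_elevation_m
            for elevation in objects(triples, feature, "elevation_m")
        )
        if is_peak and in_state and is_high:
            result.add(subject)
    return sorted(result)


print(maps_covering_high_peaks(TRIPLES, "California", 1000))  # ['map:001']
```

The point of the sketch is that the map record itself only stores a link to a feature; elevation and state come from the linked entity, which is why linking to an external knowledge base enables queries that word-level OCR output alone cannot answer.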

Related papers

Hyper-Local Deformable Transformers for Text Spotting on Historical Maps [2.423679070137552]
Text on historical maps contains valuable information providing georeferenced historical, political, and cultural contexts. Previous approaches use ad-hoc steps tailored to only specific map styles. Recent machine learning-based text spotters have the potential to solve these challenges. This paper proposes PALETTE, an end-to-end text spotter for scanned historical maps.
arXiv Detail & Related papers (2025-06-17T22:41:10Z)
MapQaTor: A System for Efficient Annotation of Map Query Datasets [3.3856216159724983]
MapQaTor is a web application that streamlines the creation of reproducible, traceable map-based QA datasets. With its plug-and-play architecture, MapQaTor enables seamless integration with any maps API.
arXiv Detail & Related papers (2024-12-30T15:33:19Z)
MapExplorer: New Content Generation from Low-Dimensional Visualizations [60.02149343347818]
Low-dimensional visualizations, or "projection maps," are widely used to interpret large-scale and complex datasets. These visualizations not only aid in understanding existing knowledge spaces but also implicitly guide exploration into unknown areas. We introduce MapExplorer, a novel knowledge discovery task that translates coordinates within any projection map into coherent, contextually aligned textual content.
arXiv Detail & Related papers (2024-12-24T20:16:13Z)
An Efficient System for Automatic Map Storytelling -- A Case Study on Historical Maps [11.037615422309296]
Historical maps provide valuable information and knowledge about the past. As they often feature non-standard projections, hand-drawn styles, and artistic elements, it is challenging for non-experts to identify and interpret them. Although existing image captioning methods have achieved remarkable success on natural images, their performance on maps is suboptimal because maps are underrepresented in their pre-training process. Despite the recent advance of GPT-4 in text recognition and map captioning, it still has a limited understanding of maps, as its performance wanes when texts in maps are missing or inaccurate. We propose a novel decision tree architecture to only generate captions relevant
arXiv Detail & Related papers (2024-10-21T08:45:26Z)
Integrating Visual and Textual Inputs for Searching Large-Scale Map Collections with CLIP [0.09208007322096533]
We explore the potential for interactively searching large-scale map collections using natural language inputs. As a case study, we adopt 562,842 images of maps publicly accessible via the Library of Congress's API. We present results for example searches created in consultation with staff in the Library of Congress's Geography and Map Division.
arXiv Detail & Related papers (2024-10-02T02:51:02Z)
GeoCLIP: Clip-Inspired Alignment between Locations and Images for Effective Worldwide Geo-localization [61.10806364001535]
Worldwide Geo-localization aims to pinpoint the precise location of images taken anywhere on Earth. Existing approaches divide the globe into discrete geographic cells, transforming the problem into a classification task. We propose GeoCLIP, a novel CLIP-inspired Image-to-GPS retrieval approach that enforces alignment between the image and its corresponding GPS locations.
arXiv Detail & Related papers (2023-09-27T20:54:56Z)
The mapKurator System: A Complete Pipeline for Extracting and Linking Text from Historical Maps [7.209761597734092]
mapKurator is an end-to-end system integrating machine learning models with a comprehensive data processing pipeline. We deployed the mapKurator system and enabled the processing of over 60,000 maps and over 100 million text/place names in the David Rumsey Historical Map collection.
arXiv Detail & Related papers (2023-06-29T16:05:40Z)
G^3: Geolocation via Guidebook Grounding [92.46774241823562]
We study explicit knowledge from human-written guidebooks that describe the salient and class-discriminative visual features humans use for geolocation. We propose the task of Geolocation via Guidebook Grounding that uses a dataset of StreetView images from a diverse set of locations. Our approach substantially outperforms a state-of-the-art image-only geolocation method, with an improvement of over 5% in Top-1 accuracy.
arXiv Detail & Related papers (2022-11-28T16:34:40Z)
GAMa: Cross-view Video Geo-localization [68.33955764543465]
We focus on ground videos instead of images, which provide contextual cues. At the clip level, a short video clip is matched with the corresponding aerial image and is later used to get video-level geo-localization of a long video. Our proposed method achieves a Top-1 recall rate of 19.4% and 45.1% @1.0mile.
arXiv Detail & Related papers (2022-07-06T04:25:51Z)
Synthetic Map Generation to Provide Unlimited Training Data for Historical Map Text Detection [5.872532529455414]
We propose a method to automatically generate an unlimited amount of annotated historical map images for training text detection models. We show that the state-of-the-art text detection models can benefit from the synthetic historical maps.
arXiv Detail & Related papers (2021-12-12T00:27:03Z)
MapReader: A Computer Vision Pipeline for the Semantic Exploration of Maps at Scale [1.5894241142512051]
We present MapReader, a free, open-source software library written in Python for analyzing large map collections (scanned or born-digital). MapReader allows users with little or no computer vision expertise to retrieve maps via web-servers. We show how the outputs from the MapReader pipeline can be linked to other, external datasets.
arXiv Detail & Related papers (2021-11-30T17:37:01Z)
HDMapGen: A Hierarchical Graph Generative Model of High Definition Maps [81.86923212296863]
HD maps are maps with precise definitions of road lanes and rich semantics of the traffic rules. Only a small number of real-world road topologies and geometries are available, which significantly limits our ability to test out the self-driving stack. We propose HDMapGen, a hierarchical graph generation model capable of producing high-quality and diverse HD maps.
arXiv Detail & Related papers (2021-06-28T17:59:30Z)
HDNET: Exploiting HD Maps for 3D Object Detection [99.49035895393934]
We show that High-Definition (HD) maps provide strong priors that can boost the performance and robustness of modern 3D object detectors. We design a single stage detector that extracts geometric and semantic features from the HD maps. As maps might not be available everywhere, we also propose a map prediction module that estimates the map on the fly from raw LiDAR data.
arXiv Detail & Related papers (2020-12-21T21:59:54Z)
OpenStreetMap: Challenges and Opportunities in Machine Learning and Remote Sensing [66.23463054467653]
We present a review of recent methods based on machine learning to improve and use OpenStreetMap data. We believe that OSM can change the way we interpret remote sensing data and that the synergy with machine learning can scale participatory map making.
arXiv Detail & Related papers (2020-07-13T09:58:14Z)

This list is automatically generated from the titles and abstracts of the papers on this site.