A Scalable Machine Learning Pipeline for Building Footprint Detection in Historical Maps
- URL: http://arxiv.org/abs/2508.03564v1
- Date: Tue, 05 Aug 2025 15:33:29 GMT
- Title: A Scalable Machine Learning Pipeline for Building Footprint Detection in Historical Maps
- Authors: Annemarie McCarthy
- Abstract summary: This paper proposes a scalable and efficient pipeline tailored to rural maps with sparse building distributions. The pipeline is validated using test sections from the Ordnance Survey Ireland historical 25 inch map series and 6 inch map series. Notably, the pipeline identified a settlement of approximately 22 buildings in Tully, Co. Galway, present in the 6 inch map, produced in 1839, but absent from the 25 inch map, produced in 1899, suggesting it may have been abandoned during the Great Famine period.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Historical maps offer a valuable lens through which to study past landscapes and settlement patterns. While prior research has leveraged machine learning based techniques to extract building footprints from historical maps, such approaches have largely focused on urban areas and tend to be computationally intensive. This presents a challenge for research questions requiring analysis across extensive rural regions, such as verifying historical census data or locating abandoned settlements. In this paper, this limitation is addressed by proposing a scalable and efficient pipeline tailored to rural maps with sparse building distributions. The method described employs a hierarchical machine learning based approach: convolutional neural network (CNN) classifiers are first used to progressively filter out map sections unlikely to contain buildings, significantly reducing the area requiring detailed analysis. The remaining high probability sections are then processed using CNN segmentation algorithms to extract building features. The pipeline is validated using test sections from the Ordnance Survey Ireland historical 25 inch map series and 6 inch map series, demonstrating both high performance and improved efficiency compared to conventional segmentation-only approaches. Application of the technique to both map series, covering the same geographic region, highlights its potential for historical and archaeological discovery. Notably, the pipeline identified a settlement of approximately 22 buildings in Tully, Co. Galway, present in the 6 inch map, produced in 1839, but absent from the 25 inch map, produced in 1899, suggesting it may have been abandoned during the Great Famine period.
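The hierarchical idea in the abstract (cheap classifiers discard empty map sections first, and segmentation runs only on what survives) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the paper's implementation: the coarse-to-fine tile splitting, the names `cascade`, `segment` and `mean_ink`, the thresholds, and the use of a simple ink-density score as a stand-in for the trained CNN classifiers and the CNN segmentation model.

```python
from typing import Callable, List, Sequence, Tuple

Grid = List[List[float]]     # toy "map": ink intensity per pixel, 0.0 to 1.0
Tile = Tuple[int, int, int]  # (top row, left column, side length)


def crop(grid: Grid, tile: Tile) -> Grid:
    """Return the square patch of the grid covered by a tile."""
    r, c, s = tile
    return [row[c:c + s] for row in grid[r:r + s]]


def mean_ink(patch: Grid) -> float:
    """Hypothetical stand-in for a CNN classifier score: average ink density."""
    cells = [v for row in patch for v in row]
    return sum(cells) / len(cells)


def split(tile: Tile) -> List[Tile]:
    """Split a tile into its four quadrants."""
    r, c, s = tile
    h = s // 2
    return [(r, c, h), (r, c + h, h), (r + h, c, h), (r + h, c + h, h)]


def cascade(grid: Grid, size: int,
            stages: Sequence[Tuple[Callable[[Grid], float], float]]) -> List[Tile]:
    """Filter tiles coarse to fine; each stage is (score function, threshold).

    Assumes the grid dimensions are multiples of `size`.
    """
    tiles = [(r, c, size)
             for r in range(0, len(grid), size)
             for c in range(0, len(grid[0]), size)]
    for level, (score, threshold) in enumerate(stages):
        # Keep only tiles the current stage considers likely to hold buildings.
        tiles = [t for t in tiles if score(crop(grid, t)) >= threshold]
        if level < len(stages) - 1:
            # Hand the survivors to the next, finer stage as quadrants.
            tiles = [child for t in tiles for child in split(t)]
    return tiles


def segment(grid: Grid, tiles: Sequence[Tile], threshold: float = 0.5) -> List[List[int]]:
    """Stand-in for CNN segmentation: threshold pixels inside surviving tiles only."""
    mask = [[0] * len(grid[0]) for _ in grid]
    for r, c, s in tiles:
        for i in range(r, r + s):
            for j in range(c, c + s):
                mask[i][j] = 1 if grid[i][j] >= threshold else 0
    return mask


# Toy 16x16 rural map containing a single 2x2 "building".
grid = [[0.0] * 16 for _ in range(16)]
for i in (2, 3):
    for j in (10, 11):
        grid[i][j] = 1.0

tiles = cascade(grid, size=8, stages=[(mean_ink, 0.01), (mean_ink, 0.05)])
mask = segment(grid, tiles)
processed = sum(s * s for _, _, s in tiles)
print(f"segmented {processed} of {len(grid) * len(grid[0])} pixels")
```

In this toy case the expensive step touches only the one fine tile that survives both filters, which is the kind of saving the abstract attributes to sparse rural maps. In a real pipeline the score functions would be trained classifiers and the thresholds would be tuned to keep recall high at the early stages.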
Related papers
- Semantic Segmentation for Sequential Historical Maps by Learning from Only One Map [0.4915744683251151]
We propose an automated approach to digitization using deep-learning-based semantic segmentation. A key challenge in this process is the lack of ground-truth annotations required for training deep neural networks. We introduce a weakly-supervised age-tracing strategy for model fine-tuning.
arXiv Detail & Related papers (2025-01-03T14:55:22Z) - TopoSD: Topology-Enhanced Lane Segment Perception with SDMap Prior [70.84644266024571]
We propose to train a perception model to "see" standard definition maps (SDMaps).
We encode SDMap elements into neural spatial map representations and instance tokens, and then incorporate such complementary features as prior information.
Based on the lane segment representation framework, the model simultaneously predicts lanes, centrelines and their topology.
arXiv Detail & Related papers (2024-01-12T14:56:45Z) - Radio Map Estimation -- An Open Dataset with Directive Transmitter Antennas and Initial Experiments [49.61405888107356]
We release a dataset of simulated path loss radio maps together with realistic city maps from real-world locations and aerial images from open datasources.
Initial experiments regarding model architectures, input feature design and estimation of radio maps from aerial images are presented.
arXiv Detail & Related papers (2023-06-29T16:05:40Z) - The mapKurator System: A Complete Pipeline for Extracting and Linking Text from Historical Maps [7.209761597734092]
mapKurator is an end-to-end system integrating machine learning models with a comprehensive data processing pipeline.
We deployed the mapKurator system and enabled the processing of over 60,000 maps and over 100 million text/place names in the David Rumsey Historical Map collection.
arXiv Detail & Related papers (2022-06-10T17:51:25Z) - Rethinking Spatial Invariance of Convolutional Networks for Object Counting [119.83017534355842]
We try to use locally connected Gaussian kernels to replace the original convolution filter to estimate the spatial position in the density map.
Inspired by previous work, we propose a low-rank approximation accompanied with translation invariance to favorably implement the approximation of massive Gaussian convolution.
Our methods significantly outperform other state-of-the-art methods and achieve promising learning of the spatial position of objects.
arXiv Detail & Related papers (2021-08-09T15:08:07Z) - Identifying Wetland Areas in Historical Maps using Deep Convolutional Neural Networks [0.0]
This work extracts information on the historical location and geographical distribution of wetlands from hand-drawn maps.
A CNN model is trained on a manually pre-labelled dataset on historical wetlands in the area of Jönköping county in Sweden.
The trained models are additionally used to generate a GIS layer of the presumable historical geographical distribution of wetlands.
arXiv Detail & Related papers (2021-07-06T08:17:59Z) - FloorLevel-Net: Recognizing Floor-Level Lines with Height-Attention-Guided Multi-task Learning [49.30194762653723]
This work tackles the problem of locating floor-level lines in street-view images, using a supervised deep learning approach.
We first compile a new dataset and develop a new data augmentation scheme to synthesize training samples.
Next, we design FloorLevel-Net, a multi-task learning network that associates explicit features of building facades and implicit floor-level lines.
arXiv Detail & Related papers (2021-06-20T08:20:56Z) - CAMERAS: Enhanced Resolution And Sanity preserving Class Activation Mapping for image saliency [61.40511574314069]
Backpropagation image saliency aims at explaining model predictions by estimating model-centric importance of individual pixels in the input.
We propose CAMERAS, a technique to compute high-fidelity backpropagation saliency maps without requiring any external priors.
arXiv Detail & Related papers (2021-01-06T17:24:57Z) - Combining Deep Learning and Mathematical Morphology for Historical Map Segmentation [22.050293193182238]
Main map features can be retrieved and tracked through time for subsequent thematic analysis.
The goal of this work is the vectorization step, i.e., the extraction of vector shapes of the objects of interest from images of maps.
We are particularly interested in closed shape detection such as buildings, building blocks, gardens, rivers, etc. in order to monitor their temporal evolution.
arXiv Detail & Related papers (2020-07-14T23:51:15Z) - Automatic extraction of road intersection points from USGS historical map series using deep convolutional neural networks [0.0]
Road intersection data have been used across different geospatial applications and analyses.
We employed the standard paradigm of using a deep convolutional neural network for the object detection task, namely region-based CNN (RCNN).
Also, compared to the majority of traditional computer vision algorithms, RCNN provides more accurate extraction.
arXiv Detail & Related papers (2020-06-09T12:35:55Z) - Rethinking Localization Map: Towards Accurate Object Perception with Self-Enhancement Maps [78.2581910688094]
This work introduces a novel self-enhancement method to harvest accurate object localization maps and object boundaries with only category labels as supervision.
In particular, the proposed Self-Enhancement Maps achieve the state-of-the-art localization accuracy of 54.88% on ILSVRC.
arXiv Detail & Related papers (2020-06-09T12:35:55Z)