Leveraging LLMs and attention-mechanism for automatic annotation of historical maps
- URL: http://arxiv.org/abs/2504.11050v1
- Date: Tue, 15 Apr 2025 10:34:23 GMT
- Title: Leveraging LLMs and attention-mechanism for automatic annotation of historical maps
- Authors: Yunshuang Yuan, Monika Sester
- Abstract summary: Recent advancements in machine learning have opened new avenues for automating the recognition and classification of features and objects in historical maps. We propose a novel distillation method that leverages large language models (LLMs) and attention mechanisms for the automatic annotation of historical maps. LLMs are employed to generate coarse classification labels for low-resolution historical image patches, while attention mechanisms are utilized to refine these labels to higher resolutions.
- Score: 0.552480439325792
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Historical maps are essential resources that provide insights into the geographical landscapes of the past. They serve as valuable tools for researchers across disciplines such as history, geography, and urban studies, facilitating the reconstruction of historical environments and the analysis of spatial transformations over time. However, when constrained to analogue or scanned formats, their interpretation is limited to humans and therefore not scalable. Recent advancements in machine learning, particularly in computer vision and large language models (LLMs), have opened new avenues for automating the recognition and classification of features and objects in historical maps. In this paper, we propose a novel distillation method that leverages LLMs and attention mechanisms for the automatic annotation of historical maps. LLMs are employed to generate coarse classification labels for low-resolution historical image patches, while attention mechanisms are utilized to refine these labels to higher resolutions. Experimental results demonstrate that the refined labels achieve a high recall of more than 90%. Additionally, the intersection over union (IoU) scores--84.2% for Wood and 72.0% for Settlement--along with precision scores of 87.1% and 79.5%, respectively, indicate that most labels are well-aligned with ground-truth annotations. Notably, these results were achieved without the use of fine-grained manual labels during training, underscoring the potential of our approach for efficient and scalable historical map analysis.
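The abstract reports recall, precision, and intersection over union (IoU) per class. As a minimal sketch of how these three scores are conventionally computed for one class, the following compares a predicted binary mask with a ground-truth mask. The function name `mask_metrics` and the toy masks are hypothetical illustrations; this is not the authors' evaluation code, and the paper may aggregate scores differently.

```python
from typing import Sequence, Tuple


def mask_metrics(
    pred: Sequence[Sequence[int]], truth: Sequence[Sequence[int]]
) -> Tuple[float, float, float]:
    """Return (iou, precision, recall) for two binary masks of equal shape.

    Hypothetical helper for illustration only. A score is NaN when its
    denominator is zero (for example, precision with no predicted pixels).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of rows")

    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        if len(pred_row) != len(truth_row):
            raise ValueError("mask rows must have the same length")
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1  # labelled as the class, and truly the class
            elif p:
                fp += 1  # labelled as the class, but not in ground truth
            elif t:
                fn += 1  # in ground truth, but missed by the label

    def ratio(num: int, den: int) -> float:
        return num / den if den else float("nan")

    iou = ratio(tp, tp + fp + fn)
    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return iou, precision, recall


# Toy 2x3 masks for a single class (assumed data, not from the paper).
pred = [[1, 1, 0],
        [1, 0, 0]]
truth = [[1, 1, 0],
         [0, 1, 0]]

iou, precision, recall = mask_metrics(pred, truth)
print(f"IoU={iou:.3f} precision={precision:.3f} recall={recall:.3f}")
```

In the toy case there are two true positives, one false positive, and one false negative. High recall combined with lower precision, as reported in the abstract, means the refined labels cover most ground-truth pixels while also including some pixels outside them.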
Related papers
- Semantic Segmentation for Sequential Historical Maps by Learning from Only One Map [0.4915744683251151]
We propose an automated approach to digitization using deep-learning-based semantic segmentation. A key challenge in this process is the lack of ground-truth annotations required for training deep neural networks. We introduce a weakly-supervised age-tracing strategy for model fine-tuning.
arXiv Detail & Related papers (2025-01-03T14:55:22Z)
- SAModified: A Foundation Model-Based Zero-Shot Approach for Refining Noisy Land-Use Land-Cover Maps [2.374912052693646]
Land-use and land cover (LULC) analysis is critical in remote sensing. Automating LULC map generation using machine learning is rendered challenging due to noisy labels. We propose a zero-shot approach using the foundation model, Segment Anything Model (SAM). We achieve a significant reduction in label noise and an improvement in the performance of the downstream segmentation model by $\approx 5\%$ when trained with denoised labels.
arXiv Detail & Related papers (2024-12-17T05:23:00Z)
- Self-supervised Video Instance Segmentation Can Boost Geographic Entity Alignment in Historical Maps [16.35356981558991]
We propose a novel approach that combines segmentation and association of geographic entities in historical maps using video instance segmentation (VIS).
To mitigate this challenge, we explore self-supervised learning (SSL) techniques to enhance VIS performance on historical maps.
arXiv Detail & Related papers (2024-11-26T13:31:51Z)
- Learning crop type mapping from regional label proportions in large-scale SAR and optical imagery [9.303156731091532]
This study proposes an online deep clustering method using crop label proportions as priors to learn a sample-level classifier.
We evaluate the method using two large datasets from two different agricultural regions in Brazil.
arXiv Detail & Related papers (2022-08-24T15:23:26Z)
- Improving Fine-Grained Visual Recognition in Low Data Regimes via Self-Boosting Attention Mechanism [27.628260249895973]
Self-boosting attention mechanism (SAM) is a novel method for regularizing the network to focus on the key regions shared across samples and classes.
We develop a variant by using SAM to create multiple attention maps to pool convolutional maps in a style of bilinear pooling.
arXiv Detail & Related papers (2022-08-01T05:36:27Z)
- AutoGeoLabel: Automated Label Generation for Geospatial Machine Learning [69.47585818994959]
We evaluate a big data processing pipeline to auto-generate labels for remote sensing data.
We utilize the big geo-data platform IBM PAIRS to dynamically generate such labels in dense urban areas.
arXiv Detail & Related papers (2022-01-31T20:02:22Z)
- SCARF: Self-Supervised Contrastive Learning using Random Feature Corruption [72.35532598131176]
We propose SCARF, a technique for contrastive learning, where views are formed by corrupting a random subset of features.
We show that SCARF complements existing strategies and outperforms alternatives like autoencoders.
arXiv Detail & Related papers (2021-06-29T08:08:33Z)
- Towards Good Practices for Efficiently Annotating Large-Scale Image Classification Datasets [90.61266099147053]
We investigate efficient annotation strategies for collecting multi-class classification labels for a large collection of images.
We propose modifications and best practices aimed at minimizing human labeling effort.
Simulated experiments on a 125k image subset of the ImageNet100 show that it can be annotated to 80% top-1 accuracy with 0.35 annotations per image on average.
arXiv Detail & Related papers (2021-04-26T16:29:32Z)
- OpenStreetMap: Challenges and Opportunities in Machine Learning and Remote Sensing [66.23463054467653]
We present a review of recent methods based on machine learning to improve and use OpenStreetMap data.
We believe that OSM can change the way we interpret remote sensing data and that the synergy with machine learning can scale participatory map making.
arXiv Detail & Related papers (2020-07-13T09:58:14Z)
- Rethinking Localization Map: Towards Accurate Object Perception with Self-Enhancement Maps [78.2581910688094]
This work introduces a novel self-enhancement method to harvest accurate object localization maps and object boundaries with only category labels as supervision.
In particular, the proposed Self-Enhancement Maps achieve the state-of-the-art localization accuracy of 54.88% on ILSVRC.
arXiv Detail & Related papers (2020-06-09T12:35:55Z)
- Weakly-Supervised Salient Object Detection via Scribble Annotations [54.40518383782725]
We propose a weakly-supervised salient object detection model to learn saliency from scribble labels.
We present a new metric, termed saliency structure measure, to measure the structure alignment of the predicted saliency maps.
Our method not only outperforms existing weakly-supervised/unsupervised methods, but also is on par with several fully-supervised state-of-the-art models.
arXiv Detail & Related papers (2020-03-17T12:59:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.