Pattern Spotting and Image Retrieval in Historical Documents using Deep
Hashing
- URL: http://arxiv.org/abs/2208.02397v1
- Date: Thu, 4 Aug 2022 01:39:37 GMT
- Title: Pattern Spotting and Image Retrieval in Historical Documents using Deep
Hashing
- Authors: Caio da S. Dias, Alceu de S. Britto Jr., Jean P. Barddal, Laurent
Heutte, Alessandro L. Koerich
- Abstract summary: This paper presents a deep learning approach for image retrieval and pattern spotting in digital collections of historical documents.
Deep learning models are used for feature extraction, considering two distinct variants, which provide either real-valued or binary code representations.
The proposed approach also reduces the search time by up to 200x and the storage cost by up to 6,000x when compared to related works.
- Score: 60.67014034968582
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a deep learning approach for image retrieval and pattern
spotting in digital collections of historical documents. First, a region
proposal algorithm detects object candidates in the document page images. Next,
deep learning models are used for feature extraction, considering two distinct
variants, which provide either real-valued or binary code representations.
Finally, candidate images are ranked by computing the feature similarity with a
given input query. A robust experimental protocol evaluates the proposed
approach considering each representation scheme (real-valued and binary code)
on the DocExplore image database. The experimental results show that the
proposed deep models compare favorably to the state-of-the-art image retrieval
approaches for images of historical documents, outperforming other deep models
by 2.56 percentage points using the same techniques for pattern spotting.
In addition, the proposed approach reduces the search time by up to 200x and
the storage cost by up to 6,000x when compared to related works based on
real-valued representations.
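The ranking step described in the abstract can be illustrated with a minimal sketch that compares the two representation schemes. Everything here is an assumption for illustration: the function names, the toy 4-dimensional features, and especially `binarize`, which simply thresholds feature values at zero, whereas the paper obtains binary codes from a deep hashing model. Cosine similarity and Hamming distance are common choices for real-valued and binary features respectively; the abstract does not state which similarity measures the authors use.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two real-valued feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def binarize(features, threshold=0.0):
    """Pack one bit per feature into an int (hypothetical stand-in for a learned hash)."""
    code = 0
    for value in features:
        code = (code << 1) | (1 if value > threshold else 0)
    return code

def hamming_distance(code_a, code_b):
    """Number of differing bits between two binary codes."""
    return bin(code_a ^ code_b).count("1")

def rank_real(query, candidates):
    """Candidate indices sorted by decreasing cosine similarity to the query."""
    return sorted(range(len(candidates)),
                  key=lambda i: -cosine_similarity(query, candidates[i]))

def rank_binary(query_code, candidate_codes):
    """Candidate indices sorted by increasing Hamming distance to the query code."""
    return sorted(range(len(candidate_codes)),
                  key=lambda i: hamming_distance(query_code, candidate_codes[i]))

# Toy features standing in for the output of a feature extractor (made-up values).
query = [0.9, -0.2, 0.4, -0.7]
candidates = [
    [0.8, -0.1, 0.5, -0.6],   # close to the query
    [-0.9, 0.3, -0.4, 0.7],   # roughly opposite to the query
    [0.7, 0.2, 0.3, -0.5],    # similar, but one sign differs
]

real_ranking = rank_real(query, candidates)
binary_ranking = rank_binary(binarize(query), [binarize(c) for c in candidates])
print(real_ranking, binary_ranking)
```

On this toy data both schemes produce the same ordering. A binary code needs one bit per dimension and is compared with an XOR and a bit count, which is the general reason binary representations tend to cut storage and search cost; the sketch makes no attempt to reproduce the specific figures reported in the abstract.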
Related papers
- Adapting Dual-encoder Vision-language Models for Paraphrased Retrieval [55.90407811819347]
We consider the task of paraphrased text-to-image retrieval where a model aims to return similar results given a pair of paraphrased queries.
We train a dual-encoder model starting from a language model pretrained on a large text corpus.
Compared to public dual-encoder models such as CLIP and OpenCLIP, the model trained with our best adaptation strategy achieves a significantly higher ranking similarity for paraphrased queries.
(2024-05-06T06:30:17Z)
- A Fair Evaluation of Various Deep Learning-Based Document Image Binarization Approaches [5.393847875065119]
Binarization of document images is an important pre-processing step in the field of document analysis.
Deep learning techniques are able to generate binarized versions of the images by learning context-dependent features.
This work focuses on the evaluation of different deep learning-based methods under the same evaluation protocol.
(2024-01-22T10:42:51Z)
- Where Does the Performance Improvement Come From? - A Reproducibility Concern about Image-Text Retrieval [85.03655458677295]
Image-text retrieval has gradually become a major research direction in the field of information retrieval.
We first examine the related concerns and why the focus is on image-text retrieval tasks.
We analyze various aspects of the reproduction of pretrained and nonpretrained retrieval models.
(2022-03-08T05:01:43Z)
- Contextual Similarity Aggregation with Self-attention for Visual Re-ranking [96.55393026011811]
We propose a visual re-ranking method by contextual similarity aggregation with self-attention.
We conduct comprehensive experiments on four benchmark datasets to demonstrate the generality and effectiveness of our proposed visual re-ranking method.
(2021-10-26T06:20:31Z)
- Date Estimation in the Wild of Scanned Historical Photos: An Image Retrieval Approach [3.5698678013121334]
This paper presents a novel method for date estimation of historical photographs from archival sources.
The main contribution is to formulate the date estimation as a retrieval task, where given a query, the retrieved images are ranked in terms of the estimated date similarity.
We have experimentally evaluated the performance of the method in two different tasks: date estimation and date-sensitive image retrieval.
(2021-06-10T09:53:03Z)
- Spatial Dual-Modality Graph Reasoning for Key Information Extraction [31.04597531115209]
We propose an end-to-end Spatial Dual-Modality Graph Reasoning method (SDMG-R) to extract key information from unstructured document images.
We release a new dataset named WildReceipt, which is collected and annotated for the evaluation of key information extraction from document images of unseen templates in the wild.
(2021-03-26T13:46:00Z)
- An Unsupervised Sampling Approach for Image-Sentence Matching Using Document-Level Structural Information [64.66785523187845]
We focus on the problem of unsupervised image-sentence matching.
Existing research explores using document-level structural information to sample positive and negative instances for model training.
We propose a new sampling strategy to select additional intra-document image-sentence pairs as positive or negative samples.
(2021-03-21T05:43:29Z)
- Incorporating Vision Bias into Click Models for Image-oriented Search Engine [51.192784793764176]
In this paper, we assume that vision bias exists in an image-oriented search engine as another crucial factor affecting the examination probability aside from position.
We use a regression-based EM algorithm to predict the vision bias given the visual features extracted from candidate documents.
(2021-01-07T10:01:31Z)
- Progressive Local Filter Pruning for Image Retrieval Acceleration [43.97722250091591]
We propose a new Progressive Local Filter Pruning (PLFP) method for image retrieval acceleration.
Specifically, layer by layer, we analyze the local geometric properties of each filter and select the one that can be replaced by the neighbors.
In this way, the representation ability of the model is preserved.
(2020-01-24T04:28:44Z)
- Image retrieval approach based on local texture information derived from predefined patterns and spatial domain information [14.620086904601472]
The performance of the proposed method is evaluated in terms of precision and recall on the Simplicity database.
The comparative results showed that the proposed approach offers a higher precision rate than many known methods.
(2019-12-30T16:11:04Z)