Boosting offline handwritten text recognition in historical documents
with few labeled lines
- URL: http://arxiv.org/abs/2012.02544v1
- Date: Fri, 4 Dec 2020 11:59:35 GMT
- Title: Boosting offline handwritten text recognition in historical documents
with few labeled lines
- Authors: José Carlos Aradillas, Juan José Murillo-Fuentes, Pablo M. Olmos
- Abstract summary: We analyze how to perform transfer learning from a massive database to a smaller historical database.
Second, we analyze methods to efficiently combine TL and data augmentation.
An algorithm to mitigate the effects of incorrect labels in the training set is proposed.
- Score: 5.9207487081080705
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we address the problem of offline handwritten text
recognition (HTR) in historical documents when few labeled samples are
available and some of the training labels contain errors. Three main
contributions are developed. First, we analyze how to perform transfer
learning (TL) from a massive database to a smaller historical database,
identifying which layers of the model need fine-tuning. Second, we analyze
methods to efficiently combine TL and data augmentation (DA). Finally, an
algorithm to mitigate the effects of incorrect labels in the training set is
proposed. The methods are evaluated on the ICFHR 2018 competition database and
on the Washington and Parzival databases. Combining all these techniques, we
demonstrate a remarkable reduction in character error rate (CER) on the test
set (up to 6% in some cases) with little complexity overhead.
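
As a rough illustration of the transfer-learning step, the sketch below freezes a pretrained convolutional feature extractor and fine-tunes only the recurrent and output layers in PyTorch. The toy CRNN, the layer sizes, the checkpoint name, and the choice of which layers to freeze are assumptions made for this example; they are not the architecture or the exact recipe used in the paper.

import torch
import torch.nn as nn

class SimpleCRNN(nn.Module):
    """Toy CRNN: conv feature extractor + BLSTM + per-frame character logits."""
    def __init__(self, num_chars: int):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.rnn = nn.LSTM(input_size=64 * 8, hidden_size=128,
                           bidirectional=True, batch_first=True)
        self.head = nn.Linear(2 * 128, num_chars + 1)  # +1 for the CTC blank

    def forward(self, x):                               # x: (B, 1, 32, W)
        f = self.conv(x)                                # (B, 64, 8, W/4)
        b, c, h, w = f.shape
        f = f.permute(0, 3, 1, 2).reshape(b, w, c * h)  # one feature vector per image column
        out, _ = self.rnn(f)
        return self.head(out)                           # (B, W/4, num_chars + 1)

model = SimpleCRNN(num_chars=80)
# model.load_state_dict(torch.load("pretrained_on_large_db.pt"))  # hypothetical checkpoint

# Freeze the convolutional feature extractor learned on the large database;
# fine-tune only the recurrent and output layers on the small historical set.
for p in model.conv.parameters():
    p.requires_grad = False

optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4)

In practice, which layers to unfreeze would be chosen by the kind of layer-wise analysis the paper describes, e.g. by comparing validation CER when progressively deeper parts of the network are fine-tuned.
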
Related papers
- Fact Checking Beyond Training Set [64.88575826304024]
We show that the retriever-reader suffers from performance deterioration when it is trained on labeled data from one domain and used in another domain.
We propose an adversarial algorithm to make the retriever component robust against distribution shift.
We then construct eight fact checking scenarios from these datasets, and compare our model to a set of strong baseline models.
arXiv Detail & Related papers (2024-03-27T15:15:14Z)
- Improving Text Embeddings with Large Language Models [59.930513259982725]
We introduce a novel and simple method for obtaining high-quality text embeddings using only synthetic data and less than 1k training steps.
We leverage proprietary LLMs to generate diverse synthetic data for hundreds of thousands of text embedding tasks across 93 languages.
Experiments demonstrate that our method achieves strong performance on highly competitive text embedding benchmarks without using any labeled data.
arXiv Detail & Related papers (2023-12-31T02:13:18Z)
- Zero-Shot Listwise Document Reranking with a Large Language Model [58.64141622176841]
We propose Listwise Reranker with a Large Language Model (LRL), which achieves strong reranking effectiveness without using any task-specific training data.
Experiments on three TREC web search datasets demonstrate that LRL not only outperforms zero-shot pointwise methods when reranking first-stage retrieval results, but can also act as a final-stage reranker.
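
As an illustration only, a listwise reranking prompt could be assembled and parsed as sketched below; the prompt wording, the output format, and the helper names are assumptions, not the exact prompt used by LRL.

def build_listwise_prompt(query: str, passages: list[str]) -> str:
    # Number the candidate passages so the model can refer to them by index.
    numbered = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        f"Query: {query}\n\nPassages:\n{numbered}\n\n"
        "Rank the passages by relevance to the query. "
        "Answer with the passage numbers only, most relevant first, e.g. 3 > 1 > 2."
    )

def parse_ranking(answer: str, num_passages: int) -> list[int]:
    # Keep the first occurrence of each valid index, in the order produced.
    seen, order = set(), []
    for token in answer.replace(">", " ").split():
        if token.isdigit() and 1 <= int(token) <= num_passages and int(token) not in seen:
            seen.add(int(token))
            order.append(int(token) - 1)
    # Append anything the model omitted, preserving the original retrieval order.
    order += [i for i in range(num_passages) if i not in order]
    return order
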
arXiv Detail & Related papers (2023-05-03T14:45:34Z)
- Ensemble Transfer Learning for Multilingual Coreference Resolution [60.409789753164944]
A problem that frequently occurs when working with a non-English language is the scarcity of annotated training data.
We design a simple but effective ensemble-based framework that combines various transfer learning techniques.
We also propose a low-cost TL method that bootstraps coreference resolution models by utilizing Wikipedia anchor texts.
arXiv Detail & Related papers (2023-01-22T18:22:55Z)
- Domain-Specific NER via Retrieving Correlated Samples [37.98414661072985]
In this paper, we suggest enhancing NER models with correlated samples.
To explicitly simulate the human reasoning process, we perform training-free entity type calibration by majority voting.
Empirical results on datasets of the above two domains show the efficacy of our methods.
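
A minimal sketch of the majority-voting idea follows; the data layout and the retrieval step are assumptions for the example, not the authors' implementation.

from collections import Counter

def calibrate_entity_type(candidate_types: list[str]) -> str:
    """Pick the majority entity type among the model's predictions for the
    same mention in the query sentence and its correlated (retrieved) samples."""
    return Counter(candidate_types).most_common(1)[0][0]

# Example: three correlated samples vote PERSON, one votes LOCATION.
print(calibrate_entity_type(["PERSON", "PERSON", "LOCATION", "PERSON"]))  # -> PERSON
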
arXiv Detail & Related papers (2022-08-27T12:25:24Z)
- Self-paced learning to improve text row detection in historical documents with missing labels [25.22937684446941]
We propose a self-paced learning algorithm capable of improving the row detection performance.
We sort training examples in descending order with respect to the number of ground-truth bounding boxes.
Using our self-paced learning method, we train a row detector over k iterations, progressively adding batches with less ground-truth annotations.
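
A schematic sketch of this schedule is given below; the Example container and the training call are placeholders, not the authors' code.

from dataclasses import dataclass

@dataclass
class Example:
    image_path: str
    boxes: list  # ground-truth row bounding boxes (possibly incomplete)

def self_paced_schedule(examples: list[Example], k: int):
    # Easiest first: images with the most annotated rows are treated as most complete.
    ordered = sorted(examples, key=lambda e: len(e.boxes), reverse=True)
    batch_size = (len(ordered) + k - 1) // k
    seen: list[Example] = []
    for it in range(k):
        # Progressively add batches with fewer ground-truth annotations.
        seen.extend(ordered[it * batch_size:(it + 1) * batch_size])
        yield it, list(seen)

# for iteration, train_subset in self_paced_schedule(all_examples, k=5):
#     train_one_pass(detector, train_subset)   # hypothetical training step
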
arXiv Detail & Related papers (2022-01-28T16:17:48Z)
- Text-Based Person Search with Limited Data [66.26504077270356]
Text-based person search (TBPS) aims at retrieving a target person from an image gallery with a descriptive text query.
We present a framework with two novel components to handle the problems brought by limited data.
arXiv Detail & Related papers (2021-10-20T22:20:47Z)
- One-shot Compositional Data Generation for Low Resource Handwritten Text Recognition [10.473427493876422]
Low resource Handwritten Text Recognition is a hard problem due to the scarce annotated data and the very limited linguistic information.
In this paper we address this problem through a data generation technique based on Bayesian Program Learning.
Contrary to traditional generation approaches, which require a huge amount of annotated images, our method is able to generate human-like handwriting using only one sample of each symbol from the desired alphabet.
arXiv Detail & Related papers (2021-05-11T18:53:01Z)
- Partially-Aligned Data-to-Text Generation with Distant Supervision [69.15410325679635]
We propose a new generation task called Partially-Aligned Data-to-Text Generation (PADTG).
It is more practical since it utilizes automatically annotated data for training and thus considerably expands the application domains.
Our framework outperforms all baseline models, and the results verify the feasibility of utilizing partially-aligned data.
arXiv Detail & Related papers (2020-10-03T03:18:52Z)
- Offline Handwritten Chinese Text Recognition with Convolutional Neural Networks [5.984124397831814]
In this paper, we build the models using only convolutional neural networks and use CTC as the loss function.
We achieve 6.81% character error rate (CER) on the ICDAR 2013 competition set, which is the best published result without language model correction.
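
For reference, a minimal CTC training step in PyTorch looks like the sketch below; the shapes and the random logits are toy values for illustration, and the paper's actual model is a much deeper convolutional network.

import torch
import torch.nn as nn

ctc = nn.CTCLoss(blank=0, zero_infinity=True)

# Toy example: batch of 2 sequences, 50 output frames, 100 character classes plus blank at index 0.
log_probs = torch.randn(50, 2, 101, requires_grad=True).log_softmax(dim=2)  # (T, N, C), as CTCLoss expects
targets = torch.randint(1, 101, (2, 20), dtype=torch.long)                  # character indices, blank excluded
input_lengths = torch.full((2,), 50, dtype=torch.long)
target_lengths = torch.full((2,), 20, dtype=torch.long)

loss = ctc(log_probs, targets, input_lengths, target_lengths)
loss.backward()  # gradients flow back to whatever network produced the log-probabilities
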
arXiv Detail & Related papers (2020-06-28T14:34:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.