An End-to-End Approach for Recognition of Modern and Historical
Handwritten Numeral Strings
- URL: http://arxiv.org/abs/2004.03337v1
- Date: Sat, 28 Mar 2020 16:51:00 GMT
- Title: An End-to-End Approach for Recognition of Modern and Historical
Handwritten Numeral Strings
- Authors: Andre G. Hochuli, Alceu S. Britto Jr., Jean P. Barddal, Luiz E. S.
Oliveira, Robert Sabourin
- Abstract summary: An end-to-end solution for handwritten numeral string recognition is proposed.
The main contribution is to avoid heuristic-based methods for string preprocessing and segmentation.
A robust experimental protocol based on several numeral string datasets has shown that the proposed method is a feasible end-to-end solution.
- Score: 9.950131528559211
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: An end-to-end solution for handwritten numeral string recognition is
proposed, in which the numeral string is treated as composed of objects
automatically detected and recognized by a YOLO-based model. The main
contribution of this paper is to avoid heuristic-based methods for string
preprocessing and segmentation, the need for task-oriented classifiers, and
the use of specific constraints related to the string length. A robust
experimental protocol based on several numeral string datasets, including one
composed of historical documents, has shown that the proposed method is a
feasible end-to-end solution for numeral string recognition. Moreover, it
considerably reduces the complexity of the string recognition task, since it
drops classical steps, in particular preprocessing, segmentation, and a set of
classifiers devoted to strings of a specific length.
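The idea of reading a numeral string from detected objects can be illustrated with a minimal sketch. It assumes a detector has already returned one record per digit with a class, a confidence score, and horizontal box limits. The `Detection` type, the `read_numeral_string` helper, and the 0.5 confidence threshold are hypothetical names and values chosen for illustration, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One detected digit (hypothetical structure, not the paper's format)."""
    digit: int         # predicted class, 0-9
    confidence: float  # detector score in [0, 1]
    x_min: float       # left edge of the bounding box, in pixels
    x_max: float       # right edge of the bounding box, in pixels


def read_numeral_string(detections, min_confidence=0.5):
    """Assemble a numeral string from digit detections.

    Assumed post-processing: drop low-confidence boxes, then order the
    remaining boxes left to right by their horizontal center.
    """
    kept = [d for d in detections if d.confidence >= min_confidence]
    kept.sort(key=lambda d: (d.x_min + d.x_max) / 2)
    return "".join(str(d.digit) for d in kept)


# Detections arrive in arbitrary order; the third one is a weak false positive.
detections = [
    Detection(digit=7, confidence=0.91, x_min=60, x_max=88),
    Detection(digit=1, confidence=0.97, x_min=2, x_max=20),
    Detection(digit=3, confidence=0.30, x_min=40, x_max=55),
    Detection(digit=9, confidence=0.88, x_min=25, x_max=58),
]
print(read_numeral_string(detections))  # prints 197
```

Because the string is built from whatever digits are detected, nothing in this sketch depends on the string length, which is the property the abstract emphasizes.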
Related papers
- Order-agnostic Identifier for Large Language Model-based Generative Recommendation [94.37662915542603]
Items are assigned identifiers for Large Language Models (LLMs) to encode user history and generate the next item.
Existing approaches leverage either token-sequence identifiers, representing items as discrete token sequences, or single-token identifiers, using ID or semantic embeddings.
We propose SETRec, which leverages semantic tokenizers to obtain order-agnostic multi-dimensional tokens.
arXiv Detail & Related papers (2025-02-15T15:25:38Z)
- Disambiguating Numeral Sequences to Decipher Ancient Accounting Corpora [7.530971114462749]
We study the ancient and partially-deciphered proto-Elamite (PE) script.
Written numerals can have up to four distinct readings depending on the system that is used to read them.
We consider the task of disambiguating between these readings in order to determine the values of the numeric quantities recorded in this corpus.
arXiv Detail & Related papers (2025-01-31T18:10:31Z)
- STORE: Streamlining Semantic Tokenization and Generative Recommendation with A Single LLM [59.08493154172207]
We propose a unified framework to streamline the semantic tokenization and generative recommendation process.
We formulate semantic tokenization as a text-to-token task and generative recommendation as a token-to-token task, supplemented by a token-to-text reconstruction task and a text-to-token auxiliary task.
All these tasks are framed in a generative manner and trained using a single large language model (LLM) backbone.
arXiv Detail & Related papers (2024-09-11T13:49:48Z)
- Out of Length Text Recognition with Sub-String Matching [54.63761108308825]
In this paper, we term this task Out of Length (OOL) text recognition.
We propose a novel method called OOL Text Recognition with sub-String Matching (SMTR).
SMTR comprises two cross-attention-based modules: one encodes a sub-string containing multiple characters into next and previous queries, and the other employs the queries to attend to the image features.
arXiv Detail & Related papers (2024-07-17T05:02:17Z)
- Token Alignment via Character Matching for Subword Completion [34.76794239097628]
This paper examines a technique to alleviate the tokenization artifact on text completion in generative models.
The method, termed token alignment, involves backtracking to the last complete tokens and ensuring the model's generation aligns with the prompt.
arXiv Detail & Related papers (2024-03-13T16:44:39Z)
- Large Language Model Prompt Chaining for Long Legal Document Classification [2.3148470932285665]
Chaining is a strategy used to decompose complex tasks into smaller, manageable components.
We demonstrate that through prompt chaining, we can not only enhance the performance over zero-shot, but also surpass the micro-F1 score achieved by larger models.
arXiv Detail & Related papers (2023-08-08T08:57:01Z)
- Multiview Identifiers Enhanced Generative Retrieval [78.38443356800848]
Generative retrieval generates identifier strings of passages as the retrieval target.
We propose a new type of identifier, synthetic identifiers, that are generated based on the content of a passage.
Our proposed approach performs the best in generative retrieval, demonstrating its effectiveness and robustness.
arXiv Detail & Related papers (2023-05-26T06:50:21Z)
- Attributable and Scalable Opinion Summarization [79.87892048285819]
We generate abstractive summaries by decoding frequent encodings, and extractive summaries by selecting the sentences assigned to the same frequent encodings.
Our method is attributable, because the model identifies sentences used to generate the summary as part of the summarization process.
It scales easily to many hundreds of input reviews, because aggregation is performed in the latent space rather than over long sequences of tokens.
arXiv Detail & Related papers (2023-05-19T11:30:37Z)
- On Parsing as Tagging [66.31276017088477]
We show how to reduce tetratagging, a state-of-the-art constituency tagger, to shift-reduce parsing.
We empirically evaluate our taxonomy of tagging pipelines with different choices of linearizers, learners, and decoders.
arXiv Detail & Related papers (2022-11-14T13:37:07Z)
- End-to-End Approach for Recognition of Historical Digit Strings [2.0754848504005583]
We propose an end-to-end, segmentation-free deep learning approach to handle the challenging ancient handwriting style of dates in the ARDIS dataset (4-digit strings).
We show that, with slight modifications to the VGG-16 deep model, the framework can achieve a recognition rate of 93.2%, resulting in a feasible solution free of heuristic methods, segmentation, and fusion methods.
arXiv Detail & Related papers (2021-04-28T09:39:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.