Transformer-Based Approach for Joint Handwriting and Named Entity
Recognition in Historical documents
- URL: http://arxiv.org/abs/2112.04189v1
- Date: Wed, 8 Dec 2021 09:26:21 GMT
- Title: Transformer-Based Approach for Joint Handwriting and Named Entity
Recognition in Historical documents
- Authors: Ahmed Cheikh Rouhou, Marwa Dhiaf, Yousri Kessentini, Sinda Ben Salem
- Abstract summary: This work presents the first approach that adopts transformer networks for named entity recognition in handwritten documents.
We achieve new state-of-the-art performance in the ICDAR 2017 Information Extraction competition using the Esposalles database.
- Score: 1.7491858164568674
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The extraction of relevant information carried by named entities in
handwritten documents is still a challenging task. Unlike traditional
information extraction approaches that usually treat text transcription and
named entity recognition as separate, sequential tasks, we propose in this paper
an end-to-end transformer-based approach to jointly perform these two tasks.
The proposed approach operates at the paragraph level, which brings two main
benefits. First, it allows the model to avoid unrecoverable early errors due to
line segmentation. Second, it allows the model to exploit larger bi-dimensional
context information to identify the semantic categories, reaching a higher
final prediction accuracy. We also explore different training scenarios to show
their effect on the performance and we demonstrate that a two-stage learning
strategy can make the model reach a higher final prediction accuracy. As far as
we know, this work presents the first approach that adopts transformer
networks for named entity recognition in handwritten documents. We achieve
new state-of-the-art performance in the ICDAR 2017 Information Extraction
competition using the Esposalles database, for the complete task, even though
the proposed technique does not use any dictionaries, language modeling, or
post-processing.
Related papers
- Contextual Document Embeddings [77.22328616983417]
We propose two complementary methods for contextualized document embeddings.
First, an alternative contrastive learning objective that explicitly incorporates the document neighbors into the intra-batch contextual loss.
Second, a new contextual architecture that explicitly encodes neighbor document information into the encoded representation.
arXiv Detail & Related papers (2024-10-03T14:33:34Z)
- Information Extraction in Domain and Generic Documents: Findings from
Heuristic-based and Data-driven Approaches [0.0]
Information extraction plays an important role in natural language processing.
Document genre and length influence IE tasks.
No single method demonstrated overwhelming performance in both tasks.
arXiv Detail & Related papers (2023-06-30T20:43:27Z)
- Key-value information extraction from full handwritten pages [0.2062593640149624]
We propose a Transformer-based approach for information extraction from digitized handwritten documents.
Our approach combines, in a single model, the different steps that were so far performed by separate models: feature extraction, handwriting recognition and named entity recognition.
We compare our models to state-of-the-art methods on three public databases (IAM, ESPOSALLES, and POPP) and outperform previous performances on all three datasets.
arXiv Detail & Related papers (2023-04-26T13:06:55Z)
- Towards End-to-end Handwritten Document Recognition [0.0]
Handwritten text recognition has been widely studied in the last decades for its numerous applications.
In this thesis, we propose to tackle these issues by performing handwritten text recognition of whole documents in an end-to-end way.
We reached state-of-the-art results at paragraph level on the RIMES 2011, IAM and READ 2016 datasets and outperformed the line-level state of the art on these datasets.
arXiv Detail & Related papers (2022-09-30T10:31:22Z)
- Incorporating Task-specific Concept Knowledge into Script Learning [68.95195207989605]
We present Tetris, a new task of Goal-Oriented Script Completion.
It considers a more realistic and general setting, where the input includes not only the goal but also additional user context.
We propose a novel approach, which uses two techniques to improve performance.
arXiv Detail & Related papers (2022-08-31T18:55:22Z)
- A Span Extraction Approach for Information Extraction on Visually-Rich
Documents [2.3131309703965135]
We present a new approach to improve the capability of language model pre-training on visually-rich documents (VRDs).
Firstly, we introduce a new IE model that is query-based and employs the span extraction formulation instead of the commonly used sequence labelling approach.
We also propose a new training task which focuses on modelling the relationships between semantic entities within a document.
arXiv Detail & Related papers (2021-06-02T06:50:04Z)
- Focused Attention Improves Document-Grounded Generation [111.42360617630669]
Document grounded generation is the task of using the information provided in a document to improve text generation.
This work focuses on two different document grounded generation tasks: Wikipedia Update Generation task and Dialogue response generation.
arXiv Detail & Related papers (2021-04-26T16:56:29Z)
- Incomplete Utterance Rewriting as Semantic Segmentation [57.13577518412252]
We present a novel and extensive approach, which formulates incomplete utterance rewriting as a semantic segmentation task.
Instead of generating from scratch, such a formulation introduces edit operations and shapes the problem as prediction of a word-level edit matrix.
Our approach is four times faster than the standard approach in inference.
arXiv Detail & Related papers (2020-09-28T09:29:49Z)
- Detecting Ongoing Events Using Contextual Word and Sentence Embeddings [110.83289076967895]
This paper introduces the Ongoing Event Detection (OED) task.
The goal is to detect ongoing event mentions only, as opposed to historical, future, hypothetical, or other forms of events that are neither fresh nor current.
Any application that needs to extract structured information about ongoing events from unstructured texts can take advantage of an OED system.
arXiv Detail & Related papers (2020-07-02T20:44:05Z)
- Learning to Select Bi-Aspect Information for Document-Scale Text Content
Manipulation [50.01708049531156]
We focus on a new practical task, document-scale text content manipulation, which is the opposite of text style transfer.
In detail, the input is a set of structured records and a reference text for describing another recordset.
The output is a summary that accurately describes the partial content in the source recordset with the same writing style of the reference.
arXiv Detail & Related papers (2020-02-24T12:52:10Z)