A Survey of Deep Learning Approaches for OCR and Document Understanding
- URL: http://arxiv.org/abs/2011.13534v2
- Date: Thu, 4 Feb 2021 23:48:39 GMT
- Title: A Survey of Deep Learning Approaches for OCR and Document Understanding
- Authors: Nishant Subramani and Alexandre Matton and Malcolm Greaves and Adrian Lam
- Abstract summary: We review different techniques for document understanding for documents written in English.
We consolidate methodologies present in literature to act as a jumping-off point for researchers exploring this area.
- Score: 68.65995739708525
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Documents are a core part of many businesses in many fields such as law,
finance, and technology among others. Automatic understanding of documents such
as invoices, contracts, and resumes is lucrative, opening up many new avenues
of business. The fields of natural language processing and computer vision have
seen tremendous progress through the development of deep learning such that
these methods have started to become infused in contemporary document
understanding systems. In this survey paper, we review different techniques for
document understanding for documents written in English and consolidate
methodologies present in literature to act as a jumping-off point for
researchers exploring this area.
Related papers
- Understanding Archives: Towards New Research Interfaces Relying on the Semantic Annotation of Documents [0.2302001830524133]
We show how the semantic annotation of the textual content of study corpora of archival documents facilitates their exploitation and valorisation.
First, we present a methodological framework for the construction of new interfaces based on textual semantics, then address the current technological obstacles and their potential solutions.
arXiv Detail & Related papers (2024-03-28T07:55:29Z)
- Large Language Models for Generative Information Extraction: A Survey [89.71273968283616]
Information extraction aims to extract structural knowledge from plain natural language texts.
Generative Large Language Models (LLMs) have demonstrated remarkable capabilities in text understanding and generation.
LLMs offer viable solutions for IE tasks based on a generative paradigm.
arXiv Detail & Related papers (2023-12-29T14:25:22Z)
- Workshop on Document Intelligence Understanding [3.2929609168290543]
This workshop aims to bring together researchers and industry developers in the field of document intelligence.
We also released a data challenge on the recently introduced document-level VQA dataset, PDFVQA.
arXiv Detail & Related papers (2023-07-31T02:14:25Z)
- DLUE: Benchmarking Document Language Understanding [32.550855843975484]
There is no well-established consensus on how to comprehensively evaluate document understanding abilities.
This paper summarizes four representative abilities, i.e., document classification, document structural analysis, document information extraction, and document transcription.
Under the new evaluation framework, we propose Document Language Understanding Evaluation (DLUE), a new task suite.
arXiv Detail & Related papers (2023-05-16T15:16:24Z)
- Layout-Aware Information Extraction for Document-Grounded Dialogue: Dataset, Method and Demonstration [75.47708732473586]
We propose a layout-aware document-level Information Extraction dataset, LIE, to facilitate the study of extracting both structural and semantic knowledge from visually rich documents.
LIE contains 62k annotations of three extraction tasks from 4,061 pages in product and official documents.
Empirical results show that layout is critical for VRD-based extraction, and system demonstration also verifies that the extracted knowledge can help locate the answers that users care about.
arXiv Detail & Related papers (2022-07-14T07:59:45Z)
- Embedding Knowledge for Document Summarization: A Survey [66.76415502727802]
Previous works proved that knowledge-embedded document summarizers excel at generating superior digests.
We propose novel taxonomies to recapitulate knowledge and knowledge embeddings under the document summarization view.
arXiv Detail & Related papers (2022-04-24T04:36:07Z)
- Unified Pretraining Framework for Document Understanding [52.224359498792836]
We present UDoc, a new unified pretraining framework for document understanding.
UDoc is designed to support most document understanding tasks, extending the Transformer to take multimodal embeddings as input.
An important feature of UDoc is that it learns a generic representation by making use of three self-supervised losses.
arXiv Detail & Related papers (2022-04-22T21:47:04Z)
- Document AI: Benchmarks, Models and Applications [35.46858492311289]
Document AI refers to the techniques for automatically reading, understanding, and analyzing business documents.
In recent years, the popularity of deep learning technology has greatly advanced the development of Document AI.
This paper briefly reviews some of the representative models, tasks, and benchmark datasets.
arXiv Detail & Related papers (2021-11-16T16:43:07Z)
- Historical Document Processing: A Survey of Techniques, Tools, and Trends [0.0]
Historical Document Processing is the process of digitizing written material from the past for future use by historians and other scholars.
It incorporates algorithms and software tools from various subfields of computer science, including computer vision, document analysis and recognition, natural language processing, and machine learning.
arXiv Detail & Related papers (2020-02-15T01:54:35Z)
- Explaining Relationships Between Scientific Documents [55.23390424044378]
We address the task of explaining relationships between two scientific documents using natural language text.
In this paper we establish a dataset of 622K examples from 154K documents.
arXiv Detail & Related papers (2020-02-02T03:54:47Z)