Related papers: Document AI: Benchmarks, Models and Applications

URL: http://arxiv.org/abs/2111.08609v1
Date: Tue, 16 Nov 2021 16:43:07 GMT
Title: Document AI: Benchmarks, Models and Applications
Authors: Lei Cui, Yiheng Xu, Tengchao Lv, Furu Wei
Abstract summary: Document AI refers to the techniques for automatically reading, understanding, and analyzing business documents. In recent years, the popularity of deep learning technology has greatly advanced the development of Document AI. This paper briefly reviews some of the representative models, tasks, and benchmark datasets.
Score: 35.46858492311289
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Document AI, or Document Intelligence, is a relatively new research topic that refers to the techniques for automatically reading, understanding, and analyzing business documents. It is an important research direction for natural language processing and computer vision. In recent years, the popularity of deep learning technology has greatly advanced the development of Document AI in tasks such as document layout analysis, visual information extraction, document visual question answering, and document image classification. This paper briefly reviews some of the representative models, tasks, and benchmark datasets. Furthermore, we also introduce early-stage heuristic rule-based document analysis, statistical machine learning algorithms, and deep learning approaches, especially pre-training methods. Finally, we look into future directions for Document AI research.

Related papers

A Comprehensive Survey on Underwater Image Enhancement Based on Deep Learning [51.7818820745221]
Underwater image enhancement (UIE) presents a significant challenge within computer vision research. Despite the development of numerous UIE algorithms, a thorough and systematic review is still absent.
arXiv Detail & Related papers (2024-05-30T04:46:40Z)
From Pixels to Insights: A Survey on Automatic Chart Understanding in the Era of Large Foundation Models [98.41645229835493]
Data visualization in the form of charts plays a pivotal role in data analysis, offering critical insights and aiding in informed decision-making. Large foundation models, such as large language models, have revolutionized various natural language processing tasks. This survey paper serves as a comprehensive resource for researchers and practitioners in the fields of natural language processing, computer vision, and data analysis.
arXiv Detail & Related papers (2024-03-18T17:57:09Z)
Automating the Information Extraction from Semi-Structured Interview Transcripts [0.0]
This paper explores the development and application of an automated system designed to extract information from semi-structured interview transcripts. We present a user-friendly software prototype that enables researchers to efficiently process and visualize the thematic structure of interview data.
arXiv Detail & Related papers (2024-03-07T13:53:03Z)
Transformers and Language Models in Form Understanding: A Comprehensive Review of Scanned Document Analysis [16.86139440201837]
We focus on the topic of form understanding in the context of scanned documents. Our research methodology involves an in-depth analysis of popular documents and forms of understanding of trends over the last decade. We showcase how transformers have propelled the field forward, revolutionizing form-understanding techniques.
arXiv Detail & Related papers (2024-03-06T22:22:02Z)
Artificial intelligence to automate the systematic review of scientific literature [0.0]
We present a survey of AI techniques proposed in the last 15 years to help researchers conduct systematic analyses of scientific literature. We describe the tasks currently supported, the types of algorithms applied, and available tools proposed in 34 primary studies.
arXiv Detail & Related papers (2024-01-13T19:12:49Z)
Document AI: A Comparative Study of Transformer-Based, Graph-Based Models, and Convolutional Neural Networks For Document Layout Analysis [3.231170156689185]
Document AI aims to automatically analyze documents by leveraging natural language processing and computer vision techniques. One of the major tasks of Document AI is document layout analysis, which structures document pages by interpreting the content and spatial relationships of layout, image, and text.
arXiv Detail & Related papers (2023-08-29T16:58:03Z)
An approach based on Open Research Knowledge Graph for Knowledge Acquisition from scientific papers [4.8951183832371]
Open Research Knowledge Graph (ORKG) is a computer-assisted tool to organize key insights extracted from research papers. It is currently used to document the "food information engineering", "Tabular data to Knowledge Graph Matching" and "Question Answering" research problems and the "Neuro-symbolic AI" domain.
arXiv Detail & Related papers (2023-08-23T20:05:42Z)
Unified Pretraining Framework for Document Understanding [52.224359498792836]
We present UDoc, a new unified pretraining framework for document understanding. UDoc is designed to support most document understanding tasks, extending the Transformer to take multimodal embeddings as input. An important feature of UDoc is that it learns a generic representation by making use of three self-supervised losses.
arXiv Detail & Related papers (2022-04-22T21:47:04Z)
A Survey of Deep Learning Approaches for OCR and Document Understanding [68.65995739708525]
We review different techniques for document understanding for documents written in English. We consolidate methodologies present in literature to act as a jumping-off point for researchers exploring this area.
arXiv Detail & Related papers (2020-11-27T03:05:59Z)
A New Neural Search and Insights Platform for Navigating and Organizing AI Research [56.65232007953311]
We introduce a new platform, AI Research Navigator, that combines classical keyword search with neural retrieval to discover and organize relevant literature. We give an overview of the overall architecture of the system and of the components for document analysis, question answering, search, analytics, expert search, and recommendations.
arXiv Detail & Related papers (2020-10-30T19:12:25Z)

This list is automatically generated from the titles and abstracts of the papers on this site.