Document Automation Architectures: Updated Survey in Light of Large
Language Models
- URL: http://arxiv.org/abs/2308.09341v1
- Date: Fri, 18 Aug 2023 06:59:55 GMT
- Title: Document Automation Architectures: Updated Survey in Light of Large
Language Models
- Authors: Mohammad Ahmadi Achachlouei, Omkar Patil, Tarun Joshi, Vijayan N. Nair
- Abstract summary: This paper surveys the current state of the art in document automation (DA).
The objective of DA is to reduce the manual effort during the generation of documents by automatically creating and integrating input from different sources and assembling documents conforming to defined templates.
There have been reviews of commercial solutions of DA, particularly in the legal domain, but to date there has been no comprehensive review of the academic research on DA architectures and technologies.
- Score: 2.990411348977783
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper surveys the current state of the art in document automation (DA).
The objective of DA is to reduce the manual effort during the generation of
documents by automatically creating and integrating input from different
sources and assembling documents conforming to defined templates. There have
been reviews of commercial solutions of DA, particularly in the legal domain,
but to date there has been no comprehensive review of the academic research on
DA architectures and technologies. The current survey of DA reviews the
academic literature and provides a clearer definition and characterization of
DA and its features, identifies state-of-the-art DA architectures and
technologies in academic research, and provides ideas that can lead to new
research opportunities within the DA field in light of recent advances in
generative AI and large language models.
Related papers
- Are Large Language Models Good Classifiers? A Study on Edit Intent Classification in Scientific Document Revisions [62.12545440385489]
Large language models (LLMs) have brought substantial advancements in text generation, but their potential for enhancing classification tasks remains underexplored.
We propose a framework for thoroughly investigating fine-tuning LLMs for classification, including both generation- and encoding-based approaches.
We instantiate this framework in edit intent classification (EIC), a challenging and underexplored classification task.
arXiv Detail & Related papers (2024-10-02T20:48:28Z)
- Large Language Models for Generative Information Extraction: A Survey [89.71273968283616]
Large Language Models (LLMs) have demonstrated remarkable capabilities in text understanding and generation.
We present an extensive overview by categorizing these works in terms of various IE subtasks and techniques.
We empirically analyze the most advanced methods and discover the emerging trend of IE tasks with LLMs.
arXiv Detail & Related papers (2023-12-29T14:25:22Z)
- Document Understanding Dataset and Evaluation (DUDE) [29.78902147806488]
Document Understanding Dataset and Evaluation (DUDE) seeks to remediate the halted research progress in understanding visually-rich documents (VRDs).
We present a new dataset with novelties related to types of questions, answers, and document layouts based on multi-industry, multi-domain, and multi-page VRDs of various origins and dates.
arXiv Detail & Related papers (2023-05-15T08:54:32Z)
- Deep learning for table detection and structure recognition: A survey [49.09628624903334]
The goal of this survey is to provide a profound comprehension of the major developments in the field of Table Detection.
We provide an analysis of both classic and new applications in the field.
The datasets and source code of the existing models are organized to provide the reader with a compass on this vast literature.
arXiv Detail & Related papers (2022-11-15T19:42:27Z)
- A Survey on Open Information Extraction from Rule-based Model to Large Language Model [29.017823043117144]
Open Information Extraction (OpenIE) represents a crucial NLP task aimed at deriving structured information from unstructured text.
This survey paper provides an overview of OpenIE technologies spanning from 2007 to 2024, emphasizing a chronological perspective.
The paper categorizes OpenIE approaches into rule-based, neural, and pre-trained large language models, discussing each within a chronological framework.
arXiv Detail & Related papers (2022-08-18T08:03:45Z)
- Document AI: Benchmarks, Models and Applications [35.46858492311289]
Document AI refers to the techniques for automatically reading, understanding, and analyzing business documents.
In recent years, the popularity of deep learning technology has greatly advanced the development of Document AI.
This paper briefly reviews some of the representative models, tasks, and benchmark datasets.
arXiv Detail & Related papers (2021-11-16T16:43:07Z)
- Document Automation Architectures and Technologies: A Survey [0.0]
This paper surveys the current state of the art in document automation (DA).
The objective of DA is to reduce the manual effort during the generation of documents by automatically integrating input from different sources and assembling documents conforming to defined templates.
There have been reviews of commercial solutions of DA, particularly in the legal domain, but to date there has been no comprehensive review of the academic research on DA architectures and technologies.
arXiv Detail & Related papers (2021-09-23T19:12:26Z)
- Data-Driven Design-by-Analogy: State of the Art and Future Directions [11.025196033751786]
Design-by-Analogy (DbA) is a design methodology wherein new solutions, opportunities, or designs are generated in a target domain based on inspiration drawn from a source domain.
Recently, the increasingly available design databases and rapidly advancing data science and artificial intelligence technologies have presented new opportunities for developing data-driven methods and tools for DbA support.
arXiv Detail & Related papers (2021-06-03T04:35:34Z)
- A Survey of Deep Learning Approaches for OCR and Document Understanding [68.65995739708525]
We review different techniques for document understanding for documents written in English.
We consolidate methodologies present in literature to act as a jumping-off point for researchers exploring this area.
arXiv Detail & Related papers (2020-11-27T03:05:59Z)
- A New Neural Search and Insights Platform for Navigating and Organizing AI Research [56.65232007953311]
We introduce a new platform, AI Research Navigator, that combines classical keyword search with neural retrieval to discover and organize relevant literature.
We give an overview of the overall architecture of the system and of the components for document analysis, question answering, search, analytics, expert search, and recommendations.
arXiv Detail & Related papers (2020-10-30T19:12:25Z)
- Towards Inheritable Models for Open-Set Domain Adaptation [56.930641754944915]
We introduce a practical Domain Adaptation paradigm where a source-trained model is used to facilitate adaptation in the absence of the source dataset in the future.
We present an objective way to quantify inheritability to enable the selection of the most suitable source model for a given target domain, even in the absence of the source data.
arXiv Detail & Related papers (2020-04-09T07:16:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.