Document Automation Architectures: Updated Survey in Light of Large
Language Models
- URL: http://arxiv.org/abs/2308.09341v1
- Date: Fri, 18 Aug 2023 06:59:55 GMT
- Title: Document Automation Architectures: Updated Survey in Light of Large
Language Models
- Authors: Mohammad Ahmadi Achachlouei, Omkar Patil, Tarun Joshi, Vijayan N. Nair
- Abstract summary: This paper surveys the current state of the art in document automation (DA).
The objective of DA is to reduce the manual effort during the generation of documents by automatically creating and integrating input from different sources and assembling documents conforming to defined templates.
There have been reviews of commercial solutions of DA, particularly in the legal domain, but to date there has been no comprehensive review of the academic research on DA architectures and technologies.
- Score: 2.990411348977783
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper surveys the current state of the art in document automation (DA).
The objective of DA is to reduce the manual effort during the generation of
documents by automatically creating and integrating input from different
sources and assembling documents conforming to defined templates. There have
been reviews of commercial solutions of DA, particularly in the legal domain,
but to date there has been no comprehensive review of the academic research on
DA architectures and technologies. The current survey of DA reviews the
academic literature and provides a clearer definition and characterization of
DA and its features, identifies state-of-the-art DA architectures and
technologies in academic research, and provides ideas that can lead to new
research opportunities within the DA field in light of recent advances in
generative AI and large language models.
Related papers
- A Survey of Model Architectures in Information Retrieval [64.75808744228067]
We focus on two key aspects: backbone models for feature extraction and end-to-end system architectures for relevance estimation.
We trace the development from traditional term-based methods to modern neural approaches, particularly highlighting the impact of transformer-based models and subsequent large language models (LLMs).
We conclude by discussing emerging challenges and future directions, including architectural optimizations for performance and scalability, handling of multimodal, multilingual data, and adaptation to novel application domains beyond traditional search paradigms.
Date: 2025-02-20T18:42:58Z
- A Survey of Research in Large Language Models for Electronic Design Automation [5.426530967206322]
Large Language Models (LLMs) have emerged as transformative technologies.
This survey focuses on advancements in model architectures, the implications of varying model sizes, and innovative customization techniques.
It aims to offer valuable insights to professionals in the EDA industry, AI researchers, and anyone interested in the convergence of advanced AI technologies and electronic design.
Date: 2025-01-16T16:51:59Z
- Are Large Language Models Good Classifiers? A Study on Edit Intent Classification in Scientific Document Revisions [62.12545440385489]
Large language models (LLMs) have brought substantial advancements in text generation, but their potential for enhancing classification tasks remains underexplored.
We propose a framework for thoroughly investigating fine-tuning LLMs for classification, including both generation- and encoding-based approaches.
We instantiate this framework in edit intent classification (EIC), a challenging and underexplored classification task.
Date: 2024-10-02T20:48:28Z
- Document Understanding Dataset and Evaluation (DUDE) [29.78902147806488]
Document Understanding dataset and evaluation (DUDE) seeks to remediate the halted research progress in understanding visually-rich documents (VRDs).
We present a new dataset with novelties related to types of questions, answers, and document layouts based on multi-industry, multi-domain, and multi-page VRDs of various origins, and dates.
Date: 2023-05-15T08:54:32Z
- Deep learning for table detection and structure recognition: A survey [49.09628624903334]
The goal of this survey is to provide a profound comprehension of the major developments in the field of Table Detection.
We provide an analysis of both classic and new applications in the field.
The datasets and source code of the existing models are organized to provide the reader with a compass on this vast literature.
Date: 2022-11-15T19:42:27Z
- Document AI: Benchmarks, Models and Applications [35.46858492311289]
Document AI refers to the techniques for automatically reading, understanding, and analyzing business documents.
In recent years, the popularity of deep learning technology has greatly advanced the development of Document AI.
This paper briefly reviews some of the representative models, tasks, and benchmark datasets.
Date: 2021-11-16T16:43:07Z
- Document Automation Architectures and Technologies: A Survey [0.0]
This paper surveys the current state of the art in document automation (DA).
The objective of DA is to reduce the manual effort during the generation of documents by automatically integrating input from different sources and assembling documents conforming to defined templates.
There have been reviews of commercial solutions of DA, particularly in the legal domain, but to date there has been no comprehensive review of the academic research on DA architectures and technologies.
Date: 2021-09-23T19:12:26Z
- Data-Driven Design-by-Analogy: State of the Art and Future Directions [11.025196033751786]
Design-by-Analogy (DbA) is a design methodology wherein new solutions, opportunities or designs are generated in a target domain based on inspiration drawn from a source domain.
Recently, the increasingly available design databases and rapidly advancing data science and artificial intelligence technologies have presented new opportunities for developing data-driven methods and tools for DbA support.
Date: 2021-06-03T04:35:34Z
- A Survey of Deep Learning Approaches for OCR and Document Understanding [68.65995739708525]
We review different techniques for document understanding for documents written in English.
We consolidate methodologies present in literature to act as a jumping-off point for researchers exploring this area.
Date: 2020-11-27T03:05:59Z
- A New Neural Search and Insights Platform for Navigating and Organizing AI Research [56.65232007953311]
We introduce a new platform, AI Research Navigator, that combines classical keyword search with neural retrieval to discover and organize relevant literature.
We give an overview of the overall architecture of the system and of the components for document analysis, question answering, search, analytics, expert search, and recommendations.
Date: 2020-10-30T19:12:25Z
- Towards Inheritable Models for Open-Set Domain Adaptation [56.930641754944915]
We introduce a practical Domain Adaptation paradigm where a source-trained model is used to facilitate adaptation in the absence of the source dataset in future.
We present an objective way to quantify inheritability to enable the selection of the most suitable source model for a given target domain, even in the absence of the source data.
Date: 2020-04-09T07:16:30Z
This list is automatically generated from the titles and abstracts of the papers on this site.