DocSynthv2: A Practical Autoregressive Modeling for Document Generation
- URL: http://arxiv.org/abs/2406.08354v1
- Date: Wed, 12 Jun 2024 16:00:16 GMT
- Title: DocSynthv2: A Practical Autoregressive Modeling for Document Generation
- Authors: Sanket Biswas, Rajiv Jain, Vlad I. Morariu, Jiuxiang Gu, Puneet Mathur, Curtis Wigington, Tong Sun, Josep Lladós,
- Abstract summary: This paper proposes a novel approach called Doc Synthv2 through the development of a simple yet effective autoregressive structured model.
Our model, distinct in its integration of both layout and textual cues, marks a step beyond existing layout-generation approaches.
- Score: 43.84027661517748
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: While the generation of document layouts has been extensively explored, comprehensive document generation encompassing both layout and content presents a more complex challenge. This paper delves into this advanced domain, proposing a novel approach called DocSynthv2 through the development of a simple yet effective autoregressive structured model. Our model, distinct in its integration of both layout and textual cues, marks a step beyond existing layout-generation approaches. By focusing on the relationship between the structural elements and the textual content within documents, we aim to generate cohesive and contextually relevant documents without any reliance on visual components. Through experimental studies on our curated benchmark for the new task, we demonstrate the ability of our model combining layout and textual information in enhancing the generation quality and relevance of documents, opening new pathways for research in document creation and automated design. Our findings emphasize the effectiveness of autoregressive models in handling complex document generation tasks.
Related papers
- A Rhetorical Relations-Based Framework for Tailored Multimedia Document Summarization [0.0]
This paper introduces a novel framework for multimedia document summarization.
The framework capitalizes on the inherent structure of the document to craft coherent and succinct summaries.
Weighting algorithms are employed to assign significance values to document units, thereby enabling effective ranking and selection of relevant content.
arXiv Detail & Related papers (2024-12-26T09:29:59Z) - Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction [23.47150047875133]
Document parsing is essential for converting unstructured and semi-structured documents into machine-readable data.
Document parsing plays an indispensable role in both knowledge base construction and training data generation.
This paper discusses the challenges faced by modular document parsing systems and vision-language models in handling complex layouts.
arXiv Detail & Related papers (2024-10-28T16:11:35Z) - Unified Multimodal Interleaved Document Representation for Retrieval [57.65409208879344]
We propose a method that holistically embeds documents interleaved with multiple modalities.
We merge the representations of segmented passages into one single document representation.
We show that our approach substantially outperforms relevant baselines.
arXiv Detail & Related papers (2024-10-03T17:49:09Z) - Enhancing Visually-Rich Document Understanding via Layout Structure
Modeling [91.07963806829237]
We propose GraphLM, a novel document understanding model that injects layout knowledge into the model.
We evaluate our model on various benchmarks, including FUNSD, XFUND and CORD, and achieve state-of-the-art results.
arXiv Detail & Related papers (2023-08-15T13:53:52Z) - Unifying Vision, Text, and Layout for Universal Document Processing [105.36490575974028]
We propose a Document AI model which unifies text, image, and layout modalities together with varied task formats, including document understanding and generation.
Our method sets the state-of-the-art on 9 Document AI tasks, e.g., document understanding and QA, across diverse data domains like finance reports, academic papers, and websites.
arXiv Detail & Related papers (2022-12-05T22:14:49Z) - Multi-Vector Models with Textual Guidance for Fine-Grained Scientific
Document Similarity [11.157086694203201]
We present a new scientific document similarity model based on matching fine-grained aspects.
Our model is trained using co-citation contexts that describe related paper aspects as a novel form of textual supervision.
arXiv Detail & Related papers (2021-11-16T11:12:30Z) - DocSynth: A Layout Guided Approach for Controllable Document Image
Synthesis [16.284895792639137]
This paper presents a novel approach, called Doc Synth, to automatically synthesize document images based on a given layout.
In this work, given a spatial layout (bounding boxes with object categories) as a reference by the user, our proposed Doc Synth model learns to generate a set of realistic document images.
The results highlight that our model can successfully generate realistic and diverse document images with multiple objects.
arXiv Detail & Related papers (2021-07-06T14:24:30Z) - Focused Attention Improves Document-Grounded Generation [111.42360617630669]
Document grounded generation is the task of using the information provided in a document to improve text generation.
This work focuses on two different document grounded generation tasks: Wikipedia Update Generation task and Dialogue response generation.
arXiv Detail & Related papers (2021-04-26T16:56:29Z) - Reasoning with Latent Structure Refinement for Document-Level Relation
Extraction [20.308845516900426]
We propose a novel model that empowers the relational reasoning across sentences by automatically inducing the latent document-level graph.
Specifically, our model achieves an F1 score of 59.05 on a large-scale document-level dataset (DocRED)
arXiv Detail & Related papers (2020-05-13T13:36:09Z) - Explaining Relationships Between Scientific Documents [55.23390424044378]
We address the task of explaining relationships between two scientific documents using natural language text.
In this paper we establish a dataset of 622K examples from 154K documents.
arXiv Detail & Related papers (2020-02-02T03:54:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.