Doc2Graph: a Task Agnostic Document Understanding Framework based on
Graph Neural Networks
- URL: http://arxiv.org/abs/2208.11168v1
- Date: Tue, 23 Aug 2022 19:48:10 GMT
- Title: Doc2Graph: a Task Agnostic Document Understanding Framework based on
Graph Neural Networks
- Authors: Andrea Gemelli and Sanket Biswas and Enrico Civitelli and Josep
Lladós and Simone Marinai
- Abstract summary: We propose Doc2Graph, a task-agnostic document understanding framework based on a GNN model.
We evaluate our approach on two challenging datasets for key information extraction in form understanding, invoice layout analysis and table detection.
- Score: 0.965964228590342
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Geometric Deep Learning has recently attracted significant interest in a wide
range of machine learning fields, including document analysis. The application
of Graph Neural Networks (GNNs) has become crucial in various document-related
tasks since they can unravel important structural patterns, fundamental in key
information extraction processes. Previous works in the literature propose
task-driven models and do not take into account the full power of graphs. We
propose Doc2Graph, a task-agnostic document understanding framework based on a
GNN model, to solve different tasks given different types of documents. We
evaluated our approach on two challenging datasets for key information
extraction in form understanding, invoice layout analysis and table detection.
Our code is freely accessible on https://github.com/andreagemelli/doc2graph.
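To make the graph formulation concrete, the sketch below shows one common way such a framework can be set up: each detected document entity (e.g., an OCR'd text region) becomes a graph node, nodes are connected by a layout-based adjacency, and a small message-passing network feeds two heads, one classifying nodes (e.g., semantic entity labeling) and one classifying candidate edges (e.g., entity linking). This is a minimal illustrative sketch in plain PyTorch, not the authors' implementation (see the repository above for that); the class SimpleDocGNN, its layers, and the uniform adjacency are assumptions made only for the example.

```python
import torch
import torch.nn as nn

class SimpleDocGNN(nn.Module):
    """Toy GNN with a node head (entity labeling) and an edge head (entity linking)."""
    def __init__(self, in_dim, hid_dim, n_node_classes, n_edge_classes):
        super().__init__()
        self.proj = nn.Linear(in_dim, hid_dim)
        self.msg = nn.Linear(hid_dim, hid_dim)            # one message-passing step
        self.node_head = nn.Linear(hid_dim, n_node_classes)
        self.edge_head = nn.Linear(2 * hid_dim, n_edge_classes)

    def forward(self, x, adj, edge_pairs):
        # x: (N, in_dim) node features (e.g., text embedding + bounding-box geometry)
        # adj: (N, N) row-normalized adjacency (e.g., k-NN over box centroids)
        # edge_pairs: (E, 2) candidate (source, target) node index pairs to classify
        h = torch.relu(self.proj(x))
        h = torch.relu(h + adj @ self.msg(h))             # aggregate neighbor messages
        node_logits = self.node_head(h)                   # one label per document entity
        pair = torch.cat([h[edge_pairs[:, 0]], h[edge_pairs[:, 1]]], dim=-1)
        edge_logits = self.edge_head(pair)                # one label per candidate link
        return node_logits, edge_logits

# Toy usage: 5 OCR regions with 16-dim features, all ordered pairs as candidate links.
x = torch.randn(5, 16)
adj = torch.full((5, 5), 1.0 / 5)                         # uniform neighbor weights
pairs = torch.tensor([(i, j) for i in range(5) for j in range(5) if i != j])
model = SimpleDocGNN(16, 32, n_node_classes=4, n_edge_classes=2)
node_logits, edge_logits = model(x, adj, pairs)
print(node_logits.shape, edge_logits.shape)               # torch.Size([5, 4]) torch.Size([20, 2])
```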
Related papers
- iText2KG: Incremental Knowledge Graphs Construction Using Large Language Models [0.7165255458140439]
iText2KG is a method for incremental, topic-independent Knowledge Graph construction without post-processing.
Our method demonstrates superior performance compared to baseline methods across three scenarios.
arXiv Detail & Related papers (2024-09-05T06:49:14Z)
- Docs2KG: Unified Knowledge Graph Construction from Heterogeneous Documents Assisted by Large Language Models [11.959445364035734]
80% of enterprise data reside in unstructured files, stored in data lakes that accommodate heterogeneous formats.
We introduce Docs2KG, a novel framework designed to extract multimodal information from diverse and heterogeneous documents.
Docs2KG generates a unified knowledge graph that represents the extracted key information.
arXiv Detail & Related papers (2024-06-05T05:35:59Z)
- DocGraphLM: Documental Graph Language Model for Information Extraction [15.649726614383388]
We introduce DocGraphLM, a framework that combines pre-trained language models with graph semantics.
To achieve this, we propose 1) a joint encoder architecture to represent documents, and 2) a novel link prediction approach to reconstruct document graphs.
Our experiments on three SotA datasets show consistent improvement on IE and QA tasks with the adoption of graph features.
arXiv Detail & Related papers (2024-01-05T14:15:36Z)
- One for All: Towards Training One Graph Model for All Classification Tasks [61.656962278497225]
A unified model for various graph tasks remains underexplored, primarily due to the challenges unique to the graph learning domain.
We propose One for All (OFA), the first general framework that can use a single graph model to address the above challenges.
OFA performs well across different tasks, making it the first general-purpose cross-domain classification model on graphs.
arXiv Detail & Related papers (2023-09-29T21:15:26Z)
- Neural Graph Reasoning: Complex Logical Query Answering Meets Graph Databases [63.96793270418793]
Complex logical query answering (CLQA) is a recently emerged task of graph machine learning.
We introduce the concept of Neural Graph Databases (NGDBs).
NGDB consists of a Neural Graph Storage and a Neural Graph Engine.
arXiv Detail & Related papers (2023-03-26T04:03:37Z)
- GRATIS: Deep Learning Graph Representation with Task-specific Topology and Multi-dimensional Edge Features [27.84193444151138]
We propose the first general graph representation learning framework, called GRATIS.
It can generate a strong graph representation with a task-specific topology and task-specific multi-dimensional edge features from any arbitrary input.
Our framework is effective, robust and flexible, and is a plug-and-play module that can be combined with different backbones and Graph Neural Networks (GNNs).
arXiv Detail & Related papers (2022-11-19T18:42:55Z)
- Arch-Graph: Acyclic Architecture Relation Predictor for Task-Transferable Neural Architecture Search [96.31315520244605]
Arch-Graph is a transferable NAS method that predicts task-specific optimal architectures.
We show Arch-Graph's transferability and high sample efficiency across numerous tasks.
It is able to find architectures in the top 0.16% and 0.29% on average on two search spaces under a budget of only 50 models.
arXiv Detail & Related papers (2022-04-12T16:46:06Z)
- Multimodal Pre-training Based on Graph Attention Network for Document Understanding [32.55734039518983]
GraphDoc is a graph-based model for various document understanding tasks.
It is pre-trained in a multimodal framework by utilizing text, layout, and image information simultaneously.
It learns a generic representation from only 320k unlabeled documents.
arXiv Detail & Related papers (2022-03-25T09:27:50Z)
- Extracting Summary Knowledge Graphs from Long Documents [48.92130466606231]
We introduce a new text-to-graph task of predicting summarized knowledge graphs from long documents.
We develop a dataset of 200k document/graph pairs using automatic and human annotations.
arXiv Detail & Related papers (2020-09-19T04:37:33Z)
- ENT-DESC: Entity Description Generation by Exploring Knowledge Graph [53.03778194567752]
In practice, the input knowledge may contain more than is needed, since the output description may only cover the most significant knowledge.
We introduce a large-scale and challenging dataset to facilitate the study of such a practical scenario in KG-to-text.
We propose a multi-graph structure that is able to represent the original graph information more comprehensively.
arXiv Detail & Related papers (2020-04-30T14:16:19Z)
- Semantic Graphs for Generating Deep Questions [98.5161888878238]
We propose a novel framework which first constructs a semantic-level graph for the input document and then encodes the semantic graph by introducing an attention-based GGNN (Att-GGNN).
On the HotpotQA deep-question centric dataset, our model greatly improves performance on questions requiring reasoning over multiple facts, leading to state-of-the-art performance.
arXiv Detail & Related papers (2020-04-27T10:52:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.