Language Independent Neuro-Symbolic Semantic Parsing for Form Understanding
- URL: http://arxiv.org/abs/2305.04460v1
- Date: Mon, 8 May 2023 05:03:07 GMT
- Title: Language Independent Neuro-Symbolic Semantic Parsing for Form Understanding
- Authors: Bhanu Prakash Voutharoja and Lizhen Qu and Fatemeh Shiri
- Abstract summary: We propose a unique entity-relation graph parsing method for scanned forms called LAGNN.
Our model parses a form into a word-relation graph in order to identify entities and relations jointly.
Our model relies solely on the relative spacing between bounding boxes from layout information, which facilitates transfer across languages.
- Score: 11.042088913869462
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent works on form understanding mostly employ multimodal transformers or
large-scale pre-trained language models. These models need ample data for
pre-training. In contrast, humans can usually identify key-value pairs from
a form by looking only at its layout, even if they don't comprehend the language
used. No prior research has been conducted to investigate how helpful layout
information alone is for form understanding. Hence, we propose a unique
entity-relation graph parsing method for scanned forms called LAGNN, a
language-independent Graph Neural Network model. Our model parses a form into a
word-relation graph in order to identify entities and relations jointly and
reduce the time complexity of inference. This graph is then transformed by
deterministic rules into a fully connected entity-relation graph. Our model
relies solely on the relative spacing between bounding boxes, derived from
layout information, which facilitates transfer across languages. To further improve
the performance of LAGNN, and achieve isomorphism between entity-relation
graphs and word-relation graphs, we use integer linear programming (ILP) based
inference. Code is publicly available at https://github.com/Bhanu068/LAGNN
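The layout-only idea in the abstract can be sketched roughly as follows. This is a hypothetical illustration, not the released LAGNN code: the `Box` class, the choice of normalized center-offset features, and the distance threshold are all assumptions; in the paper, a GNN would then classify the edges of the resulting word-relation graph into relations.

```python
# Hypothetical sketch (not the paper's code) of the core layout-only idea:
# edges between word bounding boxes carry only relative-spacing features,
# so the word-relation graph is independent of the form's language.
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Box:
    """Axis-aligned word bounding box: (x0, y0) top-left, (x1, y1) bottom-right."""
    x0: float
    y0: float
    x1: float
    y1: float


def spacing_features(a: Box, b: Box) -> tuple[float, float]:
    """Center-to-center offsets, normalized by the mean box width/height.

    Using only relative spacing (no text content) keeps the features
    independent of the language printed on the form.
    """
    acx, acy = (a.x0 + a.x1) / 2, (a.y0 + a.y1) / 2
    bcx, bcy = (b.x0 + b.x1) / 2, (b.y0 + b.y1) / 2
    mean_w = ((a.x1 - a.x0) + (b.x1 - b.x0)) / 2
    mean_h = ((a.y1 - a.y0) + (b.y1 - b.y0)) / 2
    return (bcx - acx) / mean_w, (bcy - acy) / mean_h


def build_word_graph(boxes: list[Box], max_dist: float = 3.0):
    """Connect word pairs whose normalized offset is small; each edge stores
    the spacing features that a downstream GNN could classify into relations."""
    edges = []
    for i, j in combinations(range(len(boxes)), 2):
        dx, dy = spacing_features(boxes[i], boxes[j])
        if (dx * dx + dy * dy) ** 0.5 <= max_dist:
            edges.append((i, j, (dx, dy)))
    return edges
```

Because the edge features are normalized by box size, the same thresholds transfer across forms with different resolutions, which is one plausible reading of the "easy transfer" claim above.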
Related papers
- GLaM: Fine-Tuning Large Language Models for Domain Knowledge Graph Alignment via Neighborhood Partitioning and Generative Subgraph Encoding [39.67113788660731]
We introduce a framework for developing Graph-aligned LAnguage Models (GLaM)
We demonstrate that grounding the models in specific graph-based knowledge expands the models' capacity for structure-based reasoning.
arXiv Detail & Related papers (2024-02-09T19:53:29Z)
- Coreference Graph Guidance for Mind-Map Generation [5.289044688419791]
Recently, a state-of-the-art method encodes the sentences of a document sequentially and converts them into a relation graph via a sequence-to-graph method.
We propose a coreference-guided mind-map generation network (CMGN) to incorporate external structure knowledge.
arXiv Detail & Related papers (2023-12-19T09:39:27Z)
- GraphextQA: A Benchmark for Evaluating Graph-Enhanced Large Language Models [33.56759621666477]
We present a benchmark dataset for evaluating the integration of graph knowledge into language models.
The proposed dataset is designed to evaluate graph-language models' ability to understand graphs and make use of them for answer generation.
We perform experiments with language-only models and the proposed graph-language model to validate the usefulness of the paired graphs and to demonstrate the difficulty of the task.
arXiv Detail & Related papers (2023-10-12T16:46:58Z)
- Conversational Semantic Parsing using Dynamic Context Graphs [68.72121830563906]
We consider the task of conversational semantic parsing over general-purpose knowledge graphs (KGs) with millions of entities and thousands of relation types.
We focus on models which are capable of interactively mapping user utterances into executable logical forms.
arXiv Detail & Related papers (2023-05-04T16:04:41Z)
- Learnable Graph Matching: A Practical Paradigm for Data Association [74.28753343714858]
We propose a general learnable graph matching method to address these issues.
Our method achieves state-of-the-art performance on several MOT datasets.
For image matching, our method outperforms state-of-the-art methods on a popular indoor dataset, ScanNet.
arXiv Detail & Related papers (2023-03-27T17:39:00Z)
- Dynamic Graph Message Passing Networks for Visual Recognition [112.49513303433606]
Modelling long-range dependencies is critical for scene understanding tasks in computer vision.
A fully-connected graph is beneficial for such modelling, but its computational overhead is prohibitive.
We propose a dynamic graph message passing network that significantly reduces the computational complexity.
arXiv Detail & Related papers (2022-09-20T14:41:37Z)
- Explanation Graph Generation via Pre-trained Language Models: An Empirical Study with Contrastive Learning [84.35102534158621]
We study pre-trained language models that generate explanation graphs in an end-to-end manner.
We propose simple yet effective ways of graph perturbations via node and edge edit operations.
Our methods lead to significant improvements in both structural and semantic accuracy of explanation graphs.
arXiv Detail & Related papers (2022-04-11T00:58:27Z)
- Neural Graph Matching for Pre-training Graph Neural Networks [72.32801428070749]
Graph neural networks (GNNs) have shown powerful capacity for modeling structural data.
We present a novel Graph Matching based GNN Pre-Training framework, called GMPT.
The proposed method can be applied to fully self-supervised pre-training and coarse-grained supervised pre-training.
arXiv Detail & Related papers (2022-03-03T09:53:53Z)
- Visual FUDGE: Form Understanding via Dynamic Graph Editing [2.012425476229879]
The proposed FUDGE model formulates form understanding as a problem on a graph of text elements.
It uses a Graph Convolutional Network to predict changes to the graph.
FUDGE is state-of-the-art on the historical NAF dataset.
arXiv Detail & Related papers (2021-05-17T23:18:39Z)
- GraphFormers: GNN-nested Transformers for Representation Learning on Textual Graph [53.70520466556453]
We propose GraphFormers, where layerwise GNN components are nested alongside the transformer blocks of language models.
With the proposed architecture, the text encoding and the graph aggregation are fused into an iterative workflow.
In addition, a progressive learning strategy is introduced, where the model is successively trained on manipulated and original data to reinforce its capability of integrating information on graphs.
arXiv Detail & Related papers (2021-05-06T12:20:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.