Modelling the semantics of text in complex document layouts using graph
transformer networks
- URL: http://arxiv.org/abs/2202.09144v1
- Date: Fri, 18 Feb 2022 11:49:06 GMT
- Title: Modelling the semantics of text in complex document layouts using graph
transformer networks
- Authors: Thomas Roland Barillot (1), Jacob Saks (1), Polena Lilyanova (1),
Edward Torgas (1), Yachen Hu (1), Yuanqing Liu (1), Varun Balupuri (1) and
Paul Gaskell (1) ((1) BlackRock Inc.)
- Abstract summary: We propose a model that approximates the human reading pattern of a document and outputs a unique semantic representation for every text span.
We base our architecture on a graph representation of the structured text, and we demonstrate that not only can we retrieve semantically similar information across documents but also that the embedding space we generate captures useful semantic information.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Representing structured text from complex documents typically calls for
different machine learning techniques, such as language models for paragraphs
and convolutional neural networks (CNNs) for table extraction, which prohibits
drawing links between text spans from different content types. In this article
we propose a model that approximates the human reading pattern of a document
and outputs a unique semantic representation for every text span, irrespective
of the content type in which it appears. We base our architecture on a graph
representation of the structured text, and we demonstrate that not only can we
retrieve semantically similar information across documents but also that the
embedding space we generate captures useful semantic information, similar to
language models that work only on text sequences.
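The core idea of the abstract can be sketched in code: treat each text span (paragraph sentence, table cell, heading) as a graph node, connect nodes that are adjacent in reading order or layout, and let an attention layer propagate information between them so every span gets an embedding regardless of its content type. The following is a minimal illustrative sketch, not the authors' actual architecture; the span features, edge choices, and single-layer attention step are all simplifying assumptions.

```python
# Hypothetical sketch of a document-as-graph with one round of
# attention-weighted message passing over text spans.
# All names and features here are illustrative, not the paper's model.
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Nodes: text spans from different content types, each with a toy
# feature vector standing in for a learned embedding.
spans = {
    0: {"text": "Revenue grew 12%", "type": "paragraph",  "h": [1.0, 0.0]},
    1: {"text": "Revenue",          "type": "table_cell", "h": [0.9, 0.1]},
    2: {"text": "12%",              "type": "table_cell", "h": [0.2, 0.8]},
}
# Edges approximate a human reading pattern: reading order plus layout
# adjacency between the paragraph and the table cells it refers to.
edges = {0: [1, 2], 1: [0, 2], 2: [0, 1]}

def attention_step(spans, edges):
    """One simplified graph-attention layer: each span is re-embedded
    as a softmax-weighted average of its neighbours' features."""
    out = {}
    for i, nbrs in edges.items():
        scores = [dot(spans[i]["h"], spans[j]["h"]) for j in nbrs]
        weights = softmax(scores)
        dim = len(spans[i]["h"])
        out[i] = [sum(w * spans[j]["h"][d] for w, j in zip(weights, nbrs))
                  for d in range(dim)]
    return out

updated = attention_step(spans, edges)
```

After the step, the paragraph node's embedding mixes information from the table cells, which is what lets retrieval cross content-type boundaries; a real model would stack several such layers with learned projections.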
Related papers
- factgenie: A Framework for Span-based Evaluation of Generated Texts [1.6864244598342872]
Span annotations can capture various span-based phenomena such as semantic inaccuracies or irrelevant text.
Our framework consists of a web interface for data visualization and gathering text annotations.
arXiv Detail & Related papers (2024-07-25T08:33:23Z)
- Patton: Language Model Pretraining on Text-Rich Networks [33.914163727649466]
We propose PretrAining on TexT-Rich NetwOrk framework Patton for text-rich networks.
Patton includes two pretraining strategies: network-contextualized masked language modeling and masked node prediction.
We conduct experiments on four downstream tasks in five datasets from both academic and e-commerce domains.
arXiv Detail & Related papers (2023-05-20T19:17:10Z)
- WordStylist: Styled Verbatim Handwritten Text Generation with Latent Diffusion Models [8.334487584550185]
We present a latent diffusion-based method for styled text-to-text-content-image generation on word-level.
Our proposed method is able to generate realistic word image samples from different writer styles.
We show that the proposed model produces samples that are aesthetically pleasing, that help boost text recognition performance, and that achieve writer retrieval scores similar to real data.
arXiv Detail & Related papers (2023-03-29T10:19:26Z)
- Pre-training Language Model Incorporating Domain-specific Heterogeneous Knowledge into A Unified Representation [49.89831914386982]
We propose a unified pre-trained language model (PLM) for all forms of text, including unstructured text, semi-structured text, and well-structured text.
Our approach outperforms the pre-training of plain text using only 1/4 of the data.
arXiv Detail & Related papers (2021-09-02T16:05:24Z)
- Full Page Handwriting Recognition via Image to Sequence Extraction [0.0]
The model achieves a new state of the art in full-page recognition on the IAM dataset.
It is deployed in production as part of a commercial web application.
arXiv Detail & Related papers (2021-03-11T04:37:29Z)
- Minimally-Supervised Structure-Rich Text Categorization via Learning on Text-Rich Networks [61.23408995934415]
We propose a novel framework for minimally supervised categorization by learning from the text-rich network.
Specifically, we jointly train two modules with different inductive biases -- a text analysis module for text understanding and a network learning module for class-discriminative, scalable network learning.
Our experiments show that given only three seed documents per category, our framework can achieve an accuracy of about 92%.
arXiv Detail & Related papers (2021-02-23T04:14:34Z)
- Neural Deepfake Detection with Factual Structure of Text [78.30080218908849]
We propose a graph-based model for deepfake detection of text.
Our approach represents the factual structure of a given document as an entity graph.
Our model can distinguish the difference in the factual structure between machine-generated text and human-written text.
arXiv Detail & Related papers (2020-10-15T02:35:31Z)
- A Graph Representation of Semi-structured Data for Web Question Answering [96.46484690047491]
We propose a novel graph representation of Web tables and lists based on a systematic categorization of the components in semi-structured data as well as their relations.
Our method improves F1 score by 3.90 points over the state-of-the-art baselines.
arXiv Detail & Related papers (2020-10-14T04:01:54Z)
- A Multi-Perspective Architecture for Semantic Code Search [58.73778219645548]
We propose a novel multi-perspective cross-lingual neural framework for code--text matching.
Our experiments on the CoNaLa dataset show that our proposed model yields better performance than previous approaches.
arXiv Detail & Related papers (2020-05-06T04:46:11Z)
- Learning to Select Bi-Aspect Information for Document-Scale Text Content Manipulation [50.01708049531156]
We focus on a new practical task, document-scale text content manipulation, which is the opposite of text style transfer.
In detail, the input is a set of structured records and a reference text for describing another recordset.
The output is a summary that accurately describes the partial content in the source recordset with the same writing style of the reference.
arXiv Detail & Related papers (2020-02-24T12:52:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.