Semantic Parsing of Interpage Relations
- URL: http://arxiv.org/abs/2205.13530v1
- Date: Thu, 26 May 2022 17:50:43 GMT
- Title: Semantic Parsing of Interpage Relations
- Authors: Mehmet Arif Demirtaş, Berke Oral, Mehmet Yasin Akpınar, Onur Deniz
- Abstract summary: We formalize the task as semantic parsing of interpage relations and we propose an end-to-end approach for interpage dependency extraction.
We also design a multi-task training approach to jointly optimize for page embeddings to be used in segmentation, classification, and parsing of the page dependencies.
Our experimental results show that the proposed method increased LAS by 41 percentage points for semantic parsing, increased accuracy by 33 percentage points for page stream segmentation, and 45 percentage points for page classification over a naive baseline.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Page-level analysis of documents has been a topic of interest in digitization
efforts, and multimodal approaches have been applied to both classification and
page stream segmentation. In this work, we focus on capturing finer semantic
relations between pages of a multi-page document. To this end, we formalize the
task as semantic parsing of interpage relations and we propose an end-to-end
approach for interpage dependency extraction, inspired by the dependency
parsing literature. We further design a multi-task training approach to jointly
optimize for page embeddings to be used in segmentation, classification, and
parsing of the page dependencies using textual and visual features extracted
from the pages. Moreover, we also combine the features from two modalities to
obtain multimodal page embeddings. To the best of our knowledge, this is the
first study to extract rich semantic interpage relations from multi-page
documents. Our experimental results show that the proposed method increased LAS
by 41 percentage points for semantic parsing, increased accuracy by 33
percentage points for page stream segmentation, and 45 percentage points for
page classification over a naive baseline.
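LAS in the results above is the labeled attachment score from the dependency parsing literature: the fraction of items whose predicted head and relation label both match the gold annotation. The following is a minimal sketch of how such a score could be computed over page-level arcs. It is not the authors' evaluation code. The function name `attachment_scores`, the use of head index 0 as a root, and the relation labels are hypothetical choices made for illustration only.

```python
from typing import List, Tuple

# One arc per page: (head page index, relation label).
# Head index 0 stands for a root node (an assumption of this sketch).
Arc = Tuple[int, str]


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Tuple[float, float]:
    """Return (UAS, LAS) for two equally long lists of page arcs.

    UAS counts pages whose predicted head is correct.
    LAS counts pages whose predicted head and label are both correct.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must contain one arc per page")
    if not gold:
        raise ValueError("cannot score an empty document")

    head_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    arc_hits = sum(g == p for g, p in zip(gold, pred))
    n = len(gold)
    return head_hits / n, arc_hits / n


# Hypothetical four-page document; the labels are illustrative, not from the paper.
gold = [(0, "root"), (1, "continuation"), (1, "attachment"), (0, "root")]
pred = [(0, "root"), (1, "continuation"), (2, "attachment"), (0, "continuation")]

uas, las = attachment_scores(gold, pred)
print(f"UAS={uas:.2f} LAS={las:.2f}")  # UAS=0.75 LAS=0.50
```

In this toy example the third page has the wrong head and the fourth page has the right head but the wrong label, so three of four heads match (UAS) while only two of four full arcs match (LAS).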
Related papers
- Unified Multi-Modal Interleaved Document Representation for Information Retrieval [57.65409208879344]
We produce more comprehensive and nuanced document representations by holistically embedding documents interleaved with different modalities.
Specifically, we achieve this by leveraging the capability of recent vision-language models that enable the processing and integration of text, images, and tables into a unified format and representation.
arXiv Detail & Related papers (2024-10-03T17:49:09Z)
- ReSel: N-ary Relation Extraction from Scientific Text and Tables by Learning to Retrieve and Select [53.071352033539526]
We study the problem of extracting N-ary relations from scientific articles.
Our proposed method ReSel decomposes this task into a two-stage procedure.
Our experiments on three scientific information extraction datasets show that ReSel outperforms state-of-the-art baselines significantly.
arXiv Detail & Related papers (2022-10-26T02:28:02Z)
- TRIE++: Towards End-to-End Information Extraction from Visually Rich Documents [51.744527199305445]
This paper proposes a unified end-to-end information extraction framework from visually rich documents.
Text reading and information extraction can reinforce each other via a well-designed multi-modal context block.
The framework can be trained end-to-end, achieving global optimization.
arXiv Detail & Related papers (2022-07-14T08:52:07Z)
- Page Segmentation using Visual Adjacency Analysis [5.9521013526545925]
We propose a novel page segmentation approach based on visual analysis of localized adjacency regions.
It combines DOM attributes and visual analysis to build features of a given page and guide an unsupervised clustering.
We evaluate our approach on 35 real-world web pages, and examine the effectiveness and efficiency of segmentation.
arXiv Detail & Related papers (2021-12-11T00:20:30Z)
- Modeling Endorsement for Multi-Document Abstractive Summarization [10.166639983949887]
A crucial difference between single- and multi-document summarization is how salient content manifests itself in the document(s).
In this paper, we model the cross-document endorsement effect and its utilization in multiple document summarization.
Our method generates a synopsis from each document, which serves as an endorser to identify salient content from other documents.
arXiv Detail & Related papers (2021-10-15T03:55:42Z)
- iFacetSum: Coreference-based Interactive Faceted Summarization for Multi-Document Exploration [63.272359227081836]
iFacetSum integrates interactive summarization together with faceted search.
Fine-grained facets are automatically produced based on cross-document coreference pipelines.
arXiv Detail & Related papers (2021-09-23T20:01:11Z)
- Matching Visual Features to Hierarchical Semantic Topics for Image Paragraph Captioning [50.08729005865331]
This paper develops a plug-and-play hierarchical-topic-guided image paragraph generation framework.
To capture the correlations between the image and text at multiple levels of abstraction, we design a variational inference network.
To guide the paragraph generation, the learned hierarchical topics and visual features are integrated into the language model.
arXiv Detail & Related papers (2021-05-10T06:55:39Z)
- Topical Change Detection in Documents via Embeddings of Long Sequences [4.13878392637062]
We formulate the task of text segmentation as an independent supervised prediction task.
By fine-tuning on paragraphs of similar sections, we are able to show that learned features encode topic information.
Unlike previous approaches, which mostly operate at the sentence level, we consistently use a broader context.
arXiv Detail & Related papers (2020-12-07T12:09:37Z)
- WikiAsp: A Dataset for Multi-domain Aspect-based Summarization [69.13865812754058]
We propose WikiAsp, a large-scale dataset for multi-domain aspect-based summarization.
Specifically, we build the dataset using Wikipedia articles from 20 different domains, using the section titles and boundaries of each article as a proxy for aspect annotation.
Results highlight key challenges that existing summarization models face in this setting, such as proper pronoun handling of quoted sources and consistent explanation of time-sensitive events.
arXiv Detail & Related papers (2020-11-16T10:02:52Z)
- An Evaluation of DNN Architectures for Page Segmentation of Historical Newspapers [0.0]
We evaluate 11 different published deep neural network backbone architectures and 9 different tiling and scaling configurations for separating text, tables or table column lines.
We show the influence of the number of labels and the number of training pages on the segmentation quality, which we measure using the Matthews Correlation Coefficient.
Our results show that (depending on the task) Inception-ResNet-v2 and EfficientNet backbones work best and that vertical tiling is generally preferable to other tiling approaches.
arXiv Detail & Related papers (2020-04-15T20:05:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.