Neural Content Extraction for Poster Generation of Scientific Papers
- URL: http://arxiv.org/abs/2112.08550v1
- Date: Thu, 16 Dec 2021 01:19:37 GMT
- Title: Neural Content Extraction for Poster Generation of Scientific Papers
- Authors: Sheng Xu, Xiaojun Wan
- Abstract summary: The problem of poster generation for scientific papers is under-investigated.
Previous studies focus mainly on poster layout and panel composition, while neglecting the importance of content extraction.
To get both textual and visual elements of a poster panel, a neural extractive model is proposed to extract text, figures and tables of a paper section simultaneously.
- Score: 84.30128728027375
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The problem of poster generation for scientific papers is under-investigated.
Posters often present the most important information of papers, and the task
can be considered as a special form of document summarization. Previous studies
focus mainly on poster layout and panel composition, while neglecting the
importance of content extraction. Besides, their datasets are not publicly
available, which hinders further research. In this paper, we construct a
benchmark dataset from scratch for this task. Then we propose a three-step
framework to tackle this task and focus on the content extraction step in this
study. To get both textual and visual elements of a poster panel, a neural
extractive model is proposed to extract text, figures and tables of a paper
section simultaneously. We conduct experiments on the dataset and also perform
an ablation study. Results demonstrate the efficacy of our proposed model. The
dataset and code will be released.
Related papers
- Interactive Distillation of Large Single-Topic Corpora of Scientific Papers [1.2954493726326113]
A more robust but time-consuming approach is to build the dataset constructively in which a subject matter expert handpicks documents.
Here we showcase a new tool, based on machine learning, for constructively generating targeted datasets of scientific literature.
arXiv Detail & Related papers (2023-09-19T17:18:36Z)
- Automated Feedback Generation for a Chemistry Database and Abstracting Exercise [0.0]
The dataset contained 207 submissions from two consecutive years of the course, summarising a total of 21 different papers from the primary literature.
The model was pre-trained using an available dataset (approx. 15,000 samples) and then fine-tuned on 80% of the submitted dataset.
The sentences in the student submissions are characterised into three classes - background, technique and observation.
arXiv Detail & Related papers (2023-05-22T15:04:26Z)
- Scientific Paper Extractive Summarization Enhanced by Citation Graphs [50.19266650000948]
We focus on leveraging citation graphs to improve scientific paper extractive summarization under different settings.
Preliminary results demonstrate that the citation graph is helpful even in a simple unsupervised framework.
Motivated by this, we propose a Graph-based Supervised Summarization model (GSS) to achieve more accurate results on the task when large-scale labeled data are available.
arXiv Detail & Related papers (2022-12-08T11:53:12Z)
- ReSel: N-ary Relation Extraction from Scientific Text and Tables by Learning to Retrieve and Select [53.071352033539526]
We study the problem of extracting N-ary relations from scientific articles.
Our proposed method ReSel decomposes this task into a two-stage procedure.
Our experiments on three scientific information extraction datasets show that ReSel outperforms state-of-the-art baselines significantly.
arXiv Detail & Related papers (2022-10-26T02:28:02Z)
- TRIE++: Towards End-to-End Information Extraction from Visually Rich Documents [51.744527199305445]
This paper proposes a unified end-to-end information extraction framework from visually rich documents.
Text reading and information extraction can reinforce each other via a well-designed multi-modal context block.
The framework can be trained end-to-end, achieving global optimization.
arXiv Detail & Related papers (2022-07-14T08:52:07Z)
- Topic Modeling Based Extractive Text Summarization [0.0]
We propose a novel method to summarize a text document by clustering its contents based on latent topics.
We utilize the lesser-used and challenging WikiHow dataset in our approach to text summarization.
arXiv Detail & Related papers (2021-06-29T12:28:19Z)
- DOC2PPT: Automatic Presentation Slides Generation from Scientific Documents [76.19748112897177]
We present a novel task and approach for document-to-slide generation.
We propose a hierarchical sequence-to-sequence approach to tackle our task in an end-to-end manner.
Our approach exploits the inherent structures within documents and slides and incorporates paraphrasing and layout prediction modules to generate slides.
arXiv Detail & Related papers (2021-01-28T03:21:17Z)
- Learning to Emphasize: Dataset and Shared Task Models for Selecting Emphasis in Presentation Slides [31.540208729354354]
Emphasizing strong leading words in presentation slides can allow the audience to direct the eye to certain focal points instead of reading the entire slide.
Motivated by this demand, we study the problem of Emphasis Selection (ES) in presentation slides.
We introduce a new dataset containing presentation slides with a wide variety of topics, each annotated with emphasis words in a crowdsourced setting.
arXiv Detail & Related papers (2021-01-02T06:54:55Z)
- Enhancing Extractive Text Summarization with Topic-Aware Graph Neural Networks [21.379555672973975]
This paper proposes a graph neural network (GNN)-based extractive summarization model.
Our model integrates a joint neural topic model (NTM) to discover latent topics, which can provide document-level features for sentence selection.
The experimental results demonstrate that our model achieves state-of-the-art results on the CNN/DM and NYT datasets.
arXiv Detail & Related papers (2020-10-13T09:30:04Z)
- From Standard Summarization to New Tasks and Beyond: Summarization with Manifold Information [77.89755281215079]
Text summarization is the research area aiming at creating a short and condensed version of the original document.
In real-world applications, most of the data is not in a plain text format.
This paper focuses on the survey of these new summarization tasks and approaches in the real-world application.
arXiv Detail & Related papers (2020-05-10T14:59:36Z)