D2S: Document-to-Slide Generation Via Query-Based Text Summarization
- URL: http://arxiv.org/abs/2105.03664v1
- Date: Sat, 8 May 2021 10:29:41 GMT
- Title: D2S: Document-to-Slide Generation Via Query-Based Text Summarization
- Authors: Edward Sun, Yufang Hou, Dakuo Wang, Yunfeng Zhang, Nancy X.R. Wang
- Abstract summary: We contribute a new dataset, SciDuet, consisting of pairs of papers and their corresponding slide decks from recent years' NLP and ML conferences.
Secondly, we present D2S, a novel system that tackles the document-to-slides task with a two-step approach.
Our evaluation suggests that long-form QA outperforms state-of-the-art summarization baselines on both automated ROUGE metrics and qualitative human evaluation.
- Score: 27.576875048631265
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Presentations are critical for communication in all areas of our lives, yet
the creation of slide decks is often tedious and time-consuming. There has been
limited research aiming to automate the document-to-slides generation process,
and all prior attempts face a critical challenge: no publicly available dataset
exists for training and benchmarking. In this work, we first contribute a new dataset, SciDuet,
consisting of pairs of papers and their corresponding slide decks from recent
years' NLP and ML conferences (e.g., ACL). Secondly, we present D2S, a novel
system that tackles the document-to-slides task with a two-step approach: 1)
Use slide titles to retrieve relevant and engaging text, figures, and tables;
2) Summarize the retrieved context into bullet points with long-form question
answering. Our evaluation suggests that long-form QA outperforms
state-of-the-art summarization baselines on both automated ROUGE metrics and
qualitative human evaluation.
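The two-step recipe above (title-driven retrieval, then summarization into bullets) can be sketched minimally. The snippet below is an illustrative bag-of-words retriever for step 1 only; the `retrieve_context` function, the toy sentences, and the cosine scoring are assumptions for demonstration, not D2S's actual retrieval model or long-form QA component.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Lowercased word tokens; a stand-in for real preprocessing.
    return re.findall(r"[a-z]+", text.lower())

def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    common = set(a) & set(b)
    num = sum(a[t] * b[t] for t in common)
    denom = (math.sqrt(sum(v * v for v in a.values()))
             * math.sqrt(sum(v * v for v in b.values())))
    return num / denom if denom else 0.0

def retrieve_context(slide_title, paper_sentences, k=2):
    # Step 1: use the slide title as a query to rank paper sentences.
    # Step 2 (summarizing the retrieved context with long-form QA) is
    # omitted here, as it requires a trained model.
    q = Counter(tokenize(slide_title))
    ranked = sorted(paper_sentences,
                    key=lambda s: cosine(q, Counter(tokenize(s))),
                    reverse=True)
    return ranked[:k]

sentences = [
    "We introduce the SciDuet dataset of paper-slide pairs.",
    "Our retrieval step selects text relevant to each slide title.",
    "Long-form question answering condenses context into bullet points.",
    "Experiments use papers from recent NLP and ML conferences.",
]
context = retrieve_context("Retrieval of relevant text", sentences, k=1)
```

In the paper's pipeline the retrieved context (including figures and tables) is then fed, together with the slide title as the "question", to a long-form QA model that generates the bullet points.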
Related papers
- Integrating Planning into Single-Turn Long-Form Text Generation [66.08871753377055]
We propose to use planning to generate long-form content.
Our main novelty lies in a single auxiliary task that does not require multiple rounds of prompting or planning.
Our experiments demonstrate, on two datasets from different domains, that LLMs fine-tuned with the auxiliary task generate higher-quality documents.
arXiv Detail & Related papers (2024-10-08T17:02:40Z)
- Write Summary Step-by-Step: A Pilot Study of Stepwise Summarization [48.57273563299046]
We propose the task of Stepwise Summarization, which aims to generate a new appended summary each time a new document is proposed.
The appended summary should not only summarize the newly added content but also be coherent with the previous summary.
We show that SSG achieves state-of-the-art performance in terms of both automatic metrics and human evaluations.
arXiv Detail & Related papers (2024-06-08T05:37:26Z)
- The Power of Summary-Source Alignments [62.76959473193149]
Multi-document summarization (MDS) is a challenging task, often decomposed to subtasks of salience and redundancy detection.
Alignment of corresponding sentences between a reference summary and its source documents has been leveraged to generate training data.
This paper proposes extending the summary-source alignment framework by applying it at the more fine-grained proposition span level.
arXiv Detail & Related papers (2024-06-02T19:35:19Z)
- Hybrid Long Document Summarization using C2F-FAR and ChatGPT: A Practical Study [1.933681537640272]
ChatGPT is the latest breakthrough in the field of large language models (LLMs)
We propose a hybrid extraction and summarization pipeline for long documents such as business articles and books.
Our results show that the use of ChatGPT is a very promising but not yet mature approach for summarizing long documents.
arXiv Detail & Related papers (2023-06-01T21:58:33Z)
- SQuALITY: Building a Long-Document Summarization Dataset the Hard Way [31.832673451018543]
We hire highly-qualified contractors to read stories and write original summaries from scratch.
To amortize reading time, we collect five summaries per document, with the first giving an overview and the subsequent four addressing specific questions.
Experiments with state-of-the-art summarization systems show that our dataset is challenging and that existing automatic evaluation metrics are weak indicators of quality.
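The automatic metrics critiqued above (ROUGE, as used in D2S's evaluation) reduce to n-gram overlap between a candidate and a reference summary. Below is a minimal ROUGE-1 F1 sketch in plain Python for illustration; it is my own toy implementation, not the official `rouge-score` package, and it omits stemming and stopword handling.

```python
from collections import Counter

def rouge1_f1(candidate, reference):
    # Unigram (ROUGE-1) overlap between candidate and reference summaries.
    c = Counter(candidate.lower().split())
    r = Counter(reference.lower().split())
    overlap = sum((c & r).values())  # multiset intersection of unigrams
    if overlap == 0:
        return 0.0
    precision = overlap / sum(c.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)

score = rouge1_f1("the cat sat", "the cat sat on the mat")
```

A candidate that copies reference words scores highly even if it is incoherent, which is one reason surface-overlap metrics can be weak indicators of summary quality.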
arXiv Detail & Related papers (2022-05-23T17:02:07Z)
- Summarization with Graphical Elements [55.5913491389047]
We propose a new task: summarization with graphical elements.
We collect a high quality human labeled dataset to support research into the task.
arXiv Detail & Related papers (2022-04-15T17:16:41Z)
- Long Document Summarization with Top-down and Bottom-up Inference [113.29319668246407]
We propose a principled inference framework to improve summarization models on two aspects.
Our framework assumes a hierarchical latent structure of a document where the top-level captures the long range dependency.
We demonstrate the effectiveness of the proposed framework on a diverse set of summarization datasets.
arXiv Detail & Related papers (2022-03-15T01:24:51Z)
- Summ^N: A Multi-Stage Summarization Framework for Long Input Dialogues and Documents [13.755637074366813]
Summ^N is a simple, flexible, and effective multi-stage framework for input texts longer than the maximum context lengths of typical pretrained LMs.
It can process input text of arbitrary length by adjusting the number of stages while keeping the LM context size fixed.
Our experiments demonstrate that Summ^N significantly outperforms previous state-of-the-art methods.
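Summ^N's stage-wise idea can be illustrated with a toy sketch. The `summarize` callable below is a placeholder standing in for the pretrained LM (the real framework fine-tunes a model per stage), and the truncating demo summarizer is an assumption purely so the loop terminates visibly.

```python
def multi_stage_summarize(text, summarize, context_len=150, chunk_len=100):
    # Coarse stages: split the source into chunks that fit the LM context,
    # summarize each chunk, and concatenate the chunk summaries.
    # Repeat (adding stages) until the whole text fits the context window.
    # Note: `summarize` must compress its input, or this loop never ends.
    while len(text) > context_len:
        chunks = [text[i:i + chunk_len] for i in range(0, len(text), chunk_len)]
        text = " ".join(summarize(chunk) for chunk in chunks)
    # Fine stage: one final summary over the fully compressed text.
    return summarize(text)

# Toy stand-in "summarizer" that just truncates to 30 characters.
truncate = lambda t: t[:30]
result = multi_stage_summarize("x" * 400, truncate)
```

Because the number of coarse stages grows with input length while the chunk size stays fixed, the same fixed-context model can handle arbitrarily long inputs, which is the core design choice of the framework.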
arXiv Detail & Related papers (2021-10-16T06:19:54Z)
- Document Modeling with Graph Attention Networks for Multi-grained Machine Reading Comprehension [127.3341842928421]
Natural Questions is a new challenging machine reading comprehension benchmark.
It has two-grained answers: a long answer (typically a paragraph) and a short answer (one or more entities inside the long answer).
Existing methods treat these two sub-tasks individually during training while ignoring their dependencies.
We present a novel multi-grained machine reading comprehension framework that models the hierarchical nature of documents.
arXiv Detail & Related papers (2020-05-12T14:20:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.