Diverse, Controllable, and Keyphrase-Aware: A Corpus and Method for News
Multi-Headline Generation
- URL: http://arxiv.org/abs/2004.03875v2
- Date: Sun, 4 Oct 2020 03:02:07 GMT
- Title: Diverse, Controllable, and Keyphrase-Aware: A Corpus and Method for News
Multi-Headline Generation
- Authors: Dayiheng Liu, Yeyun Gong, Jie Fu, Wei Liu, Yu Yan, Bo Shao, Daxin
Jiang, Jiancheng Lv, Nan Duan
- Abstract summary: We propose generating multiple headlines with keyphrases of user interests.
The proposed method achieves state-of-the-art results in terms of quality and diversity.
- Score: 98.98411895250774
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: News headline generation aims to produce a short sentence that
attracts readers to the news. A news article often contains multiple keyphrases
that interest different users, so it can naturally have multiple reasonable
headlines. However, most existing methods focus on single-headline
generation. In this paper, we propose generating multiple headlines with
keyphrases of user interest. The main idea is to first generate multiple
keyphrases of interest to users for the news, and then generate multiple
keyphrase-relevant headlines. We propose a multi-source Transformer decoder,
which takes three sources as inputs: (a) keyphrase, (b) keyphrase-filtered
article, and (c) original article to generate keyphrase-relevant, high-quality,
and diverse headlines. Furthermore, we propose a simple and effective method to
mine the keyphrases of interest in the news article and build the first
large-scale keyphrase-aware news headline corpus, which contains over 180K
aligned triples of <news article, headline, keyphrase>. Extensive
experimental comparisons on the real-world dataset show that the proposed
method achieves state-of-the-art results in terms of quality and diversity.
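
As a rough illustration of the three decoder inputs named in the abstract, the sketch below assembles (a) a keyphrase, (b) a keyphrase-filtered article, and (c) the original article for each keyphrase. The abstract does not say how the filtered article is built, so the rule used here (keep the sentences that mention the keyphrase) is an assumption, and the names `MultiSourceInput`, `split_sentences`, `build_inputs`, and `inputs_per_keyphrase` are hypothetical. It covers input preparation only, not the Transformer decoder itself.

```python
import re
from typing import List, NamedTuple


class MultiSourceInput(NamedTuple):
    """Hypothetical container for the three sources listed in the abstract."""
    keyphrase: str          # (a) keyphrase
    filtered_article: str   # (b) keyphrase-filtered article
    article: str            # (c) original article


def split_sentences(text: str) -> List[str]:
    # Naive splitter: break after '.', '!' or '?' when followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def build_inputs(article: str, keyphrase: str) -> MultiSourceInput:
    # Assumption: "keyphrase-filtered" means keeping only the sentences that
    # contain the keyphrase (case-insensitive substring match).
    needle = keyphrase.lower()
    kept = [s for s in split_sentences(article) if needle in s.lower()]
    return MultiSourceInput(keyphrase, " ".join(kept), article)


def inputs_per_keyphrase(article: str, keyphrases: List[str]) -> List[MultiSourceInput]:
    # One input triple per keyphrase, so one headline can be generated for each.
    return [build_inputs(article, k) for k in keyphrases]


if __name__ == "__main__":
    demo = (
        "Apple unveiled a new phone. Shares of Apple rose today. "
        "Samsung responded with a tablet."
    )
    for item in inputs_per_keyphrase(demo, ["Apple", "Samsung"]):
        print(item.keyphrase, "->", item.filtered_article)
```

Producing one input triple per keyphrase mirrors the two-stage idea in the abstract: keyphrases come first, and each one then conditions its own headline.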
Related papers
- Data Augmentation for Low-Resource Keyphrase Generation [46.52115499306222]
Keyphrase generation is the task of summarizing the contents of any given article into a few salient phrases (or keyphrases).
Existing works for the task mostly rely on large-scale annotated datasets, which are not easy to acquire.
We present data augmentation strategies specifically to address keyphrase generation in purely resource-constrained domains.
arXiv Detail & Related papers (2023-05-29T09:20:34Z)
- Retrieval-Augmented Multilingual Keyphrase Generation with Retriever-Generator Iterative Training [66.64843711515341]
Keyphrase generation is the task of automatically predicting keyphrases given a piece of long text.
We call attention to a new setting named multilingual keyphrase generation.
We propose a retrieval-augmented method for multilingual keyphrase generation to mitigate the data shortage problem in non-English languages.
arXiv Detail & Related papers (2022-05-21T00:45:21Z)
- Keyphrase Generation Beyond the Boundaries of Title and Abstract [28.56508031460787]
Keyphrase generation aims at generating phrases (keyphrases) that best describe a given document.
In this work, we explore whether the integration of additional data from semantically similar articles or from the full text of the given article can be helpful for a neural keyphrase generation model.
We discover that adding sentences from the full text, particularly in the form of a summary of the article, can significantly improve the generation of both types of keyphrases.
arXiv Detail & Related papers (2021-12-13T16:33:01Z)
- Deep Keyphrase Completion [59.0413813332449]
Keyphrases provide accurate information about document content that is highly compact, concise, and full of meaning, and they are widely used for discourse comprehension, organization, and text retrieval.
We propose keyphrase completion (KPC) to generate more keyphrases for a document (e.g. a scientific publication), taking advantage of the document content along with a very limited number of known keyphrases.
We name it deep keyphrase completion (DKPC) since it attempts to capture the deep semantic meaning of the document content together with known keyphrases via a deep learning framework.
arXiv Detail & Related papers (2021-10-29T07:15:35Z)
- Pushing Paraphrase Away from Original Sentence: A Multi-Round Paraphrase Generation Approach [97.38622477085188]
We propose BTmPG (Back-Translation guided multi-round Paraphrase Generation) to improve diversity of paraphrase.
We evaluate BTmPG on two benchmark datasets.
arXiv Detail & Related papers (2021-09-04T13:12:01Z)
- DeepTitle -- Leveraging BERT to generate Search Engine Optimized Headlines [0.0]
We showcase how a pre-trained language model can be leveraged to create an abstractive news headline generator for the German language.
We incorporate state-of-the-art fine-tuning techniques for abstractive text summarization, i.e. we use different optimizers for the encoder and decoder.
We conduct experiments on a German news data set and achieve a ROUGE-L-gram F-score of 40.02.
arXiv Detail & Related papers (2021-07-22T21:32:54Z)
- Persian Keyphrase Generation Using Sequence-to-Sequence Models [1.192436948211501]
Keyphrases are a summary of an input text and provide the main subjects discussed in the text.
In this paper, we try to tackle the problem of keyphrase generation and extraction from news articles using deep sequence-to-sequence models.
arXiv Detail & Related papers (2020-09-25T14:40:14Z)
- PerKey: A Persian News Corpus for Keyphrase Extraction and Generation [1.192436948211501]
PerKey is a corpus of 553k news articles from six Persian news websites and agencies with relatively high-quality, author-extracted keyphrases.
The data underwent human assessment to ensure the quality of the keyphrases.
arXiv Detail & Related papers (2020-09-25T14:36:41Z)
- Exclusive Hierarchical Decoding for Deep Keyphrase Generation [63.357895318562214]
Keyphrase generation (KG) aims to summarize the main ideas of a document into a set of keyphrases.
Previous work in this setting employs a sequential decoding process to generate keyphrases.
We propose an exclusive hierarchical decoding framework that includes a hierarchical decoding process and either a soft or a hard exclusion mechanism.
arXiv Detail & Related papers (2020-04-18T02:58:00Z)