KPDrop: An Approach to Improving Absent Keyphrase Generation
- URL: http://arxiv.org/abs/2112.01476v1
- Date: Thu, 2 Dec 2021 18:25:56 GMT
- Title: KPDrop: An Approach to Improving Absent Keyphrase Generation
- Authors: Seoyeon Park, Jishnu Ray Chowdhury, Tuhin Kundu, Cornelia Caragea
- Abstract summary: Keyphrase generation is the task of generating phrases (keyphrases) that summarize the main topics of a given document.
We propose an approach, called keyphrase dropout (or KPDrop), to improve absent keyphrase generation.
- Score: 26.563045686728135
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Keyphrase generation is the task of generating phrases (keyphrases) that
summarize the main topics of a given document. The generated keyphrases can be
either present or absent from the text of the given document. While the
extraction of present keyphrases has received much attention in the past, only
recently has a stronger focus been placed on the generation of absent
keyphrases. However, generating absent keyphrases is very challenging; even the
best methods show only a modest degree of success. In this paper, we propose an
approach, called keyphrase dropout (or KPDrop), to improve absent keyphrase
generation. We randomly drop present keyphrases from the document and turn them
into artificial absent keyphrases during training. We test our approach
extensively and show that it consistently improves the absent performance of
strong baselines in keyphrase generation.
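The training-time idea described in the abstract can be illustrated with a minimal sketch. The abstract only says that present keyphrases are randomly dropped from the document and treated as artificial absent keyphrases; everything else here is an assumption made for illustration, including the function name `kpdrop`, the `drop_rate` value, the use of a `<mask>` placeholder instead of plain deletion, and the case-insensitive whole-phrase matching.

```python
import random
import re


def _phrase_pattern(keyphrase):
    # Match the keyphrase as a whole phrase, case-insensitively (an assumption).
    return re.compile(r"(?<!\w)" + re.escape(keyphrase) + r"(?!\w)", re.IGNORECASE)


def kpdrop(document, present, absent, drop_rate=0.7, mask_token="<mask>", rng=None):
    """Randomly drop present keyphrases and relabel them as absent.

    drop_rate and mask_token are hypothetical parameters, not values
    taken from the paper.
    """
    rng = rng or random.Random()

    # Each present keyphrase is independently selected for dropping.
    for keyphrase in present:
        if rng.random() < drop_rate:
            document = _phrase_pattern(keyphrase).sub(mask_token, document)

    # Re-check presence afterwards, so that overlapping keyphrases removed
    # as a side effect are also relabeled as absent.
    still_present = [kp for kp in present if _phrase_pattern(kp).search(document)]
    now_absent = list(absent) + [kp for kp in present if kp not in still_present]
    return document, still_present, now_absent


if __name__ == "__main__":
    text = "Keyphrase generation with neural networks."
    new_text, pres, abs_ = kpdrop(
        text,
        present=["keyphrase generation", "neural networks"],
        absent=["deep learning"],
        rng=random.Random(0),
    )
    print(new_text, pres, abs_)
```

A sketch like this would be applied only to training examples, so the model sees more absent targets per document while evaluation data stays unchanged.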
Related papers
- Enhancing Keyphrase Generation by BART Finetuning with Splitting and
Shuffling [6.276370570422467]
Keyphrase generation is the task of identifying a set of phrases that best represent the main topics or themes of a text.
We propose Keyphrase-Focused BART, which exploits the differences between present and absent keyphrase generation.
For absent keyphrases, our Keyphrase-Focused BART achieved a new state-of-the-art F1@5 score on two out of five keyphrase generation benchmark datasets.
arXiv Detail & Related papers (2023-09-13T05:02:11Z) - Data Augmentation for Low-Resource Keyphrase Generation [46.52115499306222]
Keyphrase generation is the task of summarizing the contents of any given article into a few salient phrases (or keyphrases).
Existing works for the task mostly rely on large-scale annotated datasets, which are not easy to acquire.
We present data augmentation strategies specifically to address keyphrase generation in purely resource-constrained domains.
arXiv Detail & Related papers (2023-05-29T09:20:34Z) - Applying Transformer-based Text Summarization for Keyphrase Generation [2.28438857884398]
Keyphrases are crucial for searching and systematizing scholarly documents.
In this paper, we experiment with popular transformer-based models for abstractive text summarization.
We show that summarization models are quite effective in generating keyphrases in terms of the full-match F1-score and BERTScore.
We also investigate several ordering strategies to target keyphrases.
arXiv Detail & Related papers (2022-09-08T13:01:52Z) - Retrieval-Augmented Multilingual Keyphrase Generation with
Retriever-Generator Iterative Training [66.64843711515341]
Keyphrase generation is the task of automatically predicting keyphrases given a piece of long text.
We call attention to a new setting named multilingual keyphrase generation.
We propose a retrieval-augmented method for multilingual keyphrase generation to mitigate the data shortage problem in non-English languages.
arXiv Detail & Related papers (2022-05-21T00:45:21Z) - Representation Learning for Resource-Constrained Keyphrase Generation [78.02577815973764]
We introduce salient span recovery and salient span prediction as guided denoising language modeling objectives.
We show the effectiveness of the proposed approach for low-resource and zero-shot keyphrase generation.
arXiv Detail & Related papers (2022-03-15T17:48:04Z) - Deep Keyphrase Completion [59.0413813332449]
Keyphrases provide accurate information about document content that is highly compact, concise, and full of meaning, and they are widely used for discourse comprehension, organization, and text retrieval.
We propose keyphrase completion (KPC) to generate more keyphrases for a document (e.g., a scientific publication), taking advantage of the document content along with a very limited number of known keyphrases.
We name it deep keyphrase completion (DKPC) since it attempts to capture the deep semantic meaning of the document content together with known keyphrases via a deep learning framework.
arXiv Detail & Related papers (2021-10-29T07:15:35Z) - Towards Document-Level Paraphrase Generation with Sentence Rewriting and
Reordering [88.08581016329398]
We propose CoRPG (Coherence Relationship guided Paraphrase Generation) for document-level paraphrase generation.
We use graph GRU to encode the coherence relationship graph and get the coherence-aware representation for each sentence.
Our model can generate document paraphrase with more diversity and semantic preservation.
arXiv Detail & Related papers (2021-09-15T05:53:40Z) - Unsupervised Deep Keyphrase Generation [14.544869226959612]
Keyphrase generation aims to summarize long documents with a collection of salient phrases.
Deep neural models have demonstrated a remarkable success in this task, capable of predicting keyphrases that are even absent from a document.
We present a novel method for keyphrase generation, AutoKeyGen, without the supervision of any human annotation.
arXiv Detail & Related papers (2021-04-18T05:53:19Z) - Redefining Absent Keyphrases and their Effect on Retrieval Effectiveness [9.13755431537592]
We discuss the usefulness of absent keyphrases from an Information Retrieval perspective.
We introduce a finer-grained categorization scheme that sheds more light on the impact of absent keyphrases on scientific document retrieval.
arXiv Detail & Related papers (2021-03-23T10:42:18Z) - Persian Keyphrase Generation Using Sequence-to-Sequence Models [1.192436948211501]
Keyphrases are a summary of an input text and provide the main subjects discussed in the text.
In this paper, we try to tackle the problem of keyphrase generation and extraction from news articles using deep sequence-to-sequence models.
arXiv Detail & Related papers (2020-09-25T14:40:14Z) - Exclusive Hierarchical Decoding for Deep Keyphrase Generation [63.357895318562214]
Keyphrase generation (KG) aims to summarize the main ideas of a document into a set of keyphrases.
Previous work in this setting employs a sequential decoding process to generate keyphrases.
We propose an exclusive hierarchical decoding framework that includes a hierarchical decoding process and either a soft or a hard exclusion mechanism.
arXiv Detail & Related papers (2020-04-18T02:58:00Z)