Enhancing Keyphrase Generation by BART Finetuning with Splitting and
Shuffling
- URL: http://arxiv.org/abs/2309.06726v1
- Date: Wed, 13 Sep 2023 05:02:11 GMT
- Title: Enhancing Keyphrase Generation by BART Finetuning with Splitting and
Shuffling
- Authors: Bin Chen, Mizuho Iwaihara
- Abstract summary: Keyphrase generation is the task of identifying a set of phrases that best represent the main topics or themes of a text.
We propose Keyphrase-Focused BART, which exploits the differences between present and absent keyphrase generation.
For absent keyphrases, our Keyphrase-Focused BART achieved a new state-of-the-art score on F1@5 in two out of five keyphrase generation benchmark datasets.
- Score: 6.276370570422467
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Keyphrase generation is the task of identifying a set of phrases that best
represent the main topics or themes of a given text. Keyphrases are divided
into present and absent keyphrases. Recent approaches utilizing
sequence-to-sequence models show effectiveness on absent keyphrase generation.
However, the performance is still limited due to the difficulty of finding
absent keyphrases. In this paper, we propose Keyphrase-Focused BART, which
exploits the differences between present and absent keyphrase generation, and
performs fine-tuning of two separate BART models for present and absent
keyphrases. We further show effective approaches for shuffling keyphrases and
ranking candidate keyphrases. For absent keyphrases, our Keyphrase-Focused BART
achieved a new state-of-the-art score on F1@5 in two out of five keyphrase
generation benchmark datasets.
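The split into present and absent keyphrases, and the shuffling of keyphrase order, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the matching rule, separator, or number of shuffled copies, so the token-level matching, the " ; " separator, and the helper names `split_keyphrases` and `shuffled_targets` are all assumptions made for illustration.

```python
import random
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def contains_sequence(doc_tokens, phrase_tokens):
    """Return True if phrase_tokens occurs as a contiguous run in doc_tokens."""
    n = len(phrase_tokens)
    if n == 0:
        return False
    return any(
        doc_tokens[i:i + n] == phrase_tokens
        for i in range(len(doc_tokens) - n + 1)
    )


def split_keyphrases(document, keyphrases):
    """Split keyphrases into (present, absent) lists.

    Assumption: a keyphrase counts as "present" when its tokens appear
    contiguously in the document. Published work often stems tokens first;
    that step is omitted here for brevity.
    """
    doc_tokens = tokenize(document)
    present, absent = [], []
    for kp in keyphrases:
        if contains_sequence(doc_tokens, tokenize(kp)):
            present.append(kp)
        else:
            absent.append(kp)
    return present, absent


def shuffled_targets(keyphrases, num_copies, seed=0, sep=" ; "):
    """Build target strings with the keyphrases in randomly shuffled order.

    Assumption: each shuffled copy serves as one fine-tuning target for the
    same source text; the separator is a hypothetical choice.
    """
    rng = random.Random(seed)
    targets = []
    for _ in range(num_copies):
        order = list(keyphrases)
        rng.shuffle(order)
        targets.append(sep.join(order))
    return targets


if __name__ == "__main__":
    doc = "Keyphrase generation with BART fine-tuning on scientific abstracts."
    kps = ["keyphrase generation", "BART", "sequence-to-sequence", "text summarization"]
    present, absent = split_keyphrases(doc, kps)
    print("present:", present)
    print("absent:", absent)
    print(shuffled_targets(absent, num_copies=2))
```

Under this sketch, the present and absent lists would each feed a separately fine-tuned model, mirroring the two-model setup the abstract describes.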
Related papers
- MetaKP: On-Demand Keyphrase Generation [52.48698290354449]
We introduce on-demand keyphrase generation, a novel paradigm that requires keyphrases that conform to specific high-level goals or intents.
We present MetaKP, a large-scale benchmark comprising four datasets, 7500 documents, and 3760 goals across news and biomedical domains with human-annotated keyphrases.
We demonstrate the potential of our method to serve as a general NLP infrastructure, exemplified by its application in epidemic event detection from social media.
arXiv Detail & Related papers (2024-06-28T19:02:59Z)
- SimCKP: Simple Contrastive Learning of Keyphrase Representations [36.88517357720033]
We propose SimCKP, a simple contrastive learning framework that consists of two stages: 1) An extractor-generator that extracts keyphrases by learning context-aware phrase-level representations in a contrastive manner while also generating keyphrases that do not appear in the document; and 2) A reranker that adapts scores for each generated phrase by likewise aligning their representations with the corresponding document.
arXiv Detail & Related papers (2023-10-12T11:11:54Z)
- Applying Transformer-based Text Summarization for Keyphrase Generation [2.28438857884398]
Keyphrases are crucial for searching and systematizing scholarly documents.
In this paper, we experiment with popular transformer-based models for abstractive text summarization.
We show that summarization models are quite effective in generating keyphrases in terms of the full-match F1-score and BERTScore.
We also investigate several ordering strategies to target keyphrases.
arXiv Detail & Related papers (2022-09-08T13:01:52Z)
- Retrieval-Augmented Multilingual Keyphrase Generation with
Retriever-Generator Iterative Training [66.64843711515341]
Keyphrase generation is the task of automatically predicting keyphrases given a piece of long text.
We call attention to a new setting named multilingual keyphrase generation.
We propose a retrieval-augmented method for multilingual keyphrase generation to mitigate the data shortage problem in non-English languages.
arXiv Detail & Related papers (2022-05-21T00:45:21Z)
- Representation Learning for Resource-Constrained Keyphrase Generation [78.02577815973764]
We introduce salient span recovery and salient span prediction as guided denoising language modeling objectives.
We show the effectiveness of the proposed approach for low-resource and zero-shot keyphrase generation.
arXiv Detail & Related papers (2022-03-15T17:48:04Z)
- KPDrop: An Approach to Improving Absent Keyphrase Generation [26.563045686728135]
Keyphrase generation is the task of generating phrases (keyphrases) that summarize the main topics of a given document.
We propose an approach, called keyphrase dropout (or KPDrop) to improve absent keyphrase generation.
arXiv Detail & Related papers (2021-12-02T18:25:56Z)
- Deep Keyphrase Completion [59.0413813332449]
Keyphrase provides accurate information of document content that is highly compact, concise, full of meanings, and widely used for discourse comprehension, organization, and text retrieval.
We propose keyphrase completion (KPC) to generate more keyphrases for a document (e.g., a scientific publication), taking advantage of the document content along with a very limited number of known keyphrases.
We name it deep keyphrase completion (DKPC) since it attempts to capture the deep semantic meaning of the document content together with known keyphrases via a deep learning framework.
arXiv Detail & Related papers (2021-10-29T07:15:35Z)
- Redefining Absent Keyphrases and their Effect on Retrieval Effectiveness [9.13755431537592]
We discuss the usefulness of absent keyphrases from an Information Retrieval perspective.
We introduce a finer-grained categorization scheme that sheds more light on the impact of absent keyphrases on scientific document retrieval.
arXiv Detail & Related papers (2021-03-23T10:42:18Z)
- Keyphrase Extraction with Dynamic Graph Convolutional Networks and
Diversified Inference [50.768682650658384]
Keyphrase extraction (KE) aims to summarize a set of phrases that accurately express a concept or a topic covered in a given document.
The recent Sequence-to-Sequence (Seq2Seq) based generative framework is widely used for the KE task, and it has obtained competitive performance on various benchmarks.
In this paper, we propose to adopt the Dynamic Graph Convolutional Networks (DGCN) to solve the above two problems simultaneously.
arXiv Detail & Related papers (2020-10-24T08:11:23Z)
- Keyphrase Prediction With Pre-trained Language Model [16.06425973336514]
We propose to divide keyphrase prediction into two subtasks, i.e., present keyphrase extraction (PKE) and absent keyphrase generation (AKG).
For PKE, we tackle this task as a sequence labeling problem with the pre-trained language model BERT.
For AKG, we introduce a Transformer-based architecture, which fully integrates the present keyphrase knowledge learned from PKE by the fine-tuned BERT.
arXiv Detail & Related papers (2020-04-22T09:35:02Z)
- Exclusive Hierarchical Decoding for Deep Keyphrase Generation [63.357895318562214]
Keyphrase generation (KG) aims to summarize the main ideas of a document into a set of keyphrases.
Previous work in this setting employs a sequential decoding process to generate keyphrases.
We propose an exclusive hierarchical decoding framework that includes a hierarchical decoding process and either a soft or a hard exclusion mechanism.
arXiv Detail & Related papers (2020-04-18T02:58:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.