Applying Transformer-based Text Summarization for Keyphrase Generation
- URL: http://arxiv.org/abs/2209.03791v1
- Date: Thu, 8 Sep 2022 13:01:52 GMT
- Title: Applying Transformer-based Text Summarization for Keyphrase Generation
- Authors: Anna Glazkova and Dmitry Morozov
- Abstract summary: Keyphrases are crucial for searching and systematizing scholarly documents.
In this paper, we experiment with popular transformer-based models for abstractive text summarization.
We show that summarization models are quite effective in generating keyphrases in terms of the full-match F1-score and BERTScore.
We also investigate several ordering strategies to concatenate target keyphrases.
- Score: 2.28438857884398
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Keyphrases are crucial for searching and systematizing scholarly documents.
Most current methods for keyphrase extraction are aimed at the extraction of
the most significant words in the text. But in practice, the list of keyphrases
often includes words that do not appear in the text explicitly. In this case,
the list of keyphrases represents an abstractive summary of the source text. In
this paper, we experiment with popular transformer-based models for abstractive
text summarization using four benchmark datasets for keyphrase extraction. We
compare the results obtained with the results of common unsupervised and
supervised methods for keyphrase extraction. Our evaluation shows that
summarization models are quite effective in generating keyphrases in terms
of the full-match F1-score and BERTScore. However, they produce a lot of words
that are absent in the author's list of keyphrases, which makes summarization
models ineffective in terms of ROUGE-1. We also investigate several ordering
strategies to concatenate target keyphrases. The results showed that the choice
of strategy affects the performance of keyphrase generation.
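As an illustration of the full-match F1-score mentioned in the abstract, here is a minimal Python sketch. It is not the authors' evaluation code: the function names `normalize` and `full_match_f1` are hypothetical, the example keyphrases are made up, and the normalization (lowercasing and whitespace collapsing, with no stemming) is an assumption that may differ from the paper's setup.

```python
def normalize(phrase: str) -> str:
    """Lowercase a phrase and collapse whitespace (assumed normalization, no stemming)."""
    return " ".join(phrase.lower().split())


def full_match_f1(predicted: list[str], gold: list[str]) -> float:
    """F1 over exact (full) matches between predicted and gold keyphrases."""
    # Deduplicate after normalization so repeated phrases count once.
    pred_set = {normalize(p) for p in predicted if p.strip()}
    gold_set = {normalize(g) for g in gold if g.strip()}
    if not pred_set or not gold_set:
        return 0.0
    matches = len(pred_set & gold_set)
    if matches == 0:
        return 0.0
    precision = matches / len(pred_set)
    recall = matches / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Made-up example: two of three predictions match two of four gold keyphrases.
predicted = ["Keyphrase Generation", "transformers", "text summarization"]
gold = ["keyphrase generation", "text summarization", "evaluation", "scholarly documents"]
score = full_match_f1(predicted, gold)
```

A generated phrase counts only if it matches a gold keyphrase exactly after normalization, so a partial overlap such as "keyphrase" against "keyphrase generation" scores nothing. Token-overlap metrics such as ROUGE-1 are computed differently, which is one reason the metrics named in the abstract can disagree.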
Related papers
- Enhancing Automatic Keyphrase Labelling with Text-to-Text Transfer Transformer (T5) Architecture: A Framework for Keyphrase Generation and Filtering [2.1656586298989793]
This paper presents a keyphrase generation model based on the Text-to-Text Transfer Transformer (T5) architecture.
We also present a novel keyphrase filtering technique based on the T5 architecture.
arXiv Detail & Related papers (2024-09-25T09:16:46Z)
- SimCKP: Simple Contrastive Learning of Keyphrase Representations [36.88517357720033]
We propose SimCKP, a simple contrastive learning framework that consists of two stages: 1) An extractor-generator that extracts keyphrases by learning context-aware phrase-level representations in a contrastive manner while also generating keyphrases that do not appear in the document; and 2) A reranker that adapts scores for each generated phrase by likewise aligning their representations with the corresponding document.
arXiv Detail & Related papers (2023-10-12T11:11:54Z)
- EntropyRank: Unsupervised Keyphrase Extraction via Side-Information Optimization for Language Model-based Text Compression [62.261476176242724]
We propose an unsupervised method to extract keywords and keyphrases from texts based on a pre-trained language model (LM) and Shannon's information.
Specifically, our method extracts phrases having the highest conditional entropy under the LM.
arXiv Detail & Related papers (2023-08-25T14:23:40Z)
- Neural Keyphrase Generation: Analysis and Evaluation [47.004575377472285]
We study various tendencies exhibited by three strong models: T5 (based on a pre-trained transformer), CatSeq-Transformer (a non-pretrained Transformer), and ExHiRD (based on a recurrent neural network).
We propose a novel metric framework, SoftKeyScore, to evaluate the similarity between two sets of keyphrases.
arXiv Detail & Related papers (2023-04-27T00:10:21Z)
- Representation Learning for Resource-Constrained Keyphrase Generation [78.02577815973764]
We introduce salient span recovery and salient span prediction as guided denoising language modeling objectives.
We show the effectiveness of the proposed approach for low-resource and zero-shot keyphrase generation.
arXiv Detail & Related papers (2022-03-15T17:48:04Z)
- Deep Keyphrase Completion [59.0413813332449]
Keyphrases provide accurate information about document content that is highly compact, concise, and full of meaning, and they are widely used for discourse comprehension, organization, and text retrieval.
We propose keyphrase completion (KPC) to generate more keyphrases for a document (e.g., a scientific publication), taking advantage of the document content along with a very limited number of known keyphrases.
We name it deep keyphrase completion (DKPC) since it attempts to capture the deep semantic meaning of the document content together with known keyphrases via a deep learning framework.
arXiv Detail & Related papers (2021-10-29T07:15:35Z)
- Phraseformer: Multimodal Key-phrase Extraction using Transformer and Graph Embedding [3.7110020502717616]
We develop a multimodal Key-phrase extraction approach, namely Phraseformer, using transformer and graph embedding techniques.
In Phraseformer, each keyword candidate is represented by a vector that is the concatenation of the text and structure learning representations.
We analyze the performance of Phraseformer by F1-score on three datasets: Inspec, SemEval2010, and SemEval2017.
arXiv Detail & Related papers (2021-06-09T09:32:17Z)
- Persian Keyphrase Generation Using Sequence-to-Sequence Models [1.192436948211501]
Keyphrases are a summary of an input text and provide the main subjects discussed in the text.
In this paper, we try to tackle the problem of keyphrase generation and extraction from news articles using deep sequence-to-sequence models.
arXiv Detail & Related papers (2020-09-25T14:40:14Z)
- Select, Extract and Generate: Neural Keyphrase Generation with Layer-wise Coverage Attention [75.44523978180317]
We propose SEG-Net, a neural keyphrase generation model that is composed of two major components.
The experimental results on seven keyphrase generation benchmarks from scientific and web documents demonstrate that SEG-Net outperforms the state-of-the-art neural generative methods by a large margin.
arXiv Detail & Related papers (2020-08-04T18:00:07Z)
- Exclusive Hierarchical Decoding for Deep Keyphrase Generation [63.357895318562214]
Keyphrase generation (KG) aims to summarize the main ideas of a document into a set of keyphrases.
Previous work in this setting employs a sequential decoding process to generate keyphrases.
We propose an exclusive hierarchical decoding framework that includes a hierarchical decoding process and either a soft or a hard exclusion mechanism.
arXiv Detail & Related papers (2020-04-18T02:58:00Z)
- Keyphrase Extraction with Span-based Feature Representations [13.790461555410747]
Keyphrases are capable of providing semantic metadata characterizing documents.
There are three approaches to keyphrase extraction: (i) the traditional two-step ranking method, (ii) sequence labeling, and (iii) generation using neural networks.
In this paper, we propose a novel Span Keyphrase Extraction model that extracts span-based feature representations of keyphrases directly from all the content tokens.
arXiv Detail & Related papers (2020-02-13T09:48:31Z)