Neural Keyphrase Generation: Analysis and Evaluation
- URL: http://arxiv.org/abs/2304.13883v1
- Date: Thu, 27 Apr 2023 00:10:21 GMT
- Title: Neural Keyphrase Generation: Analysis and Evaluation
- Authors: Tuhin Kundu, Jishnu Ray Chowdhury, Cornelia Caragea
- Abstract summary: We study various tendencies exhibited by three strong models: T5 (based on a pre-trained transformer), CatSeq-Transformer (a non-pretrained Transformer), and ExHiRD (based on a recurrent neural network)
We propose a novel metric framework, SoftKeyScore, to evaluate the similarity between two sets of keyphrases.
- Score: 47.004575377472285
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Keyphrase generation aims at generating topical phrases from a given text
either by copying from the original text (present keyphrases) or by producing
new keyphrases (absent keyphrases) that capture the semantic meaning of the
text. Encoder-decoder models are most widely used for this task because of
their capabilities for absent keyphrase generation. However, there has been
little to no analysis on the performance and behavior of such models for
keyphrase generation. In this paper, we study various tendencies exhibited by
three strong models: T5 (based on a pre-trained transformer),
CatSeq-Transformer (a non-pretrained Transformer), and ExHiRD (based on a
recurrent neural network). We analyze prediction confidence scores, model
calibration, and the effect of token position on keyphrases generation.
Moreover, we motivate and propose a novel metric framework, SoftKeyScore, to
evaluate the similarity between two sets of keyphrases by using softscores to
account for partial matching and semantic similarity. We find that SoftKeyScore
is more suitable than the standard F1 metric for evaluating two sets of given
keyphrases.
Related papers
- SimCKP: Simple Contrastive Learning of Keyphrase Representations [36.88517357720033]
We propose SimCKP, a simple contrastive learning framework that consists of two stages: 1) An extractor-generator that extracts keyphrases by learning context-aware phrase-level representations in a contrastive manner while also generating keyphrases that do not appear in the document; and 2) A reranker that adapts scores for each generated phrase by likewise aligning their representations with the corresponding document.
arXiv Detail & Related papers (2023-10-12T11:11:54Z) - Towards Better Multi-modal Keyphrase Generation via Visual Entity
Enhancement and Multi-granularity Image Noise Filtering [79.44443231700201]
Multi-modal keyphrase generation aims to produce a set of keyphrases that represent the core points of the input text-image pair.
The input text and image are often not perfectly matched, and thus the image may introduce noise into the model.
We propose a novel multi-modal keyphrase generation model, which not only enriches the model input with external knowledge, but also effectively filters image noise.
arXiv Detail & Related papers (2023-09-09T09:41:36Z) - Applying Transformer-based Text Summarization for Keyphrase Generation [2.28438857884398]
Keyphrases are crucial for searching and systematizing scholarly documents.
In this paper, we experiment with popular transformer-based models for abstractive text summarization.
We show that summarization models are quite effective in generating keyphrases in the terms of the full-match F1-score and BERT.Score.
We also investigate several ordering strategies to target keyphrases.
arXiv Detail & Related papers (2022-09-08T13:01:52Z) - Deep Keyphrase Completion [59.0413813332449]
Keyphrase provides accurate information of document content that is highly compact, concise, full of meanings, and widely used for discourse comprehension, organization, and text retrieval.
We propose textitkeyphrase completion (KPC) to generate more keyphrases for document (e.g. scientific publication) taking advantage of document content along with a very limited number of known keyphrases.
We name it textitdeep keyphrase completion (DKPC) since it attempts to capture the deep semantic meaning of the document content together with known keyphrases via a deep learning framework
arXiv Detail & Related papers (2021-10-29T07:15:35Z) - Keyphrase Generation for Scientific Document Retrieval [28.22174864849121]
This study provides empirical evidence that Sequence-to-sequence models can significantly improve document retrieval performance.
We introduce a new extrinsic evaluation framework that allows for a better understanding of the limitations of keyphrase generation models.
arXiv Detail & Related papers (2021-06-28T13:55:49Z) - Keyphrase Generation with Fine-Grained Evaluation-Guided Reinforcement
Learning [30.09715149060206]
Keyphrase Generation (KG) is a classical task for capturing the central idea from a given document.
In this paper, we propose a new fine-grained evaluation metric that considers different granularity.
For learning more recessive linguistic patterns, we use a pre-trained model (e.g., BERT) to compute the continuous similarity score between predicted keyphrases and target keyphrases.
arXiv Detail & Related papers (2021-04-18T10:13:46Z) - Keyphrase Extraction with Dynamic Graph Convolutional Networks and
Diversified Inference [50.768682650658384]
Keyphrase extraction (KE) aims to summarize a set of phrases that accurately express a concept or a topic covered in a given document.
Recent Sequence-to-Sequence (Seq2Seq) based generative framework is widely used in KE task, and it has obtained competitive performance on various benchmarks.
In this paper, we propose to adopt the Dynamic Graph Convolutional Networks (DGCN) to solve the above two problems simultaneously.
arXiv Detail & Related papers (2020-10-24T08:11:23Z) - Select, Extract and Generate: Neural Keyphrase Generation with
Layer-wise Coverage Attention [75.44523978180317]
We propose emphSEG-Net, a neural keyphrase generation model that is composed of two major components.
The experimental results on seven keyphrase generation benchmarks from scientific and web documents demonstrate that SEG-Net outperforms the state-of-the-art neural generative methods by a large margin.
arXiv Detail & Related papers (2020-08-04T18:00:07Z) - Exclusive Hierarchical Decoding for Deep Keyphrase Generation [63.357895318562214]
Keyphrase generation (KG) aims to summarize the main ideas of a document into a set of keyphrases.
Previous work in this setting employs a sequential decoding process to generate keyphrases.
We propose an exclusive hierarchical decoding framework that includes a hierarchical decoding process and either a soft or a hard exclusion mechanism.
arXiv Detail & Related papers (2020-04-18T02:58:00Z) - Keyphrase Extraction with Span-based Feature Representations [13.790461555410747]
Keyphrases are capable of providing semantic metadata characterizing documents.
Three approaches to address keyphrase extraction: (i) traditional two-step ranking method, (ii) sequence labeling and (iii) generation using neural networks.
In this paper, we propose a novelty Span Keyphrase Extraction model that extracts span-based feature representation of keyphrase directly from all the content tokens.
arXiv Detail & Related papers (2020-02-13T09:48:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.