An Analysis of Datasets, Metrics and Models in Keyphrase Generation
- URL: http://arxiv.org/abs/2506.10346v1
- Date: Thu, 12 Jun 2025 04:54:44 GMT
- Title: An Analysis of Datasets, Metrics and Models in Keyphrase Generation
- Authors: Florian Boudin, Akiko Aizawa
- Abstract summary: Keyphrase generation refers to the task of producing a set of words or phrases that summarise a document. We present an analysis of over 50 research papers on keyphrase generation, offering a comprehensive overview of recent progress, limitations, and open challenges.
- Score: 33.04325179283727
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Keyphrase generation refers to the task of producing a set of words or phrases that summarises the content of a document. Continuous efforts have been dedicated to this task over the past few years, spreading across multiple lines of research, such as model architectures, data resources, and use-case scenarios. Yet, the current state of keyphrase generation remains unknown as there has been no attempt to review and analyse previous work. In this paper, we bridge this gap by presenting an analysis of over 50 research papers on keyphrase generation, offering a comprehensive overview of recent progress, limitations, and open challenges. Our findings highlight several critical issues in current evaluation practices, such as the concerning similarity among commonly-used benchmark datasets and inconsistencies in metric calculations leading to overestimated performances. Additionally, we address the limited availability of pre-trained models by releasing a strong PLM-based model for keyphrase generation as an effort to facilitate future research.
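To make the abstract's point about inconsistent metric calculations concrete, the following is a minimal sketch of F1@k scoring for keyphrase generation. It is an illustration only, not the paper's evaluation code: the function names are hypothetical, matching is simplified to lowercased exact match, and the two precision conventions shown (fixed denominator of k vs. number of actual predictions) are one commonly cited source of score inflation.

```python
# Minimal sketch of F1@k for keyphrase evaluation (hypothetical helper, not from the paper).
# Illustrates one inconsistency: whether the precision denominator is fixed at k
# (stricter, "padded" convention) or set to the number of actual predictions (lenient).

def f1_at_k(predicted, gold, k=5, pad_to_k=True):
    """Compute F1@k over lowercased, whitespace-normalized keyphrases."""
    preds = [p.lower().strip() for p in predicted[:k]]
    golds = {g.lower().strip() for g in gold}
    matches = sum(1 for p in preds if p in golds)
    # Precision denominator: either k (padded) or the number of actual predictions.
    denom = k if pad_to_k else max(len(preds), 1)
    precision = matches / denom
    recall = matches / max(len(golds), 1)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Example: 3 predictions, 4 gold keyphrases, k = 5.
pred = ["neural networks", "keyphrase generation", "transformers"]
gold = ["keyphrase generation", "neural networks", "evaluation", "benchmarks"]
print(f1_at_k(pred, gold, k=5, pad_to_k=True))   # precision = 2/5, F1 ~ 0.44
print(f1_at_k(pred, gold, k=5, pad_to_k=False))  # precision = 2/3, F1 ~ 0.57
```

With the same predictions, the lenient convention yields a noticeably higher score, which is the kind of overestimation the survey attributes to inconsistent metric implementations.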
Related papers
- ERU-KG: Efficient Reference-aligned Unsupervised Keyphrase Generation [21.10770048637475]
We propose ERU-KG, an unsupervised keyphrase generation (UKG) model that consists of an informativeness and a phraseness module. ERU-KG demonstrates its effectiveness on keyphrase generation benchmarks by outperforming unsupervised baselines and achieving on average 89% of the performance of a supervised model for top 10 predictions.
arXiv Detail & Related papers (2025-05-30T05:09:53Z) - Context is Key: A Benchmark for Forecasting with Essential Textual Information [87.3175915185287]
"Context is Key" (CiK) is a forecasting benchmark that pairs numerical data with diverse types of carefully crafted textual context.<n>We evaluate a range of approaches, including statistical models, time series foundation models, and LLM-based forecasters.<n>We propose a simple yet effective LLM prompting method that outperforms all other tested methods on our benchmark.
arXiv Detail & Related papers (2024-10-24T17:56:08Z) - Quantifying Contamination in Evaluating Code Generation Capabilities of Language Models [27.24738197172374]
Large language models have achieved remarkable performance on various code generation benchmarks.
There have been growing concerns regarding potential contamination of these benchmarks as they may be leaked into pretraining and finetuning data.
We show that there is substantial overlap between popular code generation benchmarks and open training corpora, and that models perform significantly better on the subset of the benchmarks where similar solutions are seen during training.
arXiv Detail & Related papers (2024-03-06T21:45:35Z) - Retrieval is Accurate Generation [99.24267226311157]
We introduce a novel method that selects context-aware phrases from a collection of supporting documents.
Our model achieves the best performance and the lowest latency among several retrieval-augmented baselines.
arXiv Detail & Related papers (2024-02-27T14:16:19Z) - Exploring Precision and Recall to assess the quality and diversity of LLMs [82.21278402856079]
We introduce a novel evaluation framework for Large Language Models (LLMs) such as Llama-2 and Mistral.
This approach allows for a nuanced assessment of the quality and diversity of generated text without the need for aligned corpora.
arXiv Detail & Related papers (2024-02-16T13:53:26Z) - Peek Across: Improving Multi-Document Modeling via Cross-Document Question-Answering [49.85790367128085]
We pre-train a generic multi-document model with a novel cross-document question answering pre-training objective.
This novel multi-document QA formulation directs the model to better recover cross-text informational relations.
Unlike prior multi-document models that focus on either classification or summarization tasks, our pre-training objective formulation enables the model to perform tasks that involve both short text generation and long text generation.
arXiv Detail & Related papers (2023-05-24T17:48:40Z) - From Statistical Methods to Deep Learning, Automatic Keyphrase Prediction: A Survey [44.83902003341381]
Keyphrase prediction aims to generate phrases (keyphrases) that concisely summarize a given document.
Recently, researchers have conducted in-depth studies on this task from various perspectives.
Our work analyzes up to 167 previous works, achieving greater coverage of this task than previous surveys.
arXiv Detail & Related papers (2023-05-04T06:22:50Z) - Next-Year Bankruptcy Prediction from Textual Data: Benchmark and Baselines [10.944533132358439]
Models for bankruptcy prediction are useful in several real-world scenarios.
The lack of a common benchmark dataset and evaluation strategy impedes the objective comparison between models.
This paper introduces such a benchmark for the unstructured data scenario, based on novel and established datasets.
arXiv Detail & Related papers (2022-08-24T07:11:49Z) - Representation Learning for Resource-Constrained Keyphrase Generation [78.02577815973764]
We introduce salient span recovery and salient span prediction as guided denoising language modeling objectives.
We show the effectiveness of the proposed approach for low-resource and zero-shot keyphrase generation.
arXiv Detail & Related papers (2022-03-15T17:48:04Z) - Keyphrase Generation for Scientific Document Retrieval [28.22174864849121]
This study provides empirical evidence that sequence-to-sequence models can significantly improve document retrieval performance.
We introduce a new extrinsic evaluation framework that allows for a better understanding of the limitations of keyphrase generation models.
arXiv Detail & Related papers (2021-06-28T13:55:49Z) - Author Clustering and Topic Estimation for Short Texts [69.54017251622211]
We propose a novel model that expands on Latent Dirichlet Allocation by modeling strong dependence among the words in the same document.
We also simultaneously cluster users, removing the need for post-hoc cluster estimation.
Our method performs as well as, or better than, traditional approaches to problems arising in short text.
arXiv Detail & Related papers (2021-06-15T20:55:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.