Related papers: Rethinking Model Selection and Decoding for Keyphrase Generation with Pre-trained Sequence-to-Sequence Models

URL: http://arxiv.org/abs/2310.06374v2
Date: Sun, 22 Oct 2023 08:37:43 GMT
Title: Rethinking Model Selection and Decoding for Keyphrase Generation with Pre-trained Sequence-to-Sequence Models
Authors: Di Wu, Wasi Uddin Ahmad, Kai-Wei Chang
Abstract summary: Keyphrase Generation (KPG) is a longstanding task in NLP with widespread applications. Seq2seq pre-trained language models (PLMs) have ushered in a transformative era for KPG, yielding promising performance improvements. This paper undertakes a systematic analysis of the influence of model selection and decoding strategies on PLM-based KPG.
Score: 76.52997424694767
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Keyphrase Generation (KPG) is a longstanding task in NLP with widespread applications. The advent of sequence-to-sequence (seq2seq) pre-trained language models (PLMs) has ushered in a transformative era for KPG, yielding promising performance improvements. However, many design decisions remain unexplored and are often made arbitrarily. This paper undertakes a systematic analysis of the influence of model selection and decoding strategies on PLM-based KPG. We begin by elucidating why seq2seq PLMs are apt for KPG, anchored by an attention-driven hypothesis. We then establish that conventional wisdom for selecting seq2seq PLMs lacks depth: (1) merely increasing model size or performing task-specific adaptation is not parameter-efficient; (2) although combining in-domain pre-training with task adaptation benefits KPG, it does partially hinder generalization. Regarding decoding, we demonstrate that while greedy search achieves strong F1 scores, it lags in recall compared with sampling-based methods. Based on these insights, we propose DeSel, a likelihood-based decode-select algorithm for seq2seq PLMs. DeSel improves greedy search by an average of 4.7% semantic F1 across five datasets. Our collective findings pave the way for deeper future investigations into PLM-based KPG.

Related papers

Deep Reinforcement Learning Algorithms for Option Hedging [0.20482269513546458]
We compare the performance of eight Deep Reinforcement Learning (DRL) algorithms in the context of dynamic hedging. MCPG is the only algorithm to outperform the Black-Scholes delta hedge baseline with the allotted computational budget.
arXiv Detail & Related papers (2025-04-07T21:32:14Z)
Effective Instruction Parsing Plugin for Complex Logical Query Answering on Knowledge Graphs [51.33342412699939]
Knowledge Graph Query Embedding (KGQE) aims to embed First-Order Logic (FOL) queries in a low-dimensional KG space for complex reasoning over incomplete KGs. Recent studies integrate various external information (such as entity types and relation context) to better capture the logical semantics of FOL queries. We propose an effective Query Instruction Parsing Plugin (QIPP) that captures latent query patterns from code-like query instructions.
arXiv Detail & Related papers (2024-10-27T03:18:52Z)
One2set + Large Language Model: Best Partners for Keyphrase Generation [42.969689556605005]
Keyphrase generation (KPG) aims to automatically generate a collection of phrases representing the core concepts of a given document. We introduce a generate-then-select framework decomposing KPG into two steps, where we adopt a one2set-based model as generator to produce candidates and then use an LLM as selector to select keyphrases from these candidates. Our framework significantly surpasses state-of-the-art models, especially in absent keyphrase prediction.
arXiv Detail & Related papers (2024-10-04T13:31:09Z)
Graph-Structured Speculative Decoding [52.94367724136063]
Speculative decoding has emerged as a promising technique to accelerate the inference of Large Language Models. We introduce an innovative approach utilizing a directed acyclic graph (DAG) to manage the drafted hypotheses. We observe a remarkable speedup of 1.73$\times$ to 1.96$\times$, significantly surpassing standard speculative decoding.
arXiv Detail & Related papers (2024-07-23T06:21:24Z)
On Leveraging Encoder-only Pre-trained Language Models for Effective Keyphrase Generation [76.52997424694767]
This study addresses the application of encoder-only Pre-trained Language Models (PLMs) in keyphrase generation (KPG). With encoder-only PLMs, although KPE with Conditional Random Fields slightly excels in identifying present keyphrases, the KPG formulation renders a broader spectrum of keyphrase predictions. We also identify a favorable parameter allocation towards model depth rather than width when employing encoder-decoder architectures with encoder-only PLMs.
arXiv Detail & Related papers (2024-02-21T18:57:54Z)
ReasoningLM: Enabling Structural Subgraph Reasoning in Pre-trained Language Models for Question Answering over Knowledge Graph [142.42275983201978]
We propose a subgraph-aware self-attention mechanism to imitate the GNN for performing structured reasoning. We also adopt an adaptation tuning strategy to adapt the model parameters with 20,000 subgraphs with synthesized questions. Experiments show that ReasoningLM surpasses state-of-the-art models by a large margin, even with fewer updated parameters and less training data.
arXiv Detail & Related papers (2023-12-30T07:18:54Z)
Ranking-based Adaptive Query Generation for DETRs in Crowded Pedestrian Detection [49.27380156754935]
We find that the number of DETRs' queries must be adjusted manually; otherwise, performance degrades to varying degrees. We propose Rank-based Adaptive Query Generation (RAQG) to alleviate the problem. Our method is simple and effective, and in theory it can be plugged into any DETR to make it query-adaptive.
arXiv Detail & Related papers (2023-10-24T11:00:56Z)
Stopping Criteria for Value Iteration on Stochastic Games with Quantitative Objectives [0.0]
A classic solution technique for Markov decision processes (MDP) and stochastic games (SG) is value iteration (VI). In this paper, we provide the first stopping criteria for VI on SG with total reward and mean payoff, yielding the first anytime algorithms in these settings.
arXiv Detail & Related papers (2023-04-19T19:09:55Z)
An Empirical Study of Pre-trained Language Models in Simple Knowledge Graph Question Answering [28.31377197194905]
Large-scale pre-trained language models (PLMs) have recently achieved great success and become a milestone in natural language processing (NLP). In recent works on knowledge graph question answering (KGQA), BERT or its variants have become necessary in their KGQA models. We compare the performance of different PLMs in KGQA and present three benchmarks for larger-scale KGs.
arXiv Detail & Related papers (2023-03-18T08:57:09Z)

This list is automatically generated from the titles and abstracts of the papers on this site.