Rethinking Model Selection and Decoding for Keyphrase Generation with
Pre-trained Sequence-to-Sequence Models
- URL: http://arxiv.org/abs/2310.06374v2
- Date: Sun, 22 Oct 2023 08:37:43 GMT
- Title: Rethinking Model Selection and Decoding for Keyphrase Generation with
Pre-trained Sequence-to-Sequence Models
- Authors: Di Wu, Wasi Uddin Ahmad, Kai-Wei Chang
- Abstract summary: Keyphrase Generation (KPG) is a longstanding task in NLP with widespread applications.
Seq2seq pre-trained language models (PLMs) have ushered in a transformative era for KPG, yielding promising performance improvements.
This paper undertakes a systematic analysis of the influence of model selection and decoding strategies on PLM-based KPG.
- Score: 76.52997424694767
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Keyphrase Generation (KPG) is a longstanding task in NLP with widespread
applications. The advent of sequence-to-sequence (seq2seq) pre-trained language
models (PLMs) has ushered in a transformative era for KPG, yielding promising
performance improvements. However, many design decisions remain unexplored and
are often made arbitrarily. This paper undertakes a systematic analysis of the
influence of model selection and decoding strategies on PLM-based KPG. We begin
by elucidating why seq2seq PLMs are apt for KPG, anchored by an
attention-driven hypothesis. We then establish that conventional wisdom for
selecting seq2seq PLMs lacks depth: (1) merely increasing model size or
performing task-specific adaptation is not parameter-efficient; (2) although
combining in-domain pre-training with task adaptation benefits KPG, it does
partially hinder generalization. Regarding decoding, we demonstrate that while
greedy search achieves strong F1 scores, it lags in recall compared with
sampling-based methods. Based on these insights, we propose DeSel, a
likelihood-based decode-select algorithm for seq2seq PLMs. DeSel improves
greedy search by an average of 4.7% semantic F1 across five datasets. Our
collective findings pave the way for deeper future investigations into
PLM-based KPG.
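The abstract describes DeSel only as a "likelihood-based decode-select algorithm", so the following is a minimal sketch of that idea rather than the authors' exact procedure: take the high-precision greedy output, draw extra candidates by sampling (which the paper shows improves recall), and keep candidates whose average log-likelihood is close to the greedy one. The helper names, the `margin` threshold, and the assumption of a Hugging Face seq2seq checkpoint are illustrative.

```python
import torch

def desel_like_decode(model, tokenizer, document, num_samples=10, margin=0.3):
    """Likelihood-based decode-select sketch (illustrative, not the exact DeSel)."""
    inputs = tokenizer(document, return_tensors="pt", truncation=True)

    # 1) Greedy decoding: strong F1 / precision, but misses some gold keyphrases.
    greedy_ids = model.generate(**inputs, do_sample=False, max_new_tokens=64)

    # 2) Sampling: recovers additional candidates (higher recall, lower precision).
    sampled_ids = model.generate(
        **inputs, do_sample=True, top_p=0.95,
        num_return_sequences=num_samples, max_new_tokens=64,
    )

    def avg_logprob(ids):
        # Average per-token log-likelihood of a generated sequence under the model.
        labels = ids.clone().unsqueeze(0)
        labels[labels == tokenizer.pad_token_id] = -100  # ignore padding
        with torch.no_grad():
            loss = model(**inputs, labels=labels).loss
        return -loss.item()

    greedy_score = avg_logprob(greedy_ids[0])
    selected = [tokenizer.decode(greedy_ids[0], skip_special_tokens=True)]

    # 3) Select: keep sampled outputs whose likelihood is close to the greedy one.
    for ids in sampled_ids:
        if avg_logprob(ids) >= greedy_score - margin:
            text = tokenizer.decode(ids, skip_special_tokens=True)
            if text not in selected:
                selected.append(text)
    return selected
```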
Related papers
- Effective Instruction Parsing Plugin for Complex Logical Query Answering on Knowledge Graphs [51.33342412699939]
Knowledge Graph Query Embedding (KGQE) aims to embed First-Order Logic (FOL) queries in a low-dimensional KG space for complex reasoning over incomplete KGs.
Recent studies integrate various external information (such as entity types and relation context) to better capture the logical semantics of FOL queries.
We propose an effective Query Instruction Parsing Plugin (QIPP) that captures latent query patterns from code-like query instructions.
arXiv Detail & Related papers (2024-10-27T03:18:52Z) - One2set + Large Language Model: Best Partners for Keyphrase Generation [42.969689556605005]
Keyphrase generation (KPG) aims to automatically generate a collection of phrases representing the core concepts of a given document.
We introduce a generate-then-select framework decomposing KPG into two steps, where we adopt a one2set-based model as generator to produce candidates and then use an LLM as selector to select keyphrases from these candidates.
Our framework significantly surpasses state-of-the-art models, especially in absent keyphrase prediction.
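A minimal sketch of the "select" half of such a generate-then-select pipeline, assuming a one2set-style generator has already produced `candidates` and that `llm_complete` is any text-in/text-out LLM callable; the prompt wording is an illustrative assumption, not the paper's.

```python
def llm_select_keyphrases(llm_complete, document, candidates, max_keep=10):
    """Sketch of the LLM-as-selector step in a generate-then-select KPG pipeline."""
    prompt = (
        "Document:\n" + document + "\n\n"
        "Candidate keyphrases:\n"
        + "\n".join(f"- {c}" for c in candidates)
        + f"\n\nSelect up to {max_keep} keyphrases (present or absent) that best "
        "capture the document's core concepts. Return one keyphrase per line."
    )
    response = llm_complete(prompt)  # any text-completion call
    picked = [line.strip("- ").strip() for line in response.splitlines() if line.strip()]
    # Trust the generator: keep only phrases it actually proposed.
    return [p for p in picked if p in candidates][:max_keep]
```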
arXiv Detail & Related papers (2024-10-04T13:31:09Z) - Graph-Structured Speculative Decoding [52.94367724136063]
Speculative decoding has emerged as a promising technique to accelerate the inference of Large Language Models.
We introduce an innovative approach utilizing a directed acyclic graph (DAG) to manage the drafted hypotheses.
We observe a remarkable speedup of 1.73× to 1.96×, significantly surpassing standard speculative decoding.
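A minimal sketch of the underlying idea: merge drafted hypotheses into a DAG so that tokens shared across drafts are verified by the target model only once. The (position, token) merging rule used here is an illustrative simplification, not the paper's exact graph construction.

```python
from collections import defaultdict

def merge_drafts_into_dag(draft_sequences):
    """Collapse several drafted token sequences into a DAG keyed by (position, token)."""
    nodes = set()               # (position, token_id) pairs
    edges = defaultdict(set)    # node -> successor nodes
    for seq in draft_sequences:
        prev = None
        for pos, tok in enumerate(seq):
            node = (pos, tok)
            nodes.add(node)     # identical (position, token) pairs collapse into one node
            if prev is not None:
                edges[prev].add(node)
            prev = node
    return nodes, edges

# Three drafts with shared tokens collapse from 9 drafted tokens to 5 DAG nodes.
drafts = [[5, 7, 9], [5, 7, 11], [5, 8, 9]]
nodes, edges = merge_drafts_into_dag(drafts)
print(len(nodes))  # 5
```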
arXiv Detail & Related papers (2024-07-23T06:21:24Z) - On Leveraging Encoder-only Pre-trained Language Models for Effective
Keyphrase Generation [76.52997424694767]
This study addresses the application of encoder-only Pre-trained Language Models (PLMs) in keyphrase generation (KPG).
With encoder-only PLMs, although keyphrase extraction (KPE) with Conditional Random Fields slightly excels at identifying present keyphrases, the KPG formulation yields a broader spectrum of keyphrase predictions.
We also identify a favorable parameter allocation towards model depth rather than width when employing encoder-decoder architectures with encoder-only PLMs.
arXiv Detail & Related papers (2024-02-21T18:57:54Z) - ReasoningLM: Enabling Structural Subgraph Reasoning in Pre-trained
Language Models for Question Answering over Knowledge Graph [142.42275983201978]
We propose a subgraph-aware self-attention mechanism to imitate the GNN for performing structured reasoning.
We also adopt an adaptation tuning strategy to adapt the model parameters with 20,000 subgraphs with synthesized questions.
Experiments show that ReasoningLM surpasses state-of-the-art models by a large margin, even with fewer updated parameters and less training data.
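As a rough illustration of what a subgraph-aware self-attention mechanism can look like, the sketch below builds a boolean mask that lets entity positions attend only to themselves and their neighbours in the retrieved subgraph, so one attention layer resembles a GNN message-passing step. The position layout and masking convention are assumptions, not the ReasoningLM implementation.

```python
import torch

def subgraph_attention_mask(entity_positions, subgraph_edges, seq_len):
    """Boolean self-attention mask restricting entity positions to subgraph neighbours."""
    allowed = torch.ones(seq_len, seq_len, dtype=torch.bool)  # start fully connected
    ents = set(entity_positions)
    for i in ents:
        for j in ents:
            allowed[i, j] = False          # cut entity-to-entity attention by default
        allowed[i, i] = True               # keep self edges
    for (i, j) in subgraph_edges:          # re-open edges present in the subgraph
        allowed[i, j] = True
        allowed[j, i] = True
    return allowed  # True = attention permitted; convert to an additive mask as needed
```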
arXiv Detail & Related papers (2023-12-30T07:18:54Z) - Ranking-based Adaptive Query Generation for DETRs in Crowded Pedestrian
Detection [49.27380156754935]
We find that the number of DETRs' queries must be adjusted manually; otherwise, performance degrades to varying degrees.
We propose Rank-based Adaptive Query Generation (RAQG) to alleviate the problem.
Our method is simple and effective and, in theory, can be plugged into any DETR to make it query-adaptive.
arXiv Detail & Related papers (2023-10-24T11:00:56Z) - Stopping Criteria for Value Iteration on Stochastic Games with
Quantitative Objectives [0.0]
A classic solution technique for Markov decision processes (MDPs) and stochastic games (SGs) is value iteration (VI).
In this paper, we provide the first stopping criteria for VI on SG with total reward and mean payoff, yielding the first anytime algorithms in these settings.
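For context, the stopping criterion that does come with guarantees in the discounted MDP setting is sketched below (the data layout and ε are illustrative); the paper's contribution is establishing comparable anytime guarantees for total reward and mean payoff on stochastic games, where this simple contraction argument does not apply.

```python
def discounted_value_iteration(states, actions, P, R, gamma=0.95, eps=1e-4):
    """Classic VI on a discounted MDP with its standard stopping criterion:
    once the largest update falls below eps*(1-gamma)/(2*gamma), the returned
    values are within eps of optimal."""
    V = {s: 0.0 for s in states}
    threshold = eps * (1 - gamma) / (2 * gamma)
    while True:
        new_V = {
            s: max(
                R[s][a] + gamma * sum(p * V[t] for t, p in P[s][a].items())
                for a in actions[s]
            )
            for s in states
        }
        delta = max(abs(new_V[s] - V[s]) for s in states)  # sup-norm of the update
        V = new_V
        if delta < threshold:
            return V
```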
arXiv Detail & Related papers (2023-04-19T19:09:55Z) - An Empirical Study of Pre-trained Language Models in Simple Knowledge
Graph Question Answering [28.31377197194905]
Large-scale pre-trained language models (PLMs) have recently achieved great success and become a milestone in natural language processing (NLP).
In recent work on knowledge graph question answering (KGQA), BERT and its variants have become essential components of KGQA models.
We compare the performance of different PLMs in KGQA and present three benchmarks for larger-scale KGs.
arXiv Detail & Related papers (2023-03-18T08:57:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.