On Leveraging Encoder-only Pre-trained Language Models for Effective
Keyphrase Generation
- URL: http://arxiv.org/abs/2402.14052v1
- Date: Wed, 21 Feb 2024 18:57:54 GMT
- Title: On Leveraging Encoder-only Pre-trained Language Models for Effective
Keyphrase Generation
- Authors: Di Wu, Wasi Uddin Ahmad, Kai-Wei Chang
- Abstract summary: This study addresses the application of encoder-only Pre-trained Language Models (PLMs) in keyphrase generation (KPG).
With encoder-only PLMs, although keyphrase extraction (KPE) with Conditional Random Fields slightly excels in identifying present keyphrases, the KPG formulation renders a broader spectrum of keyphrase predictions.
We also identify a favorable parameter allocation towards model depth rather than width when employing encoder-decoder architectures initialized with encoder-only PLMs.
- Score: 76.52997424694767
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This study addresses the application of encoder-only Pre-trained Language
Models (PLMs) in keyphrase generation (KPG) amidst the broader availability of
domain-tailored encoder-only models compared to encoder-decoder models. We
investigate three core inquiries: (1) the efficacy of encoder-only PLMs in KPG,
(2) optimal architectural decisions for employing encoder-only PLMs in KPG, and
(3) a performance comparison between in-domain encoder-only and encoder-decoder
PLMs across varied resource settings. Our findings, derived from extensive
experimentation in two domains, reveal that with encoder-only PLMs, although
keyphrase extraction (KPE) with Conditional Random Fields slightly excels in
identifying present
keyphrases, the KPG formulation renders a broader spectrum of keyphrase
predictions. Additionally, prefix-LM fine-tuning of encoder-only PLMs emerges
as a strong and data-efficient strategy for KPG, outperforming general-domain
seq2seq PLMs. We also identify a favorable parameter allocation towards model
depth rather than width when employing encoder-decoder architectures
initialized with encoder-only PLMs. The study sheds light on the potential of
utilizing encoder-only PLMs for advancing KPG systems and provides a groundwork
for future KPG methods. Our code and pre-trained checkpoints are released at
https://github.com/uclanlp/DeepKPG.
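The prefix-LM fine-tuning strategy highlighted in the abstract can be sketched briefly: the source document attends to itself bidirectionally, while the keyphrase sequence appended after it is decoded causally with full visibility of the document. The snippet below is a minimal illustration of that masking scheme, not the authors' released implementation (see the DeepKPG repository for that); the helper name and the toy lengths are assumptions.

```python
# Minimal sketch of prefix-LM attention masking for keyphrase generation with
# an encoder-only PLM. Helper name and lengths are illustrative assumptions.
import torch

def build_prefix_lm_mask(prefix_len: int, total_len: int) -> torch.Tensor:
    """Boolean (total_len, total_len) mask: prefix (document) positions attend
    bidirectionally within the prefix; target (keyphrase) positions attend to
    the full prefix and causally to earlier target positions."""
    mask = torch.zeros(total_len, total_len, dtype=torch.bool)
    mask[:, :prefix_len] = True                # every position sees the document prefix
    for i in range(prefix_len, total_len):
        mask[i, prefix_len:i + 1] = True       # causal attention over keyphrase tokens
    return mask

# Example: a 6-token document followed by a 4-token keyphrase sequence.
print(build_prefix_lm_mask(prefix_len=6, total_len=10).int())
```

During fine-tuning, a mask of this shape replaces the default fully bidirectional attention of the encoder-only PLM, and the language-modeling loss is computed only on the keyphrase positions.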
Related papers
- Are Decoder-Only Large Language Models the Silver Bullet for Code Search? [32.338318300589776]
This study presents the first systematic exploration of decoder-only large language models for code search.
We evaluate nine state-of-the-art decoder-only models using two fine-tuning methods, two datasets, and three model sizes.
Our findings reveal that fine-tuned CodeGemma significantly outperforms encoder-only models like UniXcoder.
arXiv Detail & Related papers (2024-10-29T17:05:25Z)
- How to get better embeddings with code pre-trained models? An empirical study [6.220333404184779]
We study five different code pre-trained models (PTMs) to generate embeddings for downstream classification tasks.
We find that embeddings obtained through special tokens do not sufficiently aggregate the semantic information of the entire code snippet.
Code embeddings obtained by combining code data and text data in the same way as during PTM pre-training are of poor quality and do not guarantee richer semantic information (a minimal pooling sketch follows this entry).
arXiv Detail & Related papers (2023-11-14T10:44:21Z)
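The observation above that special-token embeddings under-aggregate snippet semantics is commonly probed by comparing them against pooled token states. The sketch below contrasts the first special-token ([CLS]/<s>) vector with attention-mask-aware mean pooling; the checkpoint name and the example snippet are illustrative assumptions, not the paper's exact setup.

```python
# Sketch: special-token embedding vs. mean-pooled embedding from a code PTM.
# Checkpoint and example snippet are illustrative assumptions.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/codebert-base")
model = AutoModel.from_pretrained("microsoft/codebert-base")

code = "def add(a, b):\n    return a + b"
inputs = tokenizer(code, return_tensors="pt", truncation=True)

with torch.no_grad():
    hidden = model(**inputs).last_hidden_state   # (1, seq_len, hidden_size)

cls_embedding = hidden[:, 0]                     # first special-token state

# Mean pooling over real tokens only, weighted by the attention mask.
mask = inputs["attention_mask"].unsqueeze(-1)    # (1, seq_len, 1)
mean_embedding = (hidden * mask).sum(dim=1) / mask.sum(dim=1)

print(cls_embedding.shape, mean_embedding.shape)  # both (1, hidden_size)
```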
- Rethinking Model Selection and Decoding for Keyphrase Generation with Pre-trained Sequence-to-Sequence Models [76.52997424694767]
Keyphrase Generation (KPG) is a longstanding task in NLP with widespread applications.
Seq2seq pre-trained language models (PLMs) have ushered in a transformative era for KPG, yielding promising performance improvements.
This paper undertakes a systematic analysis of the influence of model selection and decoding strategies on PLM-based KPG.
arXiv Detail & Related papers (2023-10-10T07:34:45Z)
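As a companion to the decoding analysis summarized in the entry above, the sketch below shows greedy versus beam-search decoding with a seq2seq PLM for keyphrase generation; the checkpoint (a stand-in, not a KPG fine-tuned model), the input text, and the generation settings are illustrative assumptions rather than the paper's configuration.

```python
# Sketch: greedy vs. beam-search decoding for keyphrase generation with a
# seq2seq PLM. Checkpoint, input, and settings are illustrative assumptions.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

checkpoint = "facebook/bart-base"   # stand-in for a KPG fine-tuned seq2seq PLM
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

document = "Pre-trained language models have advanced keyphrase generation ..."
inputs = tokenizer(document, return_tensors="pt", truncation=True)

# Greedy decoding: a single, locally optimal keyphrase sequence.
greedy_ids = model.generate(**inputs, max_new_tokens=64)

# Beam search: wider exploration of the output space, often yielding more
# (and more diverse) predicted keyphrases.
beam_ids = model.generate(**inputs, num_beams=5, max_new_tokens=64)

print(tokenizer.decode(greedy_ids[0], skip_special_tokens=True))
print(tokenizer.decode(beam_ids[0], skip_special_tokens=True))
```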
- Decoder-Only or Encoder-Decoder? Interpreting Language Model as a Regularized Encoder-Decoder [75.03283861464365]
The seq2seq task aims at generating the target sequence based on the given input source sequence.
Traditionally, most seq2seq tasks are handled with an encoder that encodes the source sequence and a decoder that generates the target text.
Recently, a number of approaches have emerged that apply decoder-only language models directly to the seq2seq task.
arXiv Detail & Related papers (2023-04-08T15:44:29Z) - Machine Learning-Aided Efficient Decoding of Reed-Muller Subcodes [59.55193427277134]
Reed-Muller (RM) codes achieve the capacity of general binary-input memoryless symmetric channels.
RM codes, however, admit only a limited set of rates.
Efficient decoders are available for RM codes at finite lengths.
arXiv Detail & Related papers (2023-01-16T04:11:14Z) - Pre-trained Language Models for Keyphrase Generation: A Thorough
Empirical Study [76.52997424694767]
We present an in-depth empirical study of keyphrase extraction and keyphrase generation using pre-trained language models.
We show that PLMs have competitive high-resource performance and state-of-the-art low-resource performance.
Further results show that in-domain BERT-like PLMs can be used to build strong and data-efficient keyphrase generation models.
arXiv Detail & Related papers (2022-12-20T13:20:21Z) - CRISP: Curriculum based Sequential Neural Decoders for Polar Code Family [45.74928228858547]
We introduce a novel Curriculum-based Sequential neural decoder for Polar codes (CRISP).
We show that CRISP attains near-optimal reliability performance on the Polar(32,16) and Polar(64,22) codes.
CRISP can be readily extended to Polarization-Adjusted-Convolutional (PAC) codes, where existing SC decoders are significantly less reliable.
arXiv Detail & Related papers (2022-10-01T16:26:24Z) - Non-autoregressive End-to-end Speech Translation with Parallel
Autoregressive Rescoring [83.32560748324667]
This article describes an efficient end-to-end speech translation (E2E-ST) framework based on non-autoregressive (NAR) models.
We propose a unified NAR E2E-ST framework called Orthros, which has an NAR decoder and an auxiliary shallow AR decoder on top of the shared encoder.
arXiv Detail & Related papers (2021-09-09T16:50:16Z)