Redefining Absent Keyphrases and their Effect on Retrieval Effectiveness
- URL: http://arxiv.org/abs/2103.12440v1
- Date: Tue, 23 Mar 2021 10:42:18 GMT
- Title: Redefining Absent Keyphrases and their Effect on Retrieval Effectiveness
- Authors: Florian Boudin and Ygor Gallina
- Abstract summary: We discuss the usefulness of absent keyphrases from an Information Retrieval perspective.
We introduce a finer-grained categorization scheme that sheds more light on the impact of absent keyphrases on scientific document retrieval.
- Score: 9.13755431537592
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Neural keyphrase generation models have recently attracted much interest due
to their ability to output absent keyphrases, that is, keyphrases that do not
appear in the source text. In this paper, we discuss the usefulness of absent
keyphrases from an Information Retrieval (IR) perspective, and show that the
commonly drawn distinction between present and absent keyphrases is not made
explicit enough. We introduce a finer-grained categorization scheme that sheds
more light on the impact of absent keyphrases on scientific document retrieval.
Under this scheme, we find that only a fraction (around 20%) of the words that
make up keyphrases actually serves as document expansion, but that this small
fraction of words is behind much of the gains observed in retrieval
effectiveness. We also discuss how the proposed scheme can offer a new angle to
evaluate the output of neural keyphrase generation models.
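The abstract's distinction between keyphrases that appear in the source text and those that do not, and its notion of keyphrase words that act as document expansion, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the category labels (`present`, `reordered`, `mixed`, `unseen`), the function names, and the plain lowercase tokenization without stemming are not taken from the listing above and may differ from the paper's actual scheme.

```python
import re


def tokenize(text):
    """Lowercase and split into alphanumeric tokens (no stemming; an assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def contains_sequence(doc_tokens, phrase_tokens):
    """Return True if phrase_tokens occurs as a contiguous run in doc_tokens."""
    n = len(phrase_tokens)
    return any(
        doc_tokens[i:i + n] == phrase_tokens
        for i in range(len(doc_tokens) - n + 1)
    )


def categorize(keyphrase, document):
    """Assign an illustrative fine-grained label to one keyphrase.

    present   : the phrase occurs verbatim in the document
    reordered : every word occurs in the document, but not as this phrase
    mixed     : only some of the words occur in the document
    unseen    : none of the words occur in the document
    """
    kp_tokens = tokenize(keyphrase)
    if not kp_tokens:
        raise ValueError("keyphrase contains no tokens")
    doc_tokens = tokenize(document)
    if contains_sequence(doc_tokens, kp_tokens):
        return "present"
    doc_vocab = set(doc_tokens)
    seen = [token in doc_vocab for token in kp_tokens]
    if all(seen):
        return "reordered"
    if any(seen):
        return "mixed"
    return "unseen"


def expansion_words(keyphrases, document):
    """Collect keyphrase words that add new vocabulary to the document."""
    doc_vocab = set(tokenize(document))
    return [
        token
        for keyphrase in keyphrases
        for token in tokenize(keyphrase)
        if token not in doc_vocab
    ]


document = (
    "We study retrieval of scientific documents using neural models "
    "for keyphrase generation."
)
keyphrases = [
    "keyphrase generation",
    "neural retrieval",
    "information retrieval",
    "query expansion",
]
labels = {kp: categorize(kp, document) for kp in keyphrases}
new_words = expansion_words(keyphrases, document)
```

In this toy example only the words collected in `new_words` would extend the document's vocabulary when keyphrases are appended to it for indexing, which is the sense in which a subset of keyphrase words "serves as document expansion".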
Related papers
- Enhancing Keyphrase Generation by BART Finetuning with Splitting and Shuffling [6.276370570422467]
Keyphrase generation is the task of identifying a set of phrases that best represent the main topics or themes of a text.
We propose Keyphrase-Focused BART, which exploits the differences between present and absent keyphrase generation.
For absent keyphrases, our Keyphrase-Focused BART achieved a new state-of-the-art F1@5 score on two out of five keyphrase generation benchmark datasets.
arXiv Detail & Related papers (2023-09-13T05:02:11Z)
- To Wake-up or Not to Wake-up: Reducing Keyword False Alarm by Successive Refinement [58.96644066571205]
We show that existing deep keyword spotting mechanisms can be improved by Successive Refinement.
We show that, across multiple models ranging in size from 13K to 2.41M parameters, the successive refinement technique reduces false alarms (FA) by up to a factor of 8.
Our proposed approach is "plug-and-play" and can be applied to any deep keyword spotting model.
arXiv Detail & Related papers (2023-04-06T23:49:29Z)
- Unsupervised Syntactically Controlled Paraphrase Generation with Abstract Meaning Representations [59.10748929158525]
Abstract Meaning Representations (AMR) can greatly improve the performance of unsupervised syntactically controlled paraphrase generation.
Our proposed model, AMR-enhanced Paraphrase Generator (AMRPG), encodes the AMR graph and the constituency parse of the input sentence into two disentangled semantic and syntactic embeddings.
Experiments show that AMRPG generates more accurate syntactically controlled paraphrases, both quantitatively and qualitatively, compared to the existing unsupervised approaches.
arXiv Detail & Related papers (2022-11-02T04:58:38Z)
- Improving Keyphrase Extraction with Data Augmentation and Information Filtering [67.43025048639333]
Keyphrase extraction is one of the essential tasks for document understanding in NLP.
We present a novel corpus and method for keyphrase extraction from the videos streamed on the Behance platform.
arXiv Detail & Related papers (2022-09-11T22:38:02Z)
- Applying Transformer-based Text Summarization for Keyphrase Generation [2.28438857884398]
Keyphrases are crucial for searching and systematizing scholarly documents.
In this paper, we experiment with popular transformer-based models for abstractive text summarization.
We show that summarization models are quite effective in generating keyphrases in terms of the full-match F1-score and BERTScore.
We also investigate several ordering strategies to target keyphrases.
arXiv Detail & Related papers (2022-09-08T13:01:52Z)
- Representation Learning for Resource-Constrained Keyphrase Generation [78.02577815973764]
We introduce salient span recovery and salient span prediction as guided denoising language modeling objectives.
We show the effectiveness of the proposed approach for low-resource and zero-shot keyphrase generation.
arXiv Detail & Related papers (2022-03-15T17:48:04Z)
- Unsupervised Keyphrase Extraction via Interpretable Neural Networks [27.774524511005172]
Keyphrases that are most useful for predicting the topic of a text are important keyphrases.
INSPECT is a self-explaining neural framework for identifying influential keyphrases.
We show that INSPECT achieves state-of-the-art results in unsupervised keyphrase extraction across four diverse datasets.
arXiv Detail & Related papers (2022-03-15T04:30:47Z)
- KPDrop: An Approach to Improving Absent Keyphrase Generation [26.563045686728135]
Keyphrase generation is the task of generating phrases (keyphrases) that summarize the main topics of a given document.
We propose an approach, called keyphrase dropout (or KPDrop) to improve absent keyphrase generation.
arXiv Detail & Related papers (2021-12-02T18:25:56Z)
- Deep Keyphrase Completion [59.0413813332449]
Keyphrases provide accurate information about document content in a highly compact, concise, and meaningful form, and are widely used for discourse comprehension, organization, and text retrieval.
We propose keyphrase completion (KPC) to generate more keyphrases for a document (e.g., a scientific publication), taking advantage of the document content along with a very limited number of known keyphrases.
We name it deep keyphrase completion (DKPC) since it attempts to capture the deep semantic meaning of the document content together with known keyphrases via a deep learning framework.
arXiv Detail & Related papers (2021-10-29T07:15:35Z)
- Unsupervised Deep Keyphrase Generation [14.544869226959612]
Keyphrase generation aims to summarize long documents with a collection of salient phrases.
Deep neural models have demonstrated remarkable success in this task and can even predict keyphrases that are absent from a document.
We present a novel method for keyphrase generation, AutoKeyGen, without the supervision of any human annotation.
arXiv Detail & Related papers (2021-04-18T05:53:19Z)
- Exclusive Hierarchical Decoding for Deep Keyphrase Generation [63.357895318562214]
Keyphrase generation (KG) aims to summarize the main ideas of a document into a set of keyphrases.
Previous work in this setting employs a sequential decoding process to generate keyphrases.
We propose an exclusive hierarchical decoding framework that includes a hierarchical decoding process and either a soft or a hard exclusion mechanism.
arXiv Detail & Related papers (2020-04-18T02:58:00Z)