W2KPE: Keyphrase Extraction with Word-Word Relation
- URL: http://arxiv.org/abs/2303.13463v1
- Date: Wed, 22 Mar 2023 15:32:40 GMT
- Title: W2KPE: Keyphrase Extraction with Word-Word Relation
- Authors: Wen Cheng, Shichen Dong, Wei Wang
- Abstract summary: We model the challenge as a single-class Named Entity Recognition task.
For the data preprocessing, we encode the split keyphrases after word segmentation.
We replace the loss function with the multi-class focal loss to address the sparseness of keyphrases.
- Score: 4.759934907814052
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper describes our submission to ICASSP 2023 MUG Challenge Track 4,
Keyphrase Extraction, which aims to extract keyphrases most relevant to the
conference theme from conference materials. We model the challenge as a
single-class Named Entity Recognition task and develop techniques for better
performance on the challenge: For the data preprocessing, we encode the split
keyphrases after word segmentation. In addition, we increase the amount of
input information that the model can accept at one time by fusing multiple
preprocessed sentences into one segment. We replace the loss function with the
multi-class focal loss to address the sparseness of keyphrases. Besides, we
score each appearance of keyphrases and add an extra output layer to fit the
score to rank keyphrases. Exhaustive evaluations are performed to find the best
combination of the word segmentation tool, the pre-trained embedding model, and
the corresponding hyperparameters. With these proposals, we scored 45.04 on the
final test set.
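The focal-loss idea mentioned in the abstract can be sketched as follows. This is a minimal illustration of a generic multi-class focal loss, not the authors' implementation: the function names, the `gamma` and `alpha` parameters, and the three-class layout (index 0 as the dominant "not a keyphrase" class) are assumptions made for the example. The point is that confidently classified majority-class items contribute very little to the loss, so the rare keyphrase labels are not drowned out.

```python
import math
from typing import List, Optional, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one vector of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(
    logits: Sequence[float],
    target: int,
    gamma: float = 2.0,
    alpha: Optional[Sequence[float]] = None,
) -> float:
    """Multi-class focal loss for a single example.

    loss = -alpha[target] * (1 - p_t) ** gamma * log(p_t)
    where p_t is the softmax probability of the true class.
    With gamma = 0 and no alpha this reduces to cross-entropy.
    """
    p_t = max(softmax(logits)[target], 1e-12)  # clamp to avoid log(0)
    weight = 1.0 if alpha is None else alpha[target]
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


def mean_focal_loss(
    batch_logits: Sequence[Sequence[float]],
    targets: Sequence[int],
    gamma: float = 2.0,
    alpha: Optional[Sequence[float]] = None,
) -> float:
    """Average focal loss over a batch of examples."""
    losses = [
        focal_loss(logits, t, gamma, alpha)
        for logits, t in zip(batch_logits, targets)
    ]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Hypothetical classes: 0 = not a keyphrase (frequent), 1 and 2 = rare labels.
    easy_majority = [4.0, 0.0, 0.0]   # confidently predicts class 0
    hard_minority = [1.0, 0.5, 0.0]   # true class is 1, model is unsure
    print("easy example :", focal_loss(easy_majority, 0))
    print("hard example :", focal_loss(hard_minority, 1))
    print("batch mean   :", mean_focal_loss([easy_majority, hard_minority], [0, 1]))
```

Raising `gamma` strengthens the down-weighting of easy examples, and the optional `alpha` list can additionally up-weight rare classes. Both would need tuning on real data.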
Related papers
- MetaKP: On-Demand Keyphrase Generation [52.48698290354449]
We introduce on-demand keyphrase generation, a novel paradigm that requires keyphrases that conform to specific high-level goals or intents.
We present MetaKP, a large-scale benchmark comprising four datasets, 7500 documents, and 3760 goals across news and biomedical domains with human-annotated keyphrases.
We demonstrate the potential of our method to serve as a general NLP infrastructure, exemplified by its application in epidemic event detection from social media.
arXiv Detail & Related papers (2024-06-28T19:02:59Z)
- SimCKP: Simple Contrastive Learning of Keyphrase Representations [36.88517357720033]
We propose SimCKP, a simple contrastive learning framework that consists of two stages: 1) An extractor-generator that extracts keyphrases by learning context-aware phrase-level representations in a contrastive manner while also generating keyphrases that do not appear in the document; and 2) A reranker that adapts scores for each generated phrase by likewise aligning their representations with the corresponding document.
arXiv Detail & Related papers (2023-10-12T11:11:54Z)
- Enhancing Phrase Representation by Information Bottleneck Guided Text Diffusion Process for Keyphrase Extraction [9.307602861891926]
Keyphrase extraction is an important task in Natural Language Processing.
In this study, we propose Diff-KPE to guide the text diffusion process for generating enhanced keyphrase representations.
Experiments show that Diff-KPE outperforms existing KPE methods on a large open domain keyphrase extraction benchmark, OpenKP, and a scientific domain dataset, KP20K.
arXiv Detail & Related papers (2023-08-17T02:26:30Z)
- Neural Keyphrase Generation: Analysis and Evaluation [47.004575377472285]
We study various tendencies exhibited by three strong models: T5 (based on a pre-trained transformer), CatSeq-Transformer (a non-pretrained Transformer), and ExHiRD (based on a recurrent neural network).
We propose a novel metric framework, SoftKeyScore, to evaluate the similarity between two sets of keyphrases.
arXiv Detail & Related papers (2023-04-27T00:10:21Z)
- Overview of the ICASSP 2023 General Meeting Understanding and Generation Challenge (MUG) [60.09540662936726]
MUG comprises five tracks: topic segmentation, topic-level and session-level extractive summarization, topic title generation, keyphrase extraction, and action item detection.
To facilitate MUG, we construct and release a large-scale meeting dataset, the AliMeeting4MUG Corpus.
arXiv Detail & Related papers (2023-03-24T11:42:19Z)
- PatternRank: Leveraging Pretrained Language Models and Part of Speech for Unsupervised Keyphrase Extraction [0.6767885381740952]
We present PatternRank, which leverages pretrained language models and part-of-speech for unsupervised keyphrase extraction from single documents.
Our experiments show PatternRank achieves higher precision, recall and F1-scores than previous state-of-the-art approaches.
arXiv Detail & Related papers (2022-10-11T08:23:54Z)
- Applying Transformer-based Text Summarization for Keyphrase Generation [2.28438857884398]
Keyphrases are crucial for searching and systematizing scholarly documents.
In this paper, we experiment with popular transformer-based models for abstractive text summarization.
We show that summarization models are quite effective in generating keyphrases in terms of the full-match F1-score and BERTScore.
We also investigate several ordering strategies to target keyphrases.
arXiv Detail & Related papers (2022-09-08T13:01:52Z)
- Deep Keyphrase Completion [59.0413813332449]
Keyphrases provide accurate information about document content that is highly compact, concise, and full of meaning, and they are widely used for discourse comprehension, organization, and text retrieval.
We propose keyphrase completion (KPC) to generate more keyphrases for a document (e.g., a scientific publication), taking advantage of the document content along with a very limited number of known keyphrases.
We name it deep keyphrase completion (DKPC) since it attempts to capture the deep semantic meaning of the document content together with known keyphrases via a deep learning framework.
arXiv Detail & Related papers (2021-10-29T07:15:35Z)
- Keyphrase Extraction with Dynamic Graph Convolutional Networks and Diversified Inference [50.768682650658384]
Keyphrase extraction (KE) aims to summarize a set of phrases that accurately express a concept or a topic covered in a given document.
The recent Sequence-to-Sequence (Seq2Seq) based generative framework is widely used in the KE task, and it has obtained competitive performance on various benchmarks.
In this paper, we propose to adopt the Dynamic Graph Convolutional Networks (DGCN) to solve the above two problems simultaneously.
arXiv Detail & Related papers (2020-10-24T08:11:23Z)
- Exclusive Hierarchical Decoding for Deep Keyphrase Generation [63.357895318562214]
Keyphrase generation (KG) aims to summarize the main ideas of a document into a set of keyphrases.
Previous work in this setting employs a sequential decoding process to generate keyphrases.
We propose an exclusive hierarchical decoding framework that includes a hierarchical decoding process and either a soft or a hard exclusion mechanism.
arXiv Detail & Related papers (2020-04-18T02:58:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.