Deep Keyphrase Completion
- URL: http://arxiv.org/abs/2111.01910v1
- Date: Fri, 29 Oct 2021 07:15:35 GMT
- Title: Deep Keyphrase Completion
- Authors: Yu Zhao, Jia Song, Huali Feng, Fuzhen Zhuang, Qing Li, Xiaojie Wang, Ji Liu
- Abstract summary: Keyphrases provide accurate information about document content that is highly compact, concise, and rich in meaning, and they are widely used for discourse comprehension, organization, and text retrieval.
We propose keyphrase completion (KPC) to generate more keyphrases for a document (e.g., a scientific publication) by taking advantage of the document content along with a very limited number of known keyphrases.
We name our method deep keyphrase completion (DKPC) since it attempts to capture the deep semantic meaning of the document content together with known keyphrases via a deep learning framework.
- Score: 59.0413813332449
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Keyphrases provide accurate information about document content that is highly
compact, concise, and rich in meaning, and they are widely used for discourse
comprehension, organization, and text retrieval. Though previous studies have
made substantial efforts toward automated keyphrase extraction and generation,
surprisingly few studies have addressed keyphrase completion
(KPC). KPC aims to generate more keyphrases for a document (e.g., a scientific
publication) by taking advantage of the document content along with a very limited
number of known keyphrases, which can be applied, for example, to improve text
indexing systems. In this paper, we propose a novel KPC method with an
encoder-decoder framework. We name it deep keyphrase completion (DKPC)
since it attempts to capture the deep semantic meaning of the document content
together with known keyphrases via a deep learning framework. Specifically, the
encoder and the decoder in DKPC play different roles to make full use of the
known keyphrases. The former considers the keyphrase-guiding factors, which
aggregate information from known keyphrases into the context. In contrast, the
latter considers the keyphrase-inhibited factor to suppress the generation of
semantically repeated keyphrases. Extensive experiments on benchmark datasets
demonstrate the efficacy of our proposed model.
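To make the task setting in the abstract concrete, the following is a minimal toy sketch of keyphrase completion with an inhibition step. It is not the DKPC model: the paper uses a learned encoder-decoder, whereas this sketch assumes a hypothetical set of pre-scored candidate phrases and swaps in a simple token-overlap (Jaccard) penalty as a stand-in for the keyphrase-inhibited factor. The function names, the `penalty` parameter, and the example scores are all illustrative assumptions.

```python
def token_set(phrase):
    """Lowercase a phrase and split it into a set of whitespace tokens."""
    return set(phrase.lower().split())


def jaccard(a, b):
    """Token-level Jaccard similarity between two phrases (0.0 to 1.0)."""
    ta, tb = token_set(a), token_set(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def complete_keyphrases(candidates, known, top_k=3, penalty=1.0):
    """Rank candidate keyphrases, penalizing overlap with known keyphrases.

    candidates: dict mapping phrase -> base score (hypothetical model scores).
    known: list of keyphrases already assigned to the document.
    penalty: assumed weight of the overlap penalty (toy stand-in for inhibition).
    """
    rescored = []
    for phrase, score in candidates.items():
        # Highest similarity to any known keyphrase; 0.0 if none are known.
        overlap = max((jaccard(phrase, k) for k in known), default=0.0)
        rescored.append((score - penalty * overlap, phrase))
    # Sort by descending adjusted score, breaking ties alphabetically.
    rescored.sort(key=lambda item: (-item[0], item[1]))
    return [phrase for _, phrase in rescored[:top_k]]


# Illustrative input: one known keyphrase and four made-up candidates.
known = ["keyphrase generation"]
candidates = {
    "keyphrase generation": 0.9,
    "neural keyphrase generation": 0.8,
    "encoder-decoder": 0.6,
    "text indexing": 0.5,
}
result = complete_keyphrases(candidates, known, top_k=2)
print(result)
```

In this toy run, the two candidates that repeat the known keyphrase are pushed down the ranking despite their higher base scores, which is the qualitative behavior the abstract attributes to the decoder side. A real system would measure semantic rather than lexical repetition.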
Related papers
- SimCKP: Simple Contrastive Learning of Keyphrase Representations [36.88517357720033]
We propose SimCKP, a simple contrastive learning framework that consists of two stages: 1) An extractor-generator that extracts keyphrases by learning context-aware phrase-level representations in a contrastive manner while also generating keyphrases that do not appear in the document; and 2) A reranker that adapts scores for each generated phrase by likewise aligning their representations with the corresponding document.
arXiv Detail & Related papers (2023-10-12T11:11:54Z)
- Enhancing Phrase Representation by Information Bottleneck Guided Text Diffusion Process for Keyphrase Extraction [9.307602861891926]
Keyphrase extraction is an important task in Natural Language Processing.
In this study, we propose Diff-KPE to guide the text diffusion process for generating enhanced keyphrase representations.
Experiments show that Diff-KPE outperforms existing KPE methods on a large open domain keyphrase extraction benchmark, OpenKP, and a scientific domain dataset, KP20K.
arXiv Detail & Related papers (2023-08-17T02:26:30Z)
- Neural Keyphrase Generation: Analysis and Evaluation [47.004575377472285]
We study various tendencies exhibited by three strong models: T5 (based on a pre-trained transformer), CatSeq-Transformer (a non-pretrained Transformer), and ExHiRD (based on a recurrent neural network).
We propose a novel metric framework, SoftKeyScore, to evaluate the similarity between two sets of keyphrases.
arXiv Detail & Related papers (2023-04-27T00:10:21Z)
- Improving Keyphrase Extraction with Data Augmentation and Information Filtering [67.43025048639333]
Keyphrase extraction is one of the essential tasks for document understanding in NLP.
We present a novel corpus and method for keyphrase extraction from the videos streamed on the Behance platform.
arXiv Detail & Related papers (2022-09-11T22:38:02Z)
- LDKP: A Dataset for Identifying Keyphrases from Long Scientific Documents [48.84086818702328]
Identifying keyphrases (KPs) from text documents is a fundamental task in natural language processing and information retrieval.
The vast majority of benchmark datasets for this task are from the scientific domain and contain only the document title and abstract.
This presents three challenges for real-world applications: human-written summaries are unavailable for most documents, the documents are almost always long, and a high percentage of KPs are directly found beyond the limited context of title and abstract.
arXiv Detail & Related papers (2022-03-29T08:44:57Z)
- Keyphrase Extraction with Dynamic Graph Convolutional Networks and Diversified Inference [50.768682650658384]
Keyphrase extraction (KE) aims to summarize a set of phrases that accurately express a concept or a topic covered in a given document.
Recent Sequence-to-Sequence (Seq2Seq) based generative frameworks are widely used for the KE task and have obtained competitive performance on various benchmarks.
In this paper, we propose to adopt the Dynamic Graph Convolutional Networks (DGCN) to solve the above two problems simultaneously.
arXiv Detail & Related papers (2020-10-24T08:11:23Z)
- Capturing Global Informativeness in Open Domain Keyphrase Extraction [40.57116173502994]
Open-domain KeyPhrase Extraction (KPE) aims to extract keyphrases from documents without domain or quality restrictions.
This paper presents JointKPE, an open-domain KPE architecture built on pre-trained language models.
JointKPE learns to rank keyphrases by estimating their informativeness in the entire document and is jointly trained on the keyphrase chunking task.
arXiv Detail & Related papers (2020-04-28T16:34:35Z)
- Keyphrase Generation with Cross-Document Attention [28.565813544820553]
Keyphrase generation aims to produce a set of phrases summarizing the essentials of a given document.
We propose CDKGen, a Transformer-based keyphrase generator, which expands the Transformer to global attention.
We also adopt a copy mechanism to enhance our model via selecting appropriate words from documents to deal with out-of-vocabulary words in keyphrases.
arXiv Detail & Related papers (2020-04-21T07:58:27Z)
- Exclusive Hierarchical Decoding for Deep Keyphrase Generation [63.357895318562214]
Keyphrase generation (KG) aims to summarize the main ideas of a document into a set of keyphrases.
Previous work in this setting employs a sequential decoding process to generate keyphrases.
We propose an exclusive hierarchical decoding framework that includes a hierarchical decoding process and either a soft or a hard exclusion mechanism.
arXiv Detail & Related papers (2020-04-18T02:58:00Z)