Enhancing Keyphrase Extraction from Academic Articles with their
Reference Information
- URL: http://arxiv.org/abs/2111.14106v2
- Date: Tue, 30 Nov 2021 04:43:39 GMT
- Title: Enhancing Keyphrase Extraction from Academic Articles with their
Reference Information
- Authors: Chengzhi Zhang, Lei Zhao, Mengyuan Zhao, Yingyi Zhang
- Abstract summary: Keyphrases that concisely summarize a document help users quickly obtain and understand it.
Title information in references also contains author-assigned keyphrases.
Experiments show reference information can increase precision, recall, and F1 of automatic keyphrase extraction.
- Score: 12.769066804715697
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the development of Internet technology, the phenomenon of information
overload is becoming more and more obvious. It takes a lot of time for users to
obtain the information they need. However, keyphrases that concisely summarize
document information help users quickly obtain and understand
documents. For academic resources, most existing studies extract keyphrases
through the title and abstract of papers. We find that title information in
references also contains author-assigned keyphrases. Therefore, this article
uses reference information and applies two typical unsupervised
extraction methods (TF*IDF and TextRank), two representative traditional
supervised learning algorithms (Naïve Bayes and Conditional Random Field), and
a supervised deep learning model (BiLSTM-CRF) to analyze how reference
information affects keyphrase extraction. It is expected to
improve the quality of keyphrase recognition from the perspective of expanding
the source text. The experimental results show that reference information can
increase precision, recall, and F1 of automatic keyphrase extraction to a
certain extent. This indicates the usefulness of reference information on
keyphrase extraction of academic papers and provides a new idea for the
following research on automatic keyphrase extraction.
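The idea of expanding the source text with reference titles can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`tokenize`, `build_source_text`, `tfidf_top_terms`), the smoothed IDF formula, and the toy data are all assumptions made for illustration, and it ranks single words rather than full candidate phrases.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only (a simplifying assumption)."""
    return re.findall(r"[a-z]+", text.lower())


def build_source_text(title, abstract, reference_titles=()):
    """Concatenate title and abstract, optionally followed by the reference titles."""
    return " ".join([title, abstract, *reference_titles])


def tfidf_top_terms(doc_tokens, background, k=5):
    """Rank the terms of one document by TF*IDF against a background corpus.

    Uses a smoothed IDF, log((1 + N) / (1 + df)) + 1, which is an assumption
    of this sketch and not necessarily the weighting used in the paper.
    """
    n_docs = len(background)
    df = Counter()
    for tokens in background:
        df.update(set(tokens))  # count each term once per background document
    tf = Counter(doc_tokens)
    scores = {
        term: (count / len(doc_tokens))
        * (math.log((1 + n_docs) / (1 + df[term])) + 1)
        for term, count in tf.items()
    }
    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Hypothetical toy data, not taken from the paper's corpus.
background = [
    tokenize("graph neural networks for image classification"),
    tokenize("a survey of image segmentation methods"),
]
title = "Ranking candidate terms"
abstract = "We rank candidate terms for documents."
reference_titles = [
    "Keyphrase extraction with TextRank",
    "Unsupervised keyphrase extraction",
]

base = tokenize(build_source_text(title, abstract))
expanded = tokenize(build_source_text(title, abstract, reference_titles))

base_top = tfidf_top_terms(base, background, k=4)
expanded_top = tfidf_top_terms(expanded, background, k=4)
```

In this toy setup, terms that occur only in the reference titles can enter the ranking once the source text is expanded, which is the effect the abstract describes. A realistic comparison would score candidate phrases against author-assigned keyphrases and report precision, recall, and F1.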
Related papers
- BibRank: Automatic Keyphrase Extraction Platform Using Metadata [0.0]
This paper introduces a platform that integrates keyphrase datasets and facilitates the evaluation of keyphrase extraction algorithms.
The platform includes BibRank, an automatic keyphrase extraction algorithm that leverages a rich dataset obtained by parsing word in Bib format.
arXiv Detail & Related papers (2023-10-13T14:44:34Z)
- SimCKP: Simple Contrastive Learning of Keyphrase Representations [36.88517357720033]
We propose SimCKP, a simple contrastive learning framework that consists of two stages: 1) An extractor-generator that extracts keyphrases by learning context-aware phrase-level representations in a contrastive manner while also generating keyphrases that do not appear in the document; and 2) A reranker that adapts scores for each generated phrase by likewise aligning their representations with the corresponding document.
arXiv Detail & Related papers (2023-10-12T11:11:54Z)
- Improving Keyphrase Extraction with Data Augmentation and Information
Filtering [67.43025048639333]
Keyphrase extraction is one of the essential tasks for document understanding in NLP.
We present a novel corpus and method for keyphrase extraction from the videos streamed on the Behance platform.
arXiv Detail & Related papers (2022-09-11T22:38:02Z)
- LDKP: A Dataset for Identifying Keyphrases from Long Scientific
Documents [48.84086818702328]
Identifying keyphrases (KPs) from text documents is a fundamental task in natural language processing and information retrieval.
The vast majority of the benchmark datasets for this task are from the scientific domain and contain only the document title and abstract.
This presents three challenges for real-world applications: human-written summaries are unavailable for most documents, the documents are almost always long, and a high percentage of KPs are directly found beyond the limited context of title and abstract.
arXiv Detail & Related papers (2022-03-29T08:44:57Z)
- Representation Learning for Resource-Constrained Keyphrase Generation [78.02577815973764]
We introduce salient span recovery and salient span prediction as guided denoising language modeling objectives.
We show the effectiveness of the proposed approach for low-resource and zero-shot keyphrase generation.
arXiv Detail & Related papers (2022-03-15T17:48:04Z)
- Deep Keyphrase Completion [59.0413813332449]
Keyphrases provide accurate information about document content in a highly compact, concise, and meaningful form, and are widely used for discourse comprehension, organization, and text retrieval.
We propose keyphrase completion (KPC) to generate more keyphrases for a document (e.g., a scientific publication), taking advantage of the document content along with a very limited number of known keyphrases.
We name it deep keyphrase completion (DKPC) since it attempts to capture the deep semantic meaning of the document content together with known keyphrases via a deep learning framework.
arXiv Detail & Related papers (2021-10-29T07:15:35Z)
- CitationIE: Leveraging the Citation Graph for Scientific Information
Extraction [89.33938657493765]
We use the citation graph of referential links between citing and cited papers.
We observe a sizable improvement in end-to-end information extraction over the state-of-the-art.
arXiv Detail & Related papers (2021-06-03T03:00:12Z)
- Keyphrase Extraction with Dynamic Graph Convolutional Networks and
Diversified Inference [50.768682650658384]
Keyphrase extraction (KE) aims to summarize a set of phrases that accurately express a concept or a topic covered in a given document.
The recent Sequence-to-Sequence (Seq2Seq) based generative framework is widely used in the KE task and has obtained competitive performance on various benchmarks.
In this paper, we propose to adopt the Dynamic Graph Convolutional Networks (DGCN) to solve the above two problems simultaneously.
arXiv Detail & Related papers (2020-10-24T08:11:23Z)
- A Joint Learning Approach based on Self-Distillation for Keyphrase
Extraction from Scientific Documents [29.479331909227998]
Keyphrase extraction is the task of extracting a small set of phrases that best describe a document.
Most existing benchmark datasets for the task typically have limited numbers of annotated documents.
We propose a simple and efficient joint learning approach based on the idea of self-distillation.
arXiv Detail & Related papers (2020-10-22T18:36:31Z)