Related papers: Hyperbolic Relevance Matching for Neural Keyphrase Extraction

URL: http://arxiv.org/abs/2205.02047v2
Date: Thu, 21 Dec 2023 11:30:54 GMT
Title: Hyperbolic Relevance Matching for Neural Keyphrase Extraction
Authors: Mingyang Song, Yi Feng and Liping Jing
Abstract summary: Keyphrase extraction is a fundamental task in natural language processing and information retrieval. We design a new hyperbolic matching model (HyperMatch) to represent phrases and documents in the same hyperbolic space.
Score: 34.41878064501316
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Keyphrase extraction is a fundamental task in natural language processing and information retrieval that aims to extract a set of phrases with important information from a source document. Identifying important keyphrases is the central component of the keyphrase extraction task, and its main challenge is how to represent information comprehensively and discriminate importance accurately. In this paper, to address these issues, we design a new hyperbolic matching model (HyperMatch) to represent phrases and documents in the same hyperbolic space and explicitly estimate the phrase-document relevance via the Poincaré distance as the importance score of each phrase. Specifically, to capture the hierarchical syntactic and semantic structure information, HyperMatch takes advantage of the hidden representations in multiple layers of RoBERTa and integrates them as the word embeddings via an adaptive mixing layer. Meanwhile, considering the hierarchical structure hidden in the document, HyperMatch embeds both phrases and documents in the same hyperbolic space via a hyperbolic phrase encoder and a hyperbolic document encoder. This strategy can further enhance the estimation of phrase-document relevance due to the good properties of hyperbolic space. In this setting, keyphrase extraction can be treated as a matching problem and effectively implemented by minimizing a hyperbolic margin-based triplet loss. Extensive experiments are conducted on six benchmarks and demonstrate that HyperMatch outperforms the state-of-the-art baselines.
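Illustrative sketch: the abstract names two ingredients, the Poincaré distance between a phrase and a document and a margin-based triplet loss over that distance. The snippet below is a minimal sketch using the standard textbook form of both, not the authors' implementation. The function names, the margin value, the toy 2-D embeddings, and the convention that a smaller distance means higher relevance are all assumptions made for illustration.

```python
import math


def poincare_distance(u, v, eps=1e-9):
    """Standard distance between two points inside the unit Poincare ball."""
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # Clamp the denominator so points close to the boundary stay finite.
    denom = max((1.0 - sq_u) * (1.0 - sq_v), eps)
    return math.acosh(1.0 + 2.0 * sq_diff / denom)


def triplet_loss(doc, pos_phrase, neg_phrase, margin=1.0):
    """Hinge loss: a keyphrase should sit closer to the document than a
    non-keyphrase by at least `margin` (assumed value)."""
    d_pos = poincare_distance(doc, pos_phrase)
    d_neg = poincare_distance(doc, neg_phrase)
    return max(0.0, margin + d_pos - d_neg)


def rank_phrases(doc, phrases):
    """Order candidate phrase names by ascending distance to the document
    (assumption: smaller distance = more relevant)."""
    return sorted(phrases, key=lambda name: poincare_distance(doc, phrases[name]))


# Toy 2-D embeddings (hypothetical values, all inside the unit ball).
doc = [0.1, 0.1]
candidates = {"hyperbolic space": [0.1, 0.2], "this paper": [0.8, 0.0]}
print(rank_phrases(doc, candidates))
```

In a trained model the embeddings would come from the phrase and document encoders described in the abstract; here they are hand-picked so the ranking is easy to follow.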

Related papers

Contextual Document Embeddings [77.22328616983417]
We propose two complementary methods for contextualized document embeddings. First, an alternative contrastive learning objective that explicitly incorporates the document neighbors into the intra-batch contextual loss. Second, a new contextual architecture that explicitly encodes neighbor document information into the encoded representation.
arXiv Detail & Related papers (2024-10-03T14:33:34Z)
BibRank: Automatic Keyphrase Extraction Platform Using Metadata [0.0]
This paper introduces a platform that integrates keyphrase datasets and facilitates the evaluation of keyphrase extraction algorithms. The platform includes BibRank, an automatic keyphrase extraction algorithm that leverages a rich dataset obtained by parsing word in Bib format.
arXiv Detail & Related papers (2023-10-13T14:44:34Z)
SimCKP: Simple Contrastive Learning of Keyphrase Representations [36.88517357720033]
We propose SimCKP, a simple contrastive learning framework that consists of two stages: 1) An extractor-generator that extracts keyphrases by learning context-aware phrase-level representations in a contrastive manner while also generating keyphrases that do not appear in the document; and 2) A reranker that adapts scores for each generated phrase by likewise aligning their representations with the corresponding document.
arXiv Detail & Related papers (2023-10-12T11:11:54Z)
Deep Keyphrase Completion [59.0413813332449]
Keyphrases provide accurate information about document content that is highly compact, concise, and meaningful, and are widely used for discourse comprehension, organization, and text retrieval. We propose keyphrase completion (KPC) to generate more keyphrases for a document (e.g., a scientific publication), taking advantage of the document content along with a very limited number of known keyphrases. We name it deep keyphrase completion (DKPC) since it attempts to capture the deep semantic meaning of the document content together with known keyphrases via a deep learning framework.
arXiv Detail & Related papers (2021-10-29T07:15:35Z)
Towards Document-Level Paraphrase Generation with Sentence Rewriting and Reordering [88.08581016329398]
We propose CoRPG (Coherence Relationship guided Paraphrase Generation) for document-level paraphrase generation. We use graph GRU to encode the coherence relationship graph and get the coherence-aware representation for each sentence. Our model can generate document paraphrase with more diversity and semantic preservation.
arXiv Detail & Related papers (2021-09-15T05:53:40Z)
Unsupervised Deep Keyphrase Generation [14.544869226959612]
Keyphrase generation aims to summarize long documents with a collection of salient phrases. Deep neural models have demonstrated a remarkable success in this task, capable of predicting keyphrases that are even absent from a document. We present a novel method for keyphrase generation, AutoKeyGen, without the supervision of any human annotation.
arXiv Detail & Related papers (2021-04-18T05:53:19Z)
Match-Ignition: Plugging PageRank into Transformer for Long-form Text Matching [66.71886789848472]
We propose a novel hierarchical noise filtering model, namely Match-Ignition, to tackle the effectiveness and efficiency problem. The basic idea is to plug the well-known PageRank algorithm into the Transformer, to identify and filter both sentence and word level noisy information. Noisy sentences are usually easy to detect because the sentence is the basic unit of a long-form text, so we directly use PageRank to filter such information.
arXiv Detail & Related papers (2021-01-16T10:34:03Z)
Keyphrase Generation with Cross-Document Attention [28.565813544820553]
Keyphrase generation aims to produce a set of phrases summarizing the essentials of a given document. We propose CDKGen, a Transformer-based keyphrase generator, which expands the Transformer to global attention. We also adopt a copy mechanism to enhance our model via selecting appropriate words from documents to deal with out-of-vocabulary words in keyphrases.
arXiv Detail & Related papers (2020-04-21T07:58:27Z)
Exclusive Hierarchical Decoding for Deep Keyphrase Generation [63.357895318562214]
Keyphrase generation (KG) aims to summarize the main ideas of a document into a set of keyphrases. Previous work in this setting employs a sequential decoding process to generate keyphrases. We propose an exclusive hierarchical decoding framework that includes a hierarchical decoding process and either a soft or a hard exclusion mechanism.
arXiv Detail & Related papers (2020-04-18T02:58:00Z)
Keyphrase Extraction with Span-based Feature Representations [13.790461555410747]
Keyphrases are capable of providing semantic metadata characterizing documents. There are three approaches to keyphrase extraction: (i) the traditional two-step ranking method, (ii) sequence labeling, and (iii) generation using neural networks. In this paper, we propose a novel Span Keyphrase Extraction model that extracts span-based feature representations of keyphrases directly from all the content tokens.
arXiv Detail & Related papers (2020-02-13T09:48:31Z)

This list is automatically generated from the titles and abstracts of the papers on this site.