Hyperbolic Relevance Matching for Neural Keyphrase Extraction
- URL: http://arxiv.org/abs/2205.02047v2
- Date: Thu, 21 Dec 2023 11:30:54 GMT
- Title: Hyperbolic Relevance Matching for Neural Keyphrase Extraction
- Authors: Mingyang Song, Yi Feng and Liping Jing
- Abstract summary: Keyphrase extraction is a fundamental task in natural language processing and information retrieval.
We design a new hyperbolic matching model (HyperMatch) to represent phrases and documents in the same hyperbolic space.
- Score: 34.41878064501316
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Keyphrase extraction is a fundamental task in natural language processing and
information retrieval that aims to extract a set of phrases with important
information from a source document. Identifying important keyphrases is the
central component of the keyphrase extraction task, and its main challenge is
how to represent information comprehensively and discriminate importance
accurately. In this paper, to address these issues, we design a new hyperbolic
matching model (HyperMatch) to represent phrases and documents in the same
hyperbolic space and explicitly estimate the phrase-document relevance via the
Poincaré distance as the importance score of each phrase. Specifically, to
capture the hierarchical syntactic and semantic structure information,
HyperMatch takes advantage of the hidden representations in multiple layers of
RoBERTa and integrates them as the word embeddings via an adaptive mixing
layer. Meanwhile, considering the hierarchical structure hidden in the
document, HyperMatch embeds both phrases and documents in the same hyperbolic
space via a hyperbolic phrase encoder and a hyperbolic document encoder. This
strategy can further enhance the estimation of phrase-document relevance due to
the good properties of hyperbolic space. In this setting, the keyphrase
extraction can be taken as a matching problem and effectively implemented by
minimizing a hyperbolic margin-based triplet loss. Extensive experiments are
conducted on six benchmarks and demonstrate that HyperMatch outperforms the
state-of-the-art baselines.
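The matching idea in the abstract can be illustrated with a minimal sketch. It assumes the standard Poincaré-ball distance and a generic margin-based triplet loss; the function names, the margin value, and the toy 2-D vectors are hypothetical and are not taken from the paper, whose exact formulation may differ.

```python
import math


def poincare_distance(u, v, eps=1e-9):
    """Distance between two points inside the unit Poincaré ball.

    Standard formula: arcosh(1 + 2*||u - v||^2 / ((1 - ||u||^2) * (1 - ||v||^2))).
    """
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    denom = max((1.0 - sq_u) * (1.0 - sq_v), eps)  # guard against division by ~0
    return math.acosh(1.0 + 2.0 * sq_diff / denom)


def triplet_loss(doc, pos_phrase, neg_phrase, margin=1.0):
    """Hinge loss that pulls a keyphrase closer to the document than a non-keyphrase.

    The margin of 1.0 is an illustrative assumption, not a value from the paper.
    """
    d_pos = poincare_distance(doc, pos_phrase)
    d_neg = poincare_distance(doc, neg_phrase)
    return max(0.0, margin + d_pos - d_neg)


# Toy embeddings (hypothetical): the document sits at the origin of the ball.
doc = [0.0, 0.0]
keyphrase = [0.1, 0.0]
non_keyphrase = [0.9, 0.0]

loss_good = triplet_loss(doc, keyphrase, non_keyphrase)  # keyphrase already much closer
loss_bad = triplet_loss(doc, non_keyphrase, keyphrase)   # roles swapped, so loss is positive
```

Under this reading, a phrase with a smaller distance to the document would be ranked as more important, and training drives the loss toward zero for correctly ordered triplets.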
Related papers
- Contextual Document Embeddings [77.22328616983417]
We propose two complementary methods for contextualized document embeddings.
First, an alternative contrastive learning objective that explicitly incorporates the document neighbors into the intra-batch contextual loss.
Second, a new contextual architecture that explicitly encodes neighbor document information into the encoded representation.
arXiv Detail & Related papers (2024-10-03T14:33:34Z) - BibRank: Automatic Keyphrase Extraction Platform Using Metadata [0.0]
This paper introduces a platform that integrates keyphrase datasets and facilitates the evaluation of keyphrase extraction algorithms.
The platform includes BibRank, an automatic keyphrase extraction algorithm that leverages a rich dataset obtained by parsing word in Bib format.
arXiv Detail & Related papers (2023-10-13T14:44:34Z) - SimCKP: Simple Contrastive Learning of Keyphrase Representations [36.88517357720033]
We propose SimCKP, a simple contrastive learning framework that consists of two stages: 1) An extractor-generator that extracts keyphrases by learning context-aware phrase-level representations in a contrastive manner while also generating keyphrases that do not appear in the document; and 2) A reranker that adapts scores for each generated phrase by likewise aligning their representations with the corresponding document.
arXiv Detail & Related papers (2023-10-12T11:11:54Z) - Deep Keyphrase Completion [59.0413813332449]
Keyphrase provides accurate information of document content that is highly compact, concise, full of meanings, and widely used for discourse comprehension, organization, and text retrieval.
We propose keyphrase completion (KPC) to generate more keyphrases for a document (e.g. a scientific publication), taking advantage of the document content along with a very limited number of known keyphrases.
We name it deep keyphrase completion (DKPC) since it attempts to capture the deep semantic meaning of the document content together with known keyphrases via a deep learning framework.
arXiv Detail & Related papers (2021-10-29T07:15:35Z) - Towards Document-Level Paraphrase Generation with Sentence Rewriting and Reordering [88.08581016329398]
We propose CoRPG (Coherence Relationship guided Paraphrase Generation) for document-level paraphrase generation.
We use graph GRU to encode the coherence relationship graph and get the coherence-aware representation for each sentence.
Our model can generate document paraphrase with more diversity and semantic preservation.
arXiv Detail & Related papers (2021-09-15T05:53:40Z) - Unsupervised Deep Keyphrase Generation [14.544869226959612]
Keyphrase generation aims to summarize long documents with a collection of salient phrases.
Deep neural models have demonstrated a remarkable success in this task, capable of predicting keyphrases that are even absent from a document.
We present a novel method for keyphrase generation, AutoKeyGen, without the supervision of any human annotation.
arXiv Detail & Related papers (2021-04-18T05:53:19Z) - Match-Ignition: Plugging PageRank into Transformer for Long-form Text Matching [66.71886789848472]
We propose a novel hierarchical noise filtering model, namely Match-Ignition, to tackle the effectiveness and efficiency problem.
The basic idea is to plug the well-known PageRank algorithm into the Transformer, to identify and filter both sentence and word level noisy information.
Noisy sentences are usually easy to detect because the sentence is the basic unit of a long-form text, so we directly use PageRank to filter such information.
arXiv Detail & Related papers (2021-01-16T10:34:03Z) - Keyphrase Generation with Cross-Document Attention [28.565813544820553]
Keyphrase generation aims to produce a set of phrases summarizing the essentials of a given document.
We propose CDKGen, a Transformer-based keyphrase generator, which expands the Transformer to global attention.
We also adopt a copy mechanism to enhance our model via selecting appropriate words from documents to deal with out-of-vocabulary words in keyphrases.
arXiv Detail & Related papers (2020-04-21T07:58:27Z) - Exclusive Hierarchical Decoding for Deep Keyphrase Generation [63.357895318562214]
Keyphrase generation (KG) aims to summarize the main ideas of a document into a set of keyphrases.
Previous work in this setting employs a sequential decoding process to generate keyphrases.
We propose an exclusive hierarchical decoding framework that includes a hierarchical decoding process and either a soft or a hard exclusion mechanism.
arXiv Detail & Related papers (2020-04-18T02:58:00Z) - Keyphrase Extraction with Span-based Feature Representations [13.790461555410747]
Keyphrases are capable of providing semantic metadata characterizing documents.
There are three approaches to keyphrase extraction: (i) the traditional two-step ranking method, (ii) sequence labeling, and (iii) generation using neural networks.
In this paper, we propose a novel Span Keyphrase Extraction model that extracts span-based feature representations of keyphrases directly from all the content tokens.
arXiv Detail & Related papers (2020-02-13T09:48:31Z)