Keyphrase Generation with Cross-Document Attention
- URL: http://arxiv.org/abs/2004.09800v1
- Date: Tue, 21 Apr 2020 07:58:27 GMT
- Title: Keyphrase Generation with Cross-Document Attention
- Authors: Shizhe Diao, Yan Song, Tong Zhang
- Abstract summary: Keyphrase generation aims to produce a set of phrases summarizing the essentials of a given document.
We propose CDKGen, a Transformer-based keyphrase generator, which expands the Transformer to global attention.
We also adopt a copy mechanism to enhance our model via selecting appropriate words from documents to deal with out-of-vocabulary words in keyphrases.
- Score: 28.565813544820553
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Keyphrase generation aims to produce a set of phrases summarizing the
essentials of a given document. Conventional methods normally apply an
encoder-decoder architecture to generate the output keyphrases for an input
document, where they are designed to focus on each current document so they
inevitably omit crucial corpus-level information carried by other similar
documents, i.e., the cross-document dependency and latent topics. In this
paper, we propose CDKGen, a Transformer-based keyphrase generator, which
expands the Transformer to global attention with cross-document attention
networks to incorporate available documents as references so as to generate
better keyphrases with the guidance of topic information. On top of the
proposed Transformer + cross-document attention architecture, we also adopt a
copy mechanism to enhance our model via selecting appropriate words from
documents to deal with out-of-vocabulary words in keyphrases. Experiment
results on five benchmark datasets illustrate the validity and effectiveness of
our model, which achieves the state-of-the-art performance on all datasets.
Further analyses confirm that the proposed model is able to generate keyphrases
consistent with references while keeping sufficient diversity. The code of
CDKGen is available at https://github.com/SVAIGBA/CDKGen.
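The copy mechanism mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common pointer-generator style mixture, in which a gate `p_gen` blends a distribution over the fixed vocabulary with the attention weights over source tokens. Whether CDKGen uses exactly this formulation is not stated here. The function name, the toy vocabulary, and all probabilities below are hypothetical.

```python
def copy_augmented_distribution(vocab, p_vocab, source_tokens, attention, p_gen):
    """Blend a generation distribution with a copy distribution.

    Assumed formulation (pointer-generator style, not confirmed for CDKGen):
        P(w) = p_gen * P_vocab(w) + (1 - p_gen) * sum of attention on positions holding w
    """
    if len(vocab) != len(p_vocab) or len(source_tokens) != len(attention):
        raise ValueError("length mismatch between tokens and probabilities")
    # Start from the scaled generation probabilities of in-vocabulary words.
    final = {word: p_gen * p for word, p in zip(vocab, p_vocab)}
    # Add copy probability; source tokens missing from the vocabulary
    # (out-of-vocabulary words) get their own entry here.
    for token, weight in zip(source_tokens, attention):
        final[token] = final.get(token, 0.0) + (1.0 - p_gen) * weight
    return final


# Toy example: "CDKGen" is out of vocabulary but appears twice in the source.
vocab = ["graph", "model", "<unk>"]
p_vocab = [0.5, 0.3, 0.2]
source_tokens = ["CDKGen", "model", "CDKGen"]
attention = [0.4, 0.2, 0.4]

dist = copy_augmented_distribution(vocab, p_vocab, source_tokens, attention, p_gen=0.6)
best = max(dist, key=dist.get)
print(best, round(dist[best], 2))
```

In this toy setting the out-of-vocabulary token receives probability only through the copy term, which is how such a mechanism lets a generator emit words it could not otherwise produce.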
Related papers
- Self-Compositional Data Augmentation for Scientific Keyphrase Generation [28.912937922090038]
We present a self-compositional data augmentation method for keyphrase generation.
We measure the relatedness of training documents based on their shared keyphrases, and combine similar documents to generate synthetic samples.
arXiv Detail & Related papers (2024-11-05T12:22:51Z)
- Contextual Document Embeddings [77.22328616983417]
We propose two complementary methods for contextualized document embeddings.
First, an alternative contrastive learning objective that explicitly incorporates the document neighbors into the intra-batch contextual loss.
Second, a new contextual architecture that explicitly encodes neighbor document information into the encoded representation.
arXiv Detail & Related papers (2024-10-03T14:33:34Z)
- SimCKP: Simple Contrastive Learning of Keyphrase Representations [36.88517357720033]
We propose SimCKP, a simple contrastive learning framework that consists of two stages: 1) An extractor-generator that extracts keyphrases by learning context-aware phrase-level representations in a contrastive manner while also generating keyphrases that do not appear in the document; and 2) A reranker that adapts scores for each generated phrase by likewise aligning their representations with the corresponding document.
arXiv Detail & Related papers (2023-10-12T11:11:54Z)
- Deep Keyphrase Completion [59.0413813332449]
Keyphrases provide accurate information about document content that is highly compact, concise, and full of meaning, and they are widely used for discourse comprehension, organization, and text retrieval.
We propose keyphrase completion (KPC) to generate more keyphrases for a document (e.g., a scientific publication), taking advantage of the document content along with a very limited number of known keyphrases.
We name it deep keyphrase completion (DKPC) since it attempts to capture the deep semantic meaning of the document content together with known keyphrases via a deep learning framework.
arXiv Detail & Related papers (2021-10-29T07:15:35Z)
- Towards Document-Level Paraphrase Generation with Sentence Rewriting and Reordering [88.08581016329398]
We propose CoRPG (Coherence Relationship guided Paraphrase Generation) for document-level paraphrase generation.
We use graph GRU to encode the coherence relationship graph and get the coherence-aware representation for each sentence.
Our model can generate document paraphrases with more diversity and semantic preservation.
arXiv Detail & Related papers (2021-09-15T05:53:40Z)
- Heterogeneous Graph Neural Networks for Keyphrase Generation [13.841525616800908]
We propose a novel graph-based method that can capture explicit knowledge from related references.
Our model first retrieves some document-keyphrase pairs similar to the source document from a pre-defined index as references.
To guide the decoding process, a hierarchical attention and copy mechanism is introduced, which directly copies appropriate words from both the source document and its references.
arXiv Detail & Related papers (2021-09-10T07:17:07Z)
- Keyphrase Extraction with Dynamic Graph Convolutional Networks and Diversified Inference [50.768682650658384]
Keyphrase extraction (KE) aims to summarize a set of phrases that accurately express a concept or a topic covered in a given document.
The recent Sequence-to-Sequence (Seq2Seq) based generative framework is widely used for the KE task and has obtained competitive performance on various benchmarks.
In this paper, we propose to adopt the Dynamic Graph Convolutional Networks (DGCN) to solve the above two problems simultaneously.
arXiv Detail & Related papers (2020-10-24T08:11:23Z)
- Select, Extract and Generate: Neural Keyphrase Generation with Layer-wise Coverage Attention [75.44523978180317]
We propose SEG-Net, a neural keyphrase generation model that is composed of two major components.
The experimental results on seven keyphrase generation benchmarks from scientific and web documents demonstrate that SEG-Net outperforms the state-of-the-art neural generative methods by a large margin.
arXiv Detail & Related papers (2020-08-04T18:00:07Z)
- Exclusive Hierarchical Decoding for Deep Keyphrase Generation [63.357895318562214]
Keyphrase generation (KG) aims to summarize the main ideas of a document into a set of keyphrases.
Previous work in this setting employs a sequential decoding process to generate keyphrases.
We propose an exclusive hierarchical decoding framework that includes a hierarchical decoding process and either a soft or a hard exclusion mechanism.
arXiv Detail & Related papers (2020-04-18T02:58:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.