SenSeNet: Neural Keyphrase Generation with Document Structure
- URL: http://arxiv.org/abs/2012.06754v1
- Date: Sat, 12 Dec 2020 08:21:08 GMT
- Title: SenSeNet: Neural Keyphrase Generation with Document Structure
- Authors: Yichao Luo, Zhengyan Li, Bingning Wang, Xiaoyu Xing, Qi Zhang,
Xuanjing Huang
- Abstract summary: We propose a new method called Sentence Selective Network (SenSeNet) to incorporate the meta-sentence inductive bias into Keyphrase Generation (KG).
SenSeNet can consistently improve the performance of major KG models based on the seq2seq framework.
- Score: 42.641790028836795
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Keyphrase Generation (KG) is the task of generating central topics from a
given document or literary work, which captures the crucial information
necessary to understand the content. Documents such as scientific literature
contain rich meta-sentence information, which represents the logical-semantic
structure of the documents. However, previous approaches ignore the constraints
of document logical structure, and hence they mistakenly generate keyphrases
from unimportant sentences. To address this problem, we propose a new method
called Sentence Selective Network (SenSeNet) to incorporate the meta-sentence
inductive bias into KG. In SenSeNet, we use a straight-through estimator for
end-to-end training and incorporate weak supervision in the training of the
sentence selection module. Experimental results show that SenSeNet can
consistently improve the performance of major KG models based on the seq2seq
framework, which demonstrates the effectiveness of capturing structural
information and distinguishing the significance of sentences in the KG task.
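The straight-through estimator mentioned in the abstract can be pictured with a minimal NumPy sketch (the scores, threshold, and function names below are illustrative, not the authors' implementation): the forward pass makes a hard 0/1 sentence selection, while the backward pass treats the thresholding as the identity so gradients can still reach the sentence scorer.

```python
import numpy as np

def ste_select(scores, threshold=0.5):
    """Forward pass: hard binary selection of sentences."""
    return (scores > threshold).astype(np.float64)

def ste_grad(upstream_grad):
    """Backward pass: the straight-through trick passes the upstream
    gradient through the non-differentiable threshold unchanged."""
    return upstream_grad

# Sentence-level probabilities from a hypothetical scorer.
scores = np.array([0.9, 0.2, 0.7, 0.4])
mask = ste_select(scores)                  # [1., 0., 1., 0.]
grad_wrt_mask = np.array([0.5, -0.1, 0.3, 0.2])
grad_wrt_scores = ste_grad(grad_wrt_mask)  # identical: gradients flow through
```

Because the hard decision is used at inference time but gradients behave as if selection were continuous, the selector can be trained end-to-end with the generator.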
Related papers
- SimCKP: Simple Contrastive Learning of Keyphrase Representations [36.88517357720033]
We propose SimCKP, a simple contrastive learning framework that consists of two stages: 1) An extractor-generator that extracts keyphrases by learning context-aware phrase-level representations in a contrastive manner while also generating keyphrases that do not appear in the document; and 2) A reranker that adapts scores for each generated phrase by likewise aligning their representations with the corresponding document.
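The second-stage reranker described above can be thought of as scoring each candidate phrase by how well its representation aligns with the document representation. The sketch below uses cosine similarity over made-up vectors; the names and dimensions are illustrative assumptions, not SimCKP's actual code.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def rerank(phrase_vecs, doc_vec):
    """Return candidate indices ordered best-first by similarity
    to the document vector."""
    scores = [cosine(p, doc_vec) for p in phrase_vecs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

doc = np.array([1.0, 0.0, 1.0])
phrases = [np.array([0.0, 1.0, 0.0]),   # orthogonal to the document
           np.array([1.0, 0.0, 1.0]),   # same direction as the document
           np.array([1.0, 1.0, 0.0])]   # partial overlap
order = rerank(phrases, doc)            # best-first: [1, 2, 0]
```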
arXiv Detail & Related papers (2023-10-12T11:11:54Z)
- Large Language Model Prompt Chaining for Long Legal Document Classification [2.3148470932285665]
Chaining is a strategy used to decompose complex tasks into smaller, manageable components.
We demonstrate that through prompt chaining, we can not only enhance the performance over zero-shot, but also surpass the micro-F1 score achieved by larger models.
arXiv Detail & Related papers (2023-08-08T08:57:01Z)
- HanoiT: Enhancing Context-aware Translation via Selective Context [95.93730812799798]
Context-aware neural machine translation aims to use the document-level context to improve translation quality.
The irrelevant or trivial words may bring some noise and distract the model from learning the relationship between the current sentence and the auxiliary context.
We propose a novel end-to-end encoder-decoder model with a layer-wise selection mechanism to sift and refine the long document context.
arXiv Detail & Related papers (2023-01-17T12:07:13Z)
- ConTextual Mask Auto-Encoder for Dense Passage Retrieval [49.49460769701308]
CoT-MAE is a simple yet effective generative pre-training method for dense passage retrieval.
It learns to compress the sentence semantics into a dense vector through self-supervised and context-supervised masked auto-encoding.
We conduct experiments on large-scale passage retrieval benchmarks and show considerable improvements over strong baselines.
arXiv Detail & Related papers (2022-08-16T11:17:22Z)
- Syntax Controlled Knowledge Graph-to-Text Generation with Order and Semantic Consistency [10.7334441041015]
Knowledge graph-to-text (KG-to-text) generation aims to generate easy-to-understand sentences from the knowledge graph.
In this paper, we optimize the knowledge description order prediction under the order supervision extracted from the caption.
We incorporate the Part-of-Speech (POS) syntactic tags to constrain the positions to copy words from the KG.
arXiv Detail & Related papers (2022-07-02T02:42:14Z)
- Deep Keyphrase Completion [59.0413813332449]
Keyphrases provide accurate, highly compact, and concise information about document content, and are widely used for discourse comprehension, organization, and text retrieval.
We propose keyphrase completion (KPC) to generate more keyphrases for a document (e.g., a scientific publication), taking advantage of the document content along with a very limited number of known keyphrases.
We name it deep keyphrase completion (DKPC) since it attempts to capture the deep semantic meaning of the document content together with known keyphrases via a deep learning framework.
arXiv Detail & Related papers (2021-10-29T07:15:35Z)
- Keyphrase Extraction with Dynamic Graph Convolutional Networks and Diversified Inference [50.768682650658384]
Keyphrase extraction (KE) aims to summarize a set of phrases that accurately express a concept or a topic covered in a given document.
The recent Sequence-to-Sequence (Seq2Seq) based generative framework is widely used in the KE task and has obtained competitive performance on various benchmarks.
In this paper, we propose to adopt the Dynamic Graph Convolutional Networks (DGCN) to solve the above two problems simultaneously.
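The dynamic graph convolutions mentioned here build on the standard GCN layer, H' = ReLU(Â H W) with a normalized adjacency Â. The toy graph, features, and weights below are invented for illustration and are not the paper's model.

```python
import numpy as np

def gcn_layer(adj, feats, weight):
    """One graph-convolution layer: H' = ReLU(A_hat @ H @ W).

    adj:    adjacency matrix with self-loops already added.
    feats:  node features H (nodes x in_dim).
    weight: learnable projection W (in_dim x out_dim).
    A_hat is the symmetrically normalized adjacency D^-1/2 A D^-1/2.
    """
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    a_hat = d_inv_sqrt @ adj @ d_inv_sqrt
    return np.maximum(0.0, a_hat @ feats @ weight)

# Toy word graph: 3 nodes in a chain, self-loops included.
adj = np.array([[1.0, 1.0, 0.0],
                [1.0, 1.0, 1.0],
                [0.0, 1.0, 1.0]])
feats = np.eye(3)                 # one-hot node features
weight = np.ones((3, 2)) * 0.5    # illustrative weights
out = gcn_layer(adj, feats, weight)  # shape (3, 2)
```

Each node's new representation mixes its own features with those of its neighbors, which is how graph structure enters the phrase representations.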
arXiv Detail & Related papers (2020-10-24T08:11:23Z)
- Select, Extract and Generate: Neural Keyphrase Generation with Layer-wise Coverage Attention [75.44523978180317]
We propose SEG-Net, a neural keyphrase generation model that is composed of two major components.
The experimental results on seven keyphrase generation benchmarks from scientific and web documents demonstrate that SEG-Net outperforms the state-of-the-art neural generative methods by a large margin.
arXiv Detail & Related papers (2020-08-04T18:00:07Z)
- Keyphrase Generation with Cross-Document Attention [28.565813544820553]
Keyphrase generation aims to produce a set of phrases summarizing the essentials of a given document.
We propose CDKGen, a Transformer-based keyphrase generator, which expands the Transformer to global attention.
We also adopt a copy mechanism to enhance the model by selecting appropriate words from documents, handling out-of-vocabulary words in keyphrases.
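A copy mechanism of the kind mentioned here interpolates between a vocabulary distribution and a pointer distribution over the source words. The toy sketch below (invented vocabulary, gate value, and probabilities, not CDKGen's code) shows how an out-of-vocabulary source word can still receive probability mass.

```python
import numpy as np

def copy_mixture(p_vocab, p_copy, src_tokens, vocab, p_gen):
    """Combine generation and copy distributions over an extended vocabulary.

    p_vocab: distribution over the fixed vocabulary.
    p_copy:  attention distribution over source positions.
    p_gen:   scalar gate in [0, 1] choosing generate vs. copy.
    Returns a dict mapping each candidate word to its final probability.
    """
    final = {w: p_gen * p for w, p in zip(vocab, p_vocab)}
    for tok, p in zip(src_tokens, p_copy):
        final[tok] = final.get(tok, 0.0) + (1.0 - p_gen) * p
    return final

vocab = ["the", "model", "<unk>"]
p_vocab = np.array([0.5, 0.4, 0.1])
src = ["seq2seq", "model"]          # "seq2seq" is out-of-vocabulary
p_copy = np.array([0.7, 0.3])
dist = copy_mixture(p_vocab, p_copy, src, vocab, p_gen=0.8)
# "seq2seq" gets (1 - 0.8) * 0.7 = 0.14 despite not being in the vocabulary
```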
arXiv Detail & Related papers (2020-04-21T07:58:27Z)
- Selective Attention Encoders by Syntactic Graph Convolutional Networks for Document Summarization [21.351111598564987]
We propose a graph to connect the parsing trees from the sentences in a document and utilize the stacked graph convolutional networks (GCNs) to learn the syntactic representation for a document.
The proposed GCNs based selective attention approach outperforms the baselines and achieves the state-of-the-art performance on the dataset.
arXiv Detail & Related papers (2020-03-18T01:30:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences arising from its use.