Neural Entity Summarization with Joint Encoding and Weak Supervision
- URL: http://arxiv.org/abs/2005.00152v2
- Date: Sun, 10 May 2020 08:30:29 GMT
- Title: Neural Entity Summarization with Joint Encoding and Weak Supervision
- Authors: Junyou Li, Gong Cheng, Qingxia Liu, Wen Zhang, Evgeny Kharlamov, Kalpa Gunaratna, Huajun Chen
- Abstract summary: In knowledge graphs, an entity is often described by a large number of triple facts.
Existing solutions to entity summarization are mainly unsupervised.
We present a supervised approach that is based on our novel neural model.
- Score: 29.26714907483851
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In a large-scale knowledge graph (KG), an entity is often described by a
large number of triple-structured facts. Many applications require abridged
versions of entity descriptions, called entity summaries. Existing solutions to
entity summarization are mainly unsupervised. In this paper, we present a
supervised approach NEST that is based on our novel neural model to jointly
encode graph structure and text in KGs and generate high-quality diversified
summaries. Since it is costly to obtain manually labeled summaries for
training, our supervision is weak as we train with programmatically labeled
data which may contain noise but is free of manual work. Evaluation results
show that our approach significantly outperforms the state of the art on two
public benchmarks.
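The task the abstract describes can be pictured as selecting a small, informative subset of an entity's triples. The sketch below is a toy illustration of that setup, not the NEST model: the heuristic scorer (rarer predicates are treated as more informative) is a hypothetical stand-in for the neural joint encoder that NEST trains with weakly (programmatically) labeled data.

```python
# Minimal sketch of the entity-summarization task setup (NOT the NEST model):
# given an entity's (predicate, object) facts, score each one and keep the
# top-k as the summary. The scorer here is a hypothetical heuristic; NEST
# instead learns a neural scorer that jointly encodes graph structure and text.

from collections import Counter

def summarize(triples, k=2):
    """Rank an entity's (predicate, object) facts and return the top-k.

    Toy stand-in for a learned scorer: predicates that occur less often
    in this description are assumed to carry more information.
    """
    pred_freq = Counter(p for p, _ in triples)
    # Lower predicate frequency -> ranked earlier (hypothetical heuristic);
    # ties are broken deterministically by predicate and object strings.
    scored = sorted(triples, key=lambda t: (pred_freq[t[0]], t[0], t[1]))
    return scored[:k]

facts = [
    ("type", "Person"),
    ("type", "Scientist"),
    ("birthPlace", "Ulm"),
    ("knownFor", "General relativity"),
]
summary = summarize(facts, k=2)
print(summary)
```

A real system replaces the frequency heuristic with a trained model and evaluates the selected subset against benchmark gold summaries.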
Related papers
- GLIMMER: Incorporating Graph and Lexical Features in Unsupervised Multi-Document Summarization [13.61818620609812]
We propose a lightweight yet effective unsupervised approach called GLIMMER: a Graph and LexIcal features based unsupervised Multi-docuMEnt summaRization approach.
It first constructs a sentence graph from the source documents, then automatically identifies semantic clusters by mining low-level features from raw texts.
Experiments conducted on Multi-News, Multi-XScience and DUC-2004 demonstrate that our approach outperforms existing unsupervised approaches.
arXiv Detail & Related papers (2024-08-19T16:01:48Z)
- GUMsley: Evaluating Entity Salience in Summarization for 12 English Genres [14.37990666928991]
We present and evaluate GUMsley, the first entity salience dataset covering all named and non-named salient entities for 12 genres of English text.
We show that predicting or providing salient entities to several model architectures enhances performance and helps derive higher-quality summaries.
arXiv Detail & Related papers (2024-01-31T16:30:50Z)
- Modeling Entities as Semantic Points for Visual Information Extraction in the Wild [55.91783742370978]
We propose an alternative approach to precisely and robustly extract key information from document images.
We explicitly model entities as semantic points, i.e., center points of entities are enriched with semantic information describing the attributes and relationships of different entities.
The proposed method can achieve significantly enhanced performance on entity labeling and linking, compared with previous state-of-the-art models.
arXiv Detail & Related papers (2023-03-23T08:21:16Z)
- Scientific Paper Extractive Summarization Enhanced by Citation Graphs [50.19266650000948]
We focus on leveraging citation graphs to improve scientific paper extractive summarization under different settings.
Preliminary results demonstrate that citation graph is helpful even in a simple unsupervised framework.
Motivated by this, we propose a Graph-based Supervised Summarization model (GSS) to achieve more accurate results on the task when large-scale labeled data are available.
arXiv Detail & Related papers (2022-12-08T11:53:12Z)
- Long Document Summarization with Top-down and Bottom-up Inference [113.29319668246407]
We propose a principled inference framework to improve summarization models on two aspects.
Our framework assumes a hierarchical latent structure of a document where the top-level captures the long range dependency.
We demonstrate the effectiveness of the proposed framework on a diverse set of summarization datasets.
arXiv Detail & Related papers (2022-03-15T01:24:51Z)
- Unsupervised Summarization with Customized Granularities [76.26899748972423]
We propose the first unsupervised multi-granularity summarization framework, GranuSum.
By inputting different numbers of events, GranuSum is capable of producing multi-granular summaries in an unsupervised manner.
arXiv Detail & Related papers (2022-01-29T05:56:35Z)
- Knowledge-Rich Self-Supervised Entity Linking [58.838404666183656]
Knowledge-RIch Self-Supervision (KRISSBERT) is a universal entity linker for four million UMLS entities, produced without using any labeled information.
Our approach subsumes zero-shot and few-shot methods, and can easily incorporate entity descriptions and gold mention labels if available.
arXiv Detail & Related papers (2021-12-15T05:05:12Z)
- Knowledge Graph-Augmented Abstractive Summarization with Semantic-Driven Cloze Reward [42.925345819778656]
We present ASGARD, a novel framework for Abstractive Summarization with Graph-Augmentation and semantic-driven RewarD.
We propose the use of dual encoders (a sequential document encoder and a graph-structured encoder) to maintain the global context and local characteristics of entities.
Results show that our models produce significantly higher ROUGE scores than a variant without knowledge graph as input on both New York Times and CNN/Daily Mail datasets.
arXiv Detail & Related papers (2020-05-03T18:23:06Z)
- ENT-DESC: Entity Description Generation by Exploring Knowledge Graph [53.03778194567752]
In practice, the input knowledge could be more than enough, since the output description may only cover the most significant knowledge.
We introduce a large-scale and challenging dataset to facilitate the study of such a practical scenario in KG-to-text.
We propose a multi-graph structure that is able to represent the original graph information more comprehensively.
arXiv Detail & Related papers (2020-04-30T14:16:19Z)
- StructSum: Summarization via Structured Representations [27.890477913486787]
Abstractive text summarization aims at compressing the information of a long source document into a condensed summary.
Despite advances in modeling techniques, abstractive summarization models still suffer from several key challenges.
We propose a framework based on document-level structure induction for summarization to address these challenges.
arXiv Detail & Related papers (2020-03-01T20:32:51Z)
- TED: A Pretrained Unsupervised Summarization Model with Theme Modeling and Denoising [44.384730968526156]
We propose a transformer-based unsupervised abstractive summarization system with pretraining on large-scale data.
We first leverage the lead bias in news articles to pretrain the model on millions of unlabeled corpora.
We finetune TED on target domains through theme modeling and a denoising autoencoder to enhance the quality of generated summaries.
arXiv Detail & Related papers (2020-01-03T05:15:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.