Wiki Entity Summarization Benchmark
- URL: http://arxiv.org/abs/2406.08435v1
- Date: Wed, 12 Jun 2024 17:22:00 GMT
- Title: Wiki Entity Summarization Benchmark
- Authors: Saeedeh Javadi, Atefeh Moradan, Mohammad Sorkhpar, Klim Zaporojets, Davide Mottin, Ira Assent
- Abstract summary: Entity summarization aims to compute concise summaries for entities in knowledge graphs.
Existing datasets and benchmarks are often limited to a few hundred entities.
We propose WikES, a comprehensive benchmark comprising entities, their summaries, and their connections.
- Score: 9.25319552487389
- Abstract: Entity summarization aims to compute concise summaries for entities in knowledge graphs. Existing datasets and benchmarks are often limited to a few hundred entities and discard graph structure in source knowledge graphs. This limitation is particularly pronounced when it comes to ground-truth summaries, where there exist only a few labeled summaries for evaluation and training. We propose WikES, a comprehensive benchmark comprising entities, their summaries, and their connections. Additionally, WikES features a dataset generator to test entity summarization algorithms in different areas of the knowledge graph. Importantly, our approach combines graph algorithms and NLP models as well as different data sources such that WikES does not require human annotation, rendering the approach cost-effective and generalizable to multiple domains. Finally, WikES is scalable and capable of capturing the complexities of knowledge graphs in terms of topology and semantics. WikES features existing datasets for comparison. Empirical studies of entity summarization methods confirm the usefulness of our benchmark. Data, code, and models are available at: https://github.com/msorkhpar/wiki-entity-summarization.
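To make the task in the abstract concrete, here is a minimal sketch of entity summarization framed as picking the top-k triples for an entity and scoring them against a ground-truth summary with F1. It is not the WikES pipeline or any method from the paper: the rare-predicate ranking heuristic, the function names `summarize_entity` and `f1`, and the toy graph are all assumptions made for illustration.

```python
from collections import Counter

# A knowledge-graph fact as (subject, predicate, object).
Triple = tuple[str, str, str]


def summarize_entity(entity: str, triples: list[Triple], k: int = 2) -> list[Triple]:
    """Return up to k triples that mention `entity`, preferring rare predicates.

    Hypothetical heuristic for illustration only: predicates that occur less
    often in the whole graph are treated as more informative.
    """
    candidates = [t for t in triples if entity in (t[0], t[2])]
    predicate_freq = Counter(p for _, p, _ in triples)
    # Rarer predicates first; ties are broken by the triple itself so the
    # ranking is deterministic.
    ranked = sorted(candidates, key=lambda t: (predicate_freq[t[1]], t))
    return ranked[:k]


def f1(predicted: list[Triple], gold: list[Triple]) -> float:
    """F1 overlap between a predicted summary and a ground-truth summary."""
    predicted_set, gold_set = set(predicted), set(gold)
    overlap = len(predicted_set & gold_set)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted_set)
    recall = overlap / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Toy data, invented for this example.
TOY_GRAPH: list[Triple] = [
    ("Ada_Lovelace", "instance_of", "human"),
    ("Ada_Lovelace", "occupation", "mathematician"),
    ("Ada_Lovelace", "notable_work", "Note_G"),
    ("Charles_Babbage", "instance_of", "human"),
    ("Charles_Babbage", "occupation", "engineer"),
    ("Note_G", "instance_of", "document"),
]
GOLD: list[Triple] = [
    ("Ada_Lovelace", "notable_work", "Note_G"),
    ("Ada_Lovelace", "occupation", "mathematician"),
]

summary = summarize_entity("Ada_Lovelace", TOY_GRAPH, k=2)
score = f1(summary, GOLD)
print(summary, score)
```

A benchmark of the kind described above would supply the graph and the ground-truth summaries; only the selection step would vary between the methods being compared.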
Related papers
- Numerical Literals in Link Prediction: A Critical Examination of Models and Datasets [2.5999037208435705]
Link Prediction models that incorporate numerical literals have shown minor improvements on existing benchmark datasets.
It is unclear whether a model is actually better in using numerical literals, or better capable of utilizing the graph structure.
We propose a methodology to evaluate LP models that incorporate numerical literals.
arXiv Detail & Related papers (2024-07-25T17:55:33Z)
- TAGLAS: An atlas of text-attributed graph datasets in the era of large graph and language models [25.16561980988102]
TAGLAS is an atlas of text-attributed graph (TAG) datasets and benchmarks.
We collect and integrate more than 23 TAG datasets with domains ranging from citation graphs to molecule graphs.
We provide a standardized, efficient, and simplified way to load all datasets and tasks.
arXiv Detail & Related papers (2024-06-20T19:11:35Z)
- Learnable Graph Matching: A Practical Paradigm for Data Association [74.28753343714858]
We propose a general learnable graph matching method to address these issues.
Our method achieves state-of-the-art performance on several MOT datasets.
For image matching, our method outperforms state-of-the-art methods on a popular indoor dataset, ScanNet.
arXiv Detail & Related papers (2023-03-27T17:39:00Z)
- Modeling Entities as Semantic Points for Visual Information Extraction in the Wild [55.91783742370978]
We propose an alternative approach to precisely and robustly extract key information from document images.
We explicitly model entities as semantic points, i.e., center points of entities are enriched with semantic information describing the attributes and relationships of different entities.
The proposed method can achieve significantly enhanced performance on entity labeling and linking, compared with previous state-of-the-art models.
arXiv Detail & Related papers (2023-03-23T08:21:16Z)
- Modeling Fine-grained Information via Knowledge-aware Hierarchical Graph for Zero-shot Entity Retrieval [11.533614615010643]
We propose GER to capture more fine-grained information as complementary to sentence embeddings.
We learn the fine-grained information about mention/entity by aggregating information from these knowledge units.
Experimental results on popular benchmarks demonstrate that our proposed GER framework performs better than previous state-of-the-art models.
arXiv Detail & Related papers (2022-11-20T14:37:53Z)
- StarGraph: A Coarse-to-Fine Representation Method for Large-Scale Knowledge Graph [0.6445605125467573]
We propose a method named StarGraph, which gives a novel way to utilize the neighborhood information for large-scale knowledge graphs.
The proposed method achieves the best results on the ogbl-wikikg2 dataset, which validates its effectiveness.
arXiv Detail & Related papers (2022-05-27T19:32:45Z)
- FactGraph: Evaluating Factuality in Summarization with Semantic Graph Representations [114.94628499698096]
We propose FactGraph, a method that decomposes the document and the summary into structured meaning representations (MRs).
MRs describe core semantic concepts and their relations, aggregating the main content in both document and summary in a canonical form, and reducing data sparsity.
Experiments on different benchmarks for evaluating factuality show that FactGraph outperforms previous approaches by up to 15%.
arXiv Detail & Related papers (2022-04-13T16:45:33Z)
- Node Feature Extraction by Self-Supervised Multi-scale Neighborhood Prediction [123.20238648121445]
We propose a new self-supervised learning framework, Graph Information Aided Node feature exTraction (GIANT).
GIANT makes use of the eXtreme Multi-label Classification (XMC) formalism, which is crucial for fine-tuning the language model based on graph information.
We demonstrate the superior performance of GIANT over the standard GNN pipeline on Open Graph Benchmark datasets.
arXiv Detail & Related papers (2021-10-29T19:55:12Z)
- Autoregressive Entity Retrieval [55.38027440347138]
Entities are at the center of how we represent and aggregate knowledge.
The ability to retrieve such entities given a query is fundamental for knowledge-intensive tasks such as entity linking and open-domain question answering.
We propose GENRE, the first system that retrieves entities by generating their unique names, left to right, token-by-token in an autoregressive fashion.
arXiv Detail & Related papers (2020-10-02T10:13:31Z)
- Neural Entity Summarization with Joint Encoding and Weak Supervision [29.26714907483851]
In knowledge graphs, an entity is often described by a large number of triple facts.
Existing solutions to entity summarization are mainly unsupervised.
We present a supervised approach that is based on our novel neural model.
arXiv Detail & Related papers (2020-05-01T00:14:08Z)
- ENT-DESC: Entity Description Generation by Exploring Knowledge Graph [53.03778194567752]
In practice, the input knowledge could be more than enough, since the output description may only cover the most significant knowledge.
We introduce a large-scale and challenging dataset to facilitate the study of such a practical scenario in KG-to-text.
We propose a multi-graph structure that is able to represent the original graph information more comprehensively.
arXiv Detail & Related papers (2020-04-30T14:16:19Z)