Related papers: Real World Conversational Entity Linking Requires More Than Zeroshots

URL: http://arxiv.org/abs/2409.01152v1
Date: Mon, 2 Sep 2024 10:37:53 GMT
Title: Real World Conversational Entity Linking Requires More Than Zeroshots
Authors: Mohanna Hoveyda, Arjen P. de Vries, Maarten de Rijke, Faegheh Hasibi
Abstract summary: We design targeted evaluation scenarios to measure the efficacy of EL models under resource constraints. We assess EL models' ability to generalize to a new unfamiliar KB using Fandom and a novel zero-shot conversational entity linking dataset. Results indicate that current zero-shot EL models falter when introduced to new, domain-specific KBs without prior training.
Score: 50.5691094768954
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Entity linking (EL) in conversations faces notable challenges in practical applications, primarily due to the scarcity of entity-annotated conversational datasets and sparse knowledge bases (KB) containing domain-specific, long-tail entities. We designed targeted evaluation scenarios to measure the efficacy of EL models under resource constraints. Our evaluation employs two KBs: Fandom, exemplifying real-world EL complexities, and the widely used Wikipedia. First, we assess EL models' ability to generalize to a new unfamiliar KB using Fandom and a novel zero-shot conversational entity linking dataset that we curated based on Reddit discussions on Fandom entities. We then evaluate the adaptability of EL models to conversational settings without prior training. Our results indicate that current zero-shot EL models falter when introduced to new, domain-specific KBs without prior training, significantly dropping in performance. Our findings reveal that previous evaluation approaches fall short of capturing real-world complexities for zero-shot EL, highlighting the necessity for new approaches to design and assess conversational EL models to adapt to limited resources. The evaluation setup and the dataset proposed in this research are made publicly available.

Related papers

Numerical Literals in Link Prediction: A Critical Examination of Models and Datasets [2.5999037208435705]
Link Prediction models that incorporate numerical literals have shown minor improvements on existing benchmark datasets. It is unclear whether a model is actually better in using numerical literals, or better capable of utilizing the graph structure. We propose a methodology to evaluate LP models that incorporate numerical literals.
arXiv Detail & Related papers (2024-07-25T17:55:33Z)
ELCoRec: Enhance Language Understanding with Co-Propagation of Numerical and Categorical Features for Recommendation [38.64175351885443]
Large language models have been flourishing in the natural language processing (NLP) domain. Despite the intelligence shown by the recommendation-oriented finetuned models, LLMs struggle to fully understand the user behavior patterns. Existing works only fine-tune a sole LLM on given text data without introducing that important information to it.
arXiv Detail & Related papers (2024-06-27T01:37:57Z)
Long-Span Question-Answering: Automatic Question Generation and QA-System Ranking via Side-by-Side Evaluation [65.16137964758612]
We explore the use of long-context capabilities in large language models to create synthetic reading comprehension data from entire books. Our objective is to test the capabilities of LLMs to analyze, understand, and reason over problems that require a detailed comprehension of long spans of text.
arXiv Detail & Related papers (2024-05-31T20:15:10Z)
VALOR-EVAL: Holistic Coverage and Faithfulness Evaluation of Large Vision-Language Models [57.43276586087863]
Large Vision-Language Models (LVLMs) suffer from hallucination issues, wherein the models generate plausible-sounding but factually incorrect outputs. Existing benchmarks are often limited in scope, focusing mainly on object hallucinations. We introduce a multi-dimensional benchmark covering objects, attributes, and relations, with challenging images selected based on associative biases.
arXiv Detail & Related papers (2024-04-22T04:49:22Z)
ACLSum: A New Dataset for Aspect-based Summarization of Scientific Publications [10.529898520273063]
ACLSum is a novel summarization dataset carefully crafted and evaluated by domain experts. In contrast to previous datasets, ACLSum facilitates multi-aspect summarization of scientific papers.
arXiv Detail & Related papers (2024-03-08T13:32:01Z)
Benchmarking Commonsense Knowledge Base Population with an Effective Evaluation Dataset [37.02104430195374]
Reasoning over commonsense knowledge bases (CSKB) whose elements are in the form of free-text is an important yet hard task in NLP. We benchmark the CSKB population task with a new large-scale dataset. We also propose a novel inductive commonsense reasoning model that reasons over graphs.
arXiv Detail & Related papers (2021-09-16T02:50:01Z)
Probabilistic Case-based Reasoning for Open-World Knowledge Graph Completion [59.549664231655726]
A case-based reasoning (CBR) system solves a new problem by retrieving cases that are similar to the given problem. In this paper, we demonstrate that such a system is achievable for reasoning in knowledge bases (KBs). Our approach predicts attributes for an entity by gathering reasoning paths from similar entities in the KB.
arXiv Detail & Related papers (2020-10-07T17:48:12Z)
CorDEL: A Contrastive Deep Learning Approach for Entity Linkage [70.82533554253335]
Entity linkage (EL) is a critical problem in data cleaning and integration. With the ever-increasing growth of new data, deep learning (DL) based approaches have been proposed to alleviate the high cost of EL associated with the traditional models. We argue that the twin-network architecture is sub-optimal to EL, leading to inherent drawbacks of existing models.
arXiv Detail & Related papers (2020-09-15T16:33:05Z)
Novel Human-Object Interaction Detection via Adversarial Domain Generalization [103.55143362926388]
We study the problem of novel human-object interaction (HOI) detection, aiming at improving the generalization ability of the model to unseen scenarios. The challenge mainly stems from the large compositional space of objects and predicates, which leads to the lack of sufficient training data for all the object-predicate combinations. We propose a unified framework of adversarial domain generalization to learn object-invariant features for predicate prediction.
arXiv Detail & Related papers (2020-05-22T22:02:56Z)

This list is automatically generated from the titles and abstracts of the papers on this site.