Evaluation of LLMs on Long-tail Entity Linking in Historical Documents
- URL: http://arxiv.org/abs/2505.03473v1
- Date: Tue, 06 May 2025 12:25:15 GMT
- Title: Evaluation of LLMs on Long-tail Entity Linking in Historical Documents
- Authors: Marta Boscariol, Luana Bulla, Lia Draetta, Beatrice Fiumanò, Emanuele Lenzi, Leonardo Piano,
- Abstract summary: We assess the performance of two popular LLMs, GPT and LLama3, in a long-tail entity linking scenario.<n>Using MHERCL v0.1, a manually annotated benchmark of sentences from domain-specific historical texts, we quantitatively compare the performance of LLMs in identifying and linking entities to their corresponding Wikidata entries.<n>Our preliminary experiments reveal that LLMs perform encouragingly well in long-tail EL, indicating that this technology can be a valuable adjunct in filling the gap between head and long-tail EL.
- Score: 1.9854418074386933
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Entity Linking (EL) plays a crucial role in Natural Language Processing (NLP) applications, enabling the disambiguation of entity mentions by linking them to their corresponding entries in a reference knowledge base (KB). Thanks to their deep contextual understanding capabilities, LLMs offer a new perspective to tackle EL, promising better results than traditional methods. Despite the impressive generalization capabilities of LLMs, linking less popular, long-tail entities remains challenging as these entities are often underrepresented in training data and knowledge bases. Furthermore, the long-tail EL task is an understudied problem, and limited studies address it with LLMs. In the present work, we assess the performance of two popular LLMs, GPT and LLama3, in a long-tail entity linking scenario. Using MHERCL v0.1, a manually annotated benchmark of sentences from domain-specific historical texts, we quantitatively compare the performance of LLMs in identifying and linking entities to their corresponding Wikidata entries against that of ReLiK, a state-of-the-art Entity Linking and Relation Extraction framework. Our preliminary experiments reveal that LLMs perform encouragingly well in long-tail EL, indicating that this technology can be a valuable adjunct in filling the gap between head and long-tail EL.
Related papers
- Harnessing Deep LLM Participation for Robust Entity Linking [14.079957943961276]
We introduce DeepEL, a comprehensive framework that incorporates Large Language Models (LLMs) into every stage of the entity linking task.<n>To address this limitation, we propose a novel self-validation mechanism that utilizes global contextual information.<n>Extensive empirical evaluation across ten benchmark datasets demonstrates that DeepEL substantially outperforms existing state-of-the-art methods.
arXiv Detail & Related papers (2025-11-18T06:35:26Z) - LLM-Specific Utility: A New Perspective for Retrieval-Augmented Generation [110.610512800947]
Retrieval-augmented generation (RAG) enhances large language models (LLMs) by incorporating external knowledge.<n>Existing studies often treat utility as a generic attribute, ignoring the fact that different LLMs may benefit differently from the same passage.
arXiv Detail & Related papers (2025-10-13T12:57:45Z) - Unleashing the Power of LLMs in Dense Retrieval with Query Likelihood Modeling [69.84963245729826]
We propose an auxiliary task of QL to enhance the backbone for subsequent contrastive learning of the retriever.<n>We introduce our model, which incorporates two key components: Attention Block (AB) and Document Corruption (DC)
arXiv Detail & Related papers (2025-04-07T16:03:59Z) - LLM-Lasso: A Robust Framework for Domain-Informed Feature Selection and Regularization [59.75242204923353]
We introduce LLM-Lasso, a framework that leverages large language models (LLMs) to guide feature selection in Lasso regression.<n>LLMs generate penalty factors for each feature, which are converted into weights for the Lasso penalty using a simple, tunable model.<n>Features identified as more relevant by the LLM receive lower penalties, increasing their likelihood of being retained in the final model.
arXiv Detail & Related papers (2025-02-15T02:55:22Z) - LLM-Forest: Ensemble Learning of LLMs with Graph-Augmented Prompts for Data Imputation [50.375567142250446]
Large language models (LLMs), trained on vast corpora, have shown strong potential in data generation.<n>We propose a novel framework, LLM-Forest, which introduces a "forest" of few-shot prompt learning LLM "trees" with their outputs aggregated via confidence-based weighted voting.<n>This framework is established on a new concept of bipartite information graphs to identify high-quality relevant neighboring entries with both feature and value granularity.
arXiv Detail & Related papers (2024-10-28T20:42:46Z) - LLM with Relation Classifier for Document-Level Relation Extraction [25.587850398830252]
Large language models (LLMs) have created a new paradigm for natural language processing.<n>This paper investigates the causes of this performance gap, identifying the dispersion of attention by LLMs due to entity pairs without relations as a key factor.
arXiv Detail & Related papers (2024-08-25T16:43:19Z) - Are LLMs Good Annotators for Discourse-level Event Relation Extraction? [15.365993658296016]
We assess the effectiveness of Large Language Models (LLMs) in addressing discourse-level event relation extraction tasks.<n> Evaluation is conducted using an commercial model, GPT-3.5, and an open-source model, LLaMA-2.
arXiv Detail & Related papers (2024-07-28T19:27:06Z) - Forecasting Credit Ratings: A Case Study where Traditional Methods Outperform Generative LLMs [17.109522466982476]
Large Language Models (LLMs) have been shown to perform well for many downstream tasks.<n>This paper investigates how well LLMs perform in the task of forecasting corporate credit ratings.
arXiv Detail & Related papers (2024-07-24T20:30:55Z) - LLMAEL: Large Language Models are Good Context Augmenters for Entity Linking [35.393279375085854]
Large language models (LLMs) are more robust at interpreting uncommon mentions.
We introduce LLM-Augmented Entity Linking LLMAEL, a plug-and-play approach to enhance entity linking.
Experiments on 6 standard datasets show that the vanilla LLMAEL outperforms baseline EL models in most cases.
arXiv Detail & Related papers (2024-07-04T15:55:13Z) - Is In-Context Learning Sufficient for Instruction Following in LLMs? [38.29072578390376]
We show that, while effective, ICL alignment withAL still underperforms compared to instruction fine-tuning on the established benchmark MT-Bench.<n>We provide the first, to our knowledge, systematic comparison of ICL and instruction fine-tuning (IFT) for instruction following in the low data regime.
arXiv Detail & Related papers (2024-05-30T09:28:56Z) - Prompting Large Language Models with Knowledge Graphs for Question Answering Involving Long-tail Facts [50.06633829833144]
Large Language Models (LLMs) are effective in performing various NLP tasks, but struggle to handle tasks that require extensive, real-world knowledge.
We propose a benchmark that requires knowledge of long-tail facts for answering the involved questions.
Our experiments show that LLMs alone struggle with answering these questions, especially when the long-tail level is high or rich knowledge is required.
arXiv Detail & Related papers (2024-05-10T15:10:20Z) - DyKnow: Dynamically Verifying Time-Sensitive Factual Knowledge in LLMs [1.7764955091415962]
We present an approach to dynamically evaluate the knowledge in LLMs and their time-sensitiveness against Wikidata.
We evaluate the time-sensitive knowledge in twenty-four private and open-source LLMs, as well as the effectiveness of four editing methods in updating the outdated facts.
Our results show that 1) outdatedness is a critical problem across state-of-the-art LLMs; 2) LLMs output inconsistent answers when prompted with slight variations of the question prompt; and 3) the performance of the state-of-the-art knowledge editing algorithms is very limited.
arXiv Detail & Related papers (2024-04-10T18:08:59Z) - Found in the Middle: How Language Models Use Long Contexts Better via
Plug-and-Play Positional Encoding [78.36702055076456]
This paper introduces Multi-scale Positional.
(Ms-PoE) which is a simple yet effective plug-and-play approach to enhance the capacity of.
LLMs to handle relevant information located in the middle of the context.
arXiv Detail & Related papers (2024-03-05T04:58:37Z) - Rethinking Interpretability in the Era of Large Language Models [76.1947554386879]
Large language models (LLMs) have demonstrated remarkable capabilities across a wide array of tasks.
The capability to explain in natural language allows LLMs to expand the scale and complexity of patterns that can be given to a human.
These new capabilities raise new challenges, such as hallucinated explanations and immense computational costs.
arXiv Detail & Related papers (2024-01-30T17:38:54Z) - Supervised Knowledge Makes Large Language Models Better In-context Learners [94.89301696512776]
Large Language Models (LLMs) exhibit emerging in-context learning abilities through prompt engineering.
The challenge of improving the generalizability and factuality of LLMs in natural language understanding and question answering remains under-explored.
We propose a framework that enhances the reliability of LLMs as it: 1) generalizes out-of-distribution data, 2) elucidates how LLMs benefit from discriminative models, and 3) minimizes hallucinations in generative tasks.
arXiv Detail & Related papers (2023-12-26T07:24:46Z) - In Search of the Long-Tail: Systematic Generation of Long-Tail Inferential Knowledge via Logical Rule Guided Search [67.35240346713911]
We take the first step towards evaluating large language models (LLMs) in the long-tail distribution of inferential knowledge.
Link is a systematic long-tail data generation framework, to obtain factually-correct yet long-tail inferential statements.
We then use LINK to curate Logic-Induced-Long-Tail (LINT), a large-scale long-tail inferential knowledge dataset.
arXiv Detail & Related papers (2023-11-13T10:56:59Z) - LooGLE: Can Long-Context Language Models Understand Long Contexts? [46.143956498529796]
LooGLE is a benchmark for large language models' long context understanding.
It features relatively new documents post-2022, with over 24,000 tokens per document and 6,000 newly generated questions spanning diverse domains.
The evaluation of eight state-of-the-art LLMs on LooGLE revealed key findings.
arXiv Detail & Related papers (2023-11-08T01:45:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.