Related papers: On the Representations of Entities in Auto-regressive Large Language Models

URL: http://arxiv.org/abs/2510.09421v1
Date: Fri, 10 Oct 2025 14:23:44 GMT
Title: On the Representations of Entities in Auto-regressive Large Language Models
Authors: Victor Morand, Josiane Mothe, Benjamin Piwowarski
Abstract summary: We introduce entity mention reconstruction as a novel framework for studying how Large Language Models (LLMs) encode and manipulate entities. We investigate whether entity mentions can be generated from internal representations, how multi-token entities are encoded beyond last-token embeddings, and whether these representations capture relational knowledge.
Score: 10.08529318994409
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Named entities are fundamental building blocks of knowledge in text, grounding factual information and structuring relationships within language. Despite their importance, it remains unclear how Large Language Models (LLMs) internally represent entities. Prior research has primarily examined explicit relationships, but little is known about entity representations themselves. We introduce entity mention reconstruction as a novel framework for studying how LLMs encode and manipulate entities. We investigate whether entity mentions can be generated from internal representations, how multi-token entities are encoded beyond last-token embeddings, and whether these representations capture relational knowledge. Our proposed method, leveraging _task vectors_, makes it possible to consistently generate multi-token mentions from various entity representations derived from the LLM's hidden states. We thus introduce the _Entity Lens_, extending the _logit-lens_ to predict multi-token mentions. Our results bring new evidence that LLMs develop entity-specific mechanisms to represent and manipulate any multi-token entity, including those unseen during training. Our code is available at https://github.com/VictorMorand/EntityRepresentations .

Related papers

Large Language Models are Good Relational Learners [55.40941576497973]
We introduce Rel-LLM, a novel architecture that utilizes a graph neural network (GNN)-based encoder to generate structured relational prompts for large language models (LLMs). Unlike traditional text-based serialization approaches, our method preserves the inherent relational structure of databases while enabling LLMs to process and reason over complex entity relationships.
arXiv Detail & Related papers (2025-06-06T04:07:55Z)
Web-Scale Visual Entity Recognition: An LLM-Driven Data Approach [56.55633052479446]
Web-scale visual entity recognition presents significant challenges due to the lack of clean, large-scale training data. We propose a novel methodology to curate such a dataset, leveraging a multimodal large language model (LLM) for label verification, metadata generation, and rationale explanation. Experiments demonstrate that models trained on this automatically curated data achieve state-of-the-art performance on web-scale visual entity recognition tasks.
arXiv Detail & Related papers (2024-10-31T06:55:24Z)
OneNet: A Fine-Tuning Free Framework for Few-Shot Entity Linking via Large Language Model Prompting [49.655711022673046]
OneNet is an innovative framework that utilizes the few-shot learning capabilities of Large Language Models (LLMs) without the need for fine-tuning. OneNet is structured around three key components prompted by LLMs: (1) an entity reduction processor that simplifies inputs by summarizing and filtering out irrelevant entities, (2) a dual-perspective entity linker that combines contextual cues and prior knowledge for precise entity linking, and (3) an entity consensus judger that employs a unique consistency algorithm to alleviate the hallucination in the entity linking reasoning.
arXiv Detail & Related papers (2024-10-10T02:45:23Z)
Two Heads Are Better Than One: Integrating Knowledge from Knowledge Graphs and Large Language Models for Entity Alignment [31.70064035432789]
We propose a Large Language Model-enhanced Entity Alignment framework (LLMEA). LLMEA identifies candidate alignments for a given entity by considering both embedding similarities between entities across Knowledge Graphs and edit distances to a virtual equivalent entity. Experiments conducted on three public datasets reveal that LLMEA surpasses leading baseline models.
arXiv Detail & Related papers (2024-01-30T12:41:04Z)
Dynamic Retrieval-Augmented Generation [4.741884506444161]
We propose a novel approach for Dynamic Retrieval-Augmented Generation (DRAG). DRAG injects compressed embeddings of the retrieved entities into the generative model. Our approach achieves several targets: (1) lifting the length limitations of the context window, saving on the prompt size; (2) allowing huge expansion of the number of retrieval entities available for the context; (3) alleviating the problem of misspelling or failing to find relevant entity names.
arXiv Detail & Related papers (2023-12-14T14:26:57Z)
Entity Cloze By Date: What LMs Know About Unseen Entities [79.34707800653597]
Language models (LMs) are typically trained once on a large-scale corpus and used for years without being updated. We propose a framework to analyze what LMs can infer about new entities that did not exist when the LMs were pretrained. We derive a dataset of entities indexed by their origination date and paired with their English Wikipedia articles, from which we can find sentences about each entity.
arXiv Detail & Related papers (2022-05-05T17:59:31Z)
LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention [37.111204321059084]
We propose new pretrained contextualized representations of words and entities based on the bidirectional transformer. Our model is trained using a new pretraining task based on the masked language model of BERT. We also propose an entity-aware self-attention mechanism that is an extension of the self-attention mechanism of the transformer.
arXiv Detail & Related papers (2020-10-02T15:38:03Z)
Autoregressive Entity Retrieval [55.38027440347138]
Entities are at the center of how we represent and aggregate knowledge. The ability to retrieve such entities given a query is fundamental for knowledge-intensive tasks such as entity linking and open-domain question answering. We propose GENRE, the first system that retrieves entities by generating their unique names, left to right, token-by-token in an autoregressive fashion.
arXiv Detail & Related papers (2020-10-02T10:13:31Z)
Probing Linguistic Features of Sentence-Level Representations in Neural Relation Extraction [80.38130122127882]
We introduce 14 probing tasks targeting linguistic properties relevant to neural relation extraction (RE). We use them to study representations learned by more than 40 different encoder architecture and linguistic feature combinations trained on two datasets. We find that the bias induced by the architecture and the inclusion of linguistic features are clearly expressed in the probing task performance.
arXiv Detail & Related papers (2020-04-17T09:17:40Z)

This list is automatically generated from the titles and abstracts of the papers on this site.