Exploiting Contextual Knowledge in LLMs through V-usable Information based Layer Enhancement
- URL: http://arxiv.org/abs/2504.15630v1
- Date: Tue, 22 Apr 2025 06:42:22 GMT
- Title: Exploiting Contextual Knowledge in LLMs through V-usable Information based Layer Enhancement
- Authors: Xiaowei Yuan, Zhao Yang, Ziyang Huang, Yequan Wang, Siqi Fan, Yiming Ju, Jun Zhao, Kang Liu
- Abstract summary: We propose Context-aware Layer Enhancement (CaLE) to enhance the utilization of contextual knowledge in large language models. CaLE strategically amplifies the growth of contextual information at an optimal layer, thereby enriching representations in the final layer. Our experiments demonstrate that CaLE effectively improves context-faithful generation in Question-Answering tasks.
- Score: 20.183957585014042
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) have demonstrated remarkable capabilities in various tasks, yet they often struggle to produce context-faithful generations that properly reflect contextual knowledge. While existing approaches focus on enhancing decoding strategies, they ignore the fundamental mechanism of how contextual information is processed within LLMs' internal states. As a result, LLMs remain limited in their ability to fully leverage contextual knowledge. In this paper, we propose Context-aware Layer Enhancement (CaLE), a novel intervention method that enhances the utilization of contextual knowledge within LLMs' internal representations. By employing V-usable information analysis, CaLE strategically amplifies the growth of contextual information at an optimal layer, thereby enriching representations in the final layer. Our experiments demonstrate that CaLE effectively improves context-faithful generation in Question-Answering tasks, particularly in scenarios involving unknown or conflicting contextual knowledge.
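The abstract describes the mechanism only at a high level. For reference, V-usable information (Xu et al., 2020) measures how much predictive information a model family $\mathcal{V}$ can actually extract: $I_\mathcal{V}(X \to Y) = H_\mathcal{V}(Y) - H_\mathcal{V}(Y \mid X)$, so a layer-wise version asks how decodable the answer becomes as context flows upward through the network. Below is a minimal sketch of the kind of intervention the abstract describes, assuming a HuggingFace-style Llama decoder; the layer index, amplification factor, and hook placement are illustrative placeholders, not the authors' procedure (CaLE selects the layer via V-usable information analysis, which is omitted here).

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "meta-llama/Llama-2-7b-hf"  # assumption: any model exposing model.model.layers
tok = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME).eval()

ENHANCE_LAYER = 14  # hypothetical "optimal" layer; CaLE picks it via V-usable information
ALPHA = 1.5         # hypothetical amplification factor (>1 amplifies the layer's update)

def amplify_update(module, inputs, output):
    # Decoder layers return a tuple whose first element is the hidden states.
    hidden_in = inputs[0]            # representations entering the layer
    hidden_out = output[0]           # representations leaving the layer
    update = hidden_out - hidden_in  # this layer's contribution to the residual stream
    return (hidden_in + ALPHA * update,) + output[1:]

handle = model.model.layers[ENHANCE_LAYER].register_forward_hook(amplify_update)

prompt = ("Context: The Eiffel Tower is located in Rome.\n"
          "Question: Where is the Eiffel Tower located?\nAnswer:")
ids = tok(prompt, return_tensors="pt").input_ids
with torch.no_grad():
    out = model.generate(ids, max_new_tokens=10)
print(tok.decode(out[0][ids.shape[1]:], skip_special_tokens=True))

handle.remove()  # detach the hook to restore the unmodified model
```

Scaling the layer's residual update, rather than the full hidden state, keeps the intervention local to that layer's contribution; whether CaLE scales the update, the hidden state, or something else is not specified in the abstract.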
Related papers
- Effective LLM Knowledge Learning via Model Generalization [73.16975077770765]
Large language models (LLMs) are trained on enormous documents that contain extensive world knowledge.
It is still not well-understood how knowledge is acquired via autoregressive pre-training.
In this paper, we focus on understanding and improving LLM knowledge learning.
arXiv Detail & Related papers (2025-03-05T17:56:20Z) - Decoding Knowledge in Large Language Models: A Framework for Categorization and Comprehension [14.039653386385519]
Large language models (LLMs) acquire, retain, and apply knowledge.
This paper introduces a novel framework, K-(CSA)$^2$, which categorizes LLM knowledge along two dimensions: correctness and confidence.
arXiv Detail & Related papers (2025-01-02T16:34:10Z) - KaLM: Knowledge-aligned Autoregressive Language Modeling via Dual-view Knowledge Graph Contrastive Learning [74.21524111840652]
This paper proposes KaLM, a Knowledge-aligned Language Modeling approach.
It fine-tunes autoregressive large language models to align with KG knowledge via the joint objective of explicit knowledge alignment and implicit knowledge alignment.
Notably, our method achieves a significant performance boost in evaluations of knowledge-driven tasks (a generic contrastive-alignment sketch appears after this list).
arXiv Detail & Related papers (2024-12-06T11:08:24Z) - Retrieval Meets Reasoning: Dynamic In-Context Editing for Long-Text Understanding [11.5386284281652]
We introduce a novel approach that re-imagines information retrieval through dynamic in-context editing.
By treating lengthy contexts as malleable external knowledge, our method interactively gathers and integrates relevant information.
Experimental results demonstrate that our method effectively empowers context-limited LLMs to engage in multi-hop reasoning with improved performance.
arXiv Detail & Related papers (2024-06-18T06:54:28Z) - Untangle the KNOT: Interweaving Conflicting Knowledge and Reasoning Skills in Large Language Models [51.72963030032491]
Knowledge documents for large language models (LLMs) may conflict with the memory of LLMs due to outdated or incorrect knowledge.
We construct a new dataset, dubbed KNOT, for knowledge conflict resolution examination in the form of question answering.
arXiv Detail & Related papers (2024-04-04T16:40:11Z) - FAC$^2$E: Better Understanding Large Language Model Capabilities by Dissociating Language and Cognition [56.76951887823882]
Large language models (LLMs) are primarily evaluated by overall performance on various text understanding and generation tasks.
We present FAC$2$E, a framework for Fine-grAined and Cognition-grounded LLMs' Capability Evaluation.
arXiv Detail & Related papers (2024-02-29T21:05:37Z) - LLM Inference Unveiled: Survey and Roofline Model Insights [62.92811060490876]
Large Language Model (LLM) inference is rapidly evolving, presenting a unique blend of opportunities and challenges.
Our survey stands out from traditional literature reviews by not only summarizing the current state of research but also by introducing a framework based on the roofline model.
This framework identifies the bottlenecks when deploying LLMs on hardware devices and provides a clear understanding of practical problems (a toy roofline calculation appears after this list).
arXiv Detail & Related papers (2024-02-26T07:33:05Z) - How Large Language Models Encode Context Knowledge? A Layer-Wise Probing Study [27.23388511249688]
This paper investigates the layer-wise capability of large language models to encode knowledge.
We leverage the powerful generative capability of ChatGPT to construct probing datasets.
Experiments on conflicting and newly acquired knowledge show that LLMs prefer to encode more context knowledge in the upper layers (a minimal layer-wise probing sketch appears after this list).
arXiv Detail & Related papers (2024-02-25T11:15:42Z) - A Comprehensive Study of Knowledge Editing for Large Language Models [82.65729336401027]
Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication.
This paper defines the knowledge editing problem and provides a comprehensive review of cutting-edge approaches.
We introduce a new benchmark, KnowEdit, for a comprehensive empirical evaluation of representative knowledge editing approaches.
arXiv Detail & Related papers (2024-01-02T16:54:58Z) - Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation [109.8527403904657]
We show that large language models (LLMs) possess unwavering confidence in their knowledge and cannot handle the conflict between internal and external knowledge well.
Retrieval augmentation proves to be an effective approach in enhancing LLMs' awareness of knowledge boundaries.
We propose a simple method to dynamically utilize supporting documents with our judgement strategy.
arXiv Detail & Related papers (2023-07-20T16:46:10Z) - KALM: Knowledge-Aware Integration of Local, Document, and Global Contexts for Long Document Understanding [27.4842322089676]
KALM is a Knowledge-Aware Language Model that jointly leverages knowledge in local, document-level, and global contexts.
It achieves state-of-the-art performance on six long document understanding tasks and datasets.
arXiv Detail & Related papers (2022-10-08T20:51:02Z)
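Regarding the KaLM entry above: its summary names a dual-view contrastive objective without giving the formula here, so the following is a generic symmetric InfoNCE loss over paired text-view and KG-view embeddings, a standard instantiation of explicit knowledge alignment rather than the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def dual_view_contrastive(text_emb, kg_emb, temperature=0.05):
    """Symmetric InfoNCE over two views of the same knowledge triples.

    text_emb: (batch, dim) LLM embeddings of verbalized triples.
    kg_emb:   (batch, dim) graph-side embeddings of the same triples.
    Row i of each view forms the positive pair; other rows are in-batch negatives.
    """
    text_emb = F.normalize(text_emb, dim=-1)
    kg_emb = F.normalize(kg_emb, dim=-1)
    logits = text_emb @ kg_emb.T / temperature  # cosine similarity matrix
    targets = torch.arange(logits.size(0))
    # Align text -> KG and KG -> text.
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.T, targets)) / 2

# Usage with random stand-in embeddings:
loss = dual_view_contrastive(torch.randn(32, 256), torch.randn(32, 256))
```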
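Regarding the roofline-based survey above: the roofline model itself is standard and bounds attainable throughput by min(peak FLOP/s, memory bandwidth x arithmetic intensity). The toy calculation below illustrates that bound for a decode-phase matrix-vector product; the hardware numbers are illustrative, not taken from the survey.

```python
def roofline(flops, bytes_moved, peak_flops, peak_bw):
    """Classify an operation as compute- or memory-bound under the roofline model."""
    intensity = flops / bytes_moved                    # FLOP per byte
    attainable = min(peak_flops, peak_bw * intensity)  # the "roofline"
    bound = "compute" if peak_bw * intensity >= peak_flops else "memory"
    return attainable, bound

# Illustrative numbers: a 4096x4096 fp16 matrix-vector product (one decode step)
# on hypothetical hardware with 300 TFLOP/s peak compute and 1 TB/s bandwidth.
flops = 2 * 4096 * 4096          # one multiply-add per weight
bytes_moved = 2 * 4096 * 4096    # the fp16 weight matrix dominates memory traffic
att, bound = roofline(flops, bytes_moved, peak_flops=300e12, peak_bw=1e12)
print(f"intensity = {flops / bytes_moved:.1f} FLOP/B -> {bound}-bound, "
      f"attainable = {att / 1e12:.3f} TFLOP/s")  # memory-bound at ~2 TFLOP/s
```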
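Regarding the layer-wise probing study above: a common recipe is to fit one linear probe per layer on frozen hidden states and read off where knowledge becomes linearly decodable. A minimal sketch follows; gpt2 and the two-sentence toy set stand in for the paper's evaluated models and its ChatGPT-constructed probing datasets.

```python
import torch
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_hidden_states=True).eval()

# Toy probing set: can a layer's representation separate context-consistent
# from context-conflicting statements? Real probing datasets are far larger.
texts = ["The capital of France is Paris.",
         "The capital of France is Berlin."] * 8
labels = [1, 0] * 8

feats = []
with torch.no_grad():
    for t in texts:
        hs = model(**tok(t, return_tensors="pt")).hidden_states  # embeddings + 12 layers
        feats.append(torch.stack([h[0, -1] for h in hs]))        # last-token vector per layer
feats = torch.stack(feats)                                       # (examples, layers, hidden)

# One linear probe per layer; higher accuracy = more linearly decodable knowledge.
for layer in range(feats.shape[1]):
    X = feats[:, layer].numpy()
    acc = cross_val_score(LogisticRegression(max_iter=1000), X, labels, cv=4).mean()
    print(f"layer {layer:2d}: probe accuracy {acc:.2f}")
```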