Epistemic Context Learning: Building Trust the Right Way in LLM-Based Multi-Agent Systems
- URL: http://arxiv.org/abs/2601.21742v1
- Date: Thu, 29 Jan 2026 13:59:32 GMT
- Title: Epistemic Context Learning: Building Trust the Right Way in LLM-Based Multi-Agent Systems
- Authors: Ruiwen Zhou, Maojia Song, Xiaobao Wu, Sitao Cheng, Xunjian Yin, Yuxi Xie, Zhuoqun Hao, Wenyue Hua, Liangming Pan, Soujanya Poria, Min-Yen Kan
- Abstract summary: Individual agents in multi-agent systems often lack robustness, tending to blindly conform to misleading peers. We show this weakness stems from both sycophancy and inadequate ability to evaluate peer reliability. We first formalize the learning problem of history-aware reference, introducing the historical interactions of peers as additional input. We then develop Epistemic Context Learning (ECL), a reasoning framework that conditions predictions on explicitly built peer profiles from history.
- Score: 94.9141394384021
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Individual agents in multi-agent (MA) systems often lack robustness, tending to blindly conform to misleading peers. We show this weakness stems from both sycophancy and an inadequate ability to evaluate peer reliability. To address this, we first formalize the learning problem of history-aware reference, introducing the historical interactions of peers as additional input so that agents can estimate peer reliability and learn from trustworthy peers when uncertain. This shifts the task from evaluating peer reasoning quality to estimating peer reliability based on interaction history. We then develop Epistemic Context Learning (ECL): a reasoning framework that conditions predictions on explicitly built peer profiles from history. We further optimize ECL by reinforcement learning using auxiliary rewards. Our experiments reveal that ECL enables small models like Qwen3-4B to outperform a history-agnostic baseline 8x its size (Qwen3-30B) by accurately identifying reliable peers. ECL also boosts frontier models to near-perfect (100%) performance. We show that ECL generalizes well to various MA configurations, and we find that LLMs model trust well, revealing a strong correlation between trust-modeling accuracy and final answer quality.
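The abstract outlines a concrete loop: maintain a profile for each peer from its interaction history, estimate reliability from that profile, and condition the agent's answer on those estimates. Below is a minimal Python sketch of that history-aware-reference setup. The class names, the accuracy-based reliability estimate, and the profile serialization format are illustrative assumptions; the paper's actual ECL prompting and RL optimization are not reproduced here.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class PeerProfile:
    """Running record of one peer's past interactions (illustrative format)."""
    correct: int = 0
    total: int = 0

    @property
    def reliability(self) -> float:
        # Laplace-smoothed accuracy over the peer's interaction history.
        return (self.correct + 1) / (self.total + 2)

class HistoryAwareAgent:
    """Sketch of history-aware reference: condition on peer profiles,
    not only on peers' current messages."""

    def __init__(self) -> None:
        self.profiles: dict[str, PeerProfile] = defaultdict(PeerProfile)

    def observe(self, peer: str, was_correct: bool) -> None:
        # Update a peer's profile once the ground truth of a past round is known.
        p = self.profiles[peer]
        p.correct += int(was_correct)
        p.total += 1

    def build_epistemic_context(self, peer_answers: dict[str, str]) -> str:
        # Serialize explicit peer profiles into the prompt, so the model
        # reasons over reliability estimates rather than raw messages alone.
        lines = [
            f"- {peer} (estimated reliability {self.profiles[peer].reliability:.2f}): {answer}"
            for peer, answer in peer_answers.items()
        ]
        return "Peer answers with historical reliability:\n" + "\n".join(lines)

    def decide(self, own_answer: str, own_confidence: float,
               peer_answers: dict[str, str], threshold: float = 0.6) -> str:
        # Keep the independent answer when confident; otherwise defer to
        # the most historically reliable peer.
        if own_confidence >= threshold:
            return own_answer
        best_peer = max(peer_answers, key=lambda p: self.profiles[p].reliability)
        return peer_answers[best_peer]
```

Under this sketch, a peer seen answering correctly 9 of 10 times gets reliability ≈ 0.83, and the agent defers to it only when its own confidence falls below the threshold. In the full framework the serialized profiles would sit in the LLM's context so the model itself reasons over reliability; the hand-coded `decide` rule here merely stands in for that step.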
Related papers
- Leveraging LLM Parametric Knowledge for Fact Checking without Retrieval [60.25608870901428]
Trustworthiness is a core research challenge for agentic AI systems built on Large Language Models (LLMs). We propose the task of fact-checking without retrieval, focusing on the verification of arbitrary natural language claims, independent of their source robustness.
arXiv Detail & Related papers (2026-03-05T18:42:51Z) - NAACL: Noise-AwAre Verbal Confidence Calibration for LLMs in RAG Systems [53.52419750390942]
Large language models (LLMs) are used in mission-critical factual domains. LLMs exhibit poor calibration performance due to noisy retrieved contexts. We propose NAACL Rules (Noise-AwAre Confidence CaLibration Rules) to provide a principled foundation for resolving overconfidence under noise.
arXiv Detail & Related papers (2026-01-16T05:38:25Z) - Illusions of Confidence? Diagnosing LLM Truthfulness via Neighborhood Consistency [78.91846841708586]
We show that even facts answered with perfect self-consistency can rapidly collapse under mild contextual interference. We propose Neighbor-Consistency Belief (NCB), a structural measure of belief that evaluates response coherence across a conceptual neighborhood (a minimal sketch appears after this list). We also present Structure-Aware Training (SAT), which optimizes context-invariant belief structure and reduces long-tail knowledge brittleness by approximately 30%.
arXiv Detail & Related papers (2026-01-09T16:23:21Z) - Fact-Checking with Large Language Models via Probabilistic Certainty and Consistency [7.806516365113592]
Large language models (LLMs) are increasingly used in applications requiring factual accuracy. While fact-checking can mitigate these errors, existing methods typically retrieve external evidence indiscriminately. We introduce Probabilistic Certainty and Consistency (PCC), a framework that estimates factual confidence.
arXiv Detail & Related papers (2026-01-05T21:57:41Z) - Revisiting the Reliability of Language Models in Instruction-Following [15.281163913211818]
LLMs have achieved near-ceiling instruction-following accuracy on benchmarks such as IFEval. We study nuance-oriented reliability: whether models exhibit consistent competence across cousin prompts that convey analogous user intents but with subtle nuances. Our findings highlight nuance-oriented reliability as a crucial yet underexplored next step toward more dependable and trustworthy LLM behavior.
arXiv Detail & Related papers (2025-12-15T02:57:55Z) - Rewarding the Journey, Not Just the Destination: A Composite Path and Answer Self-Scoring Reward Mechanism for Test-Time Reinforcement Learning [29.778703252962092]
Reinforcement Learning (RL) has emerged as a powerful paradigm for advancing Large Language Models (LLMs). We develop a novel test-time reward mechanism that operates without external supervision.
arXiv Detail & Related papers (2025-10-20T07:53:51Z) - Confidence as a Reward: Transforming LLMs into Reward Models [54.98336080630691]
Confidence-as-a-Reward (CRew) is a training-free method that utilizes token-level confidence in the model's final answers as a proxy for reward (also sketched after this list). We show that CRew outperforms existing training-free reward approaches on the MATH500 and RewardMATH benchmarks. We propose CRew-DPO, a training strategy that constructs preference data from confidence scores combined with correctness signals.
arXiv Detail & Related papers (2025-10-15T12:51:47Z) - ReFIne: A Framework for Trustworthy Large Reasoning Models with Reliability, Faithfulness, and Interpretability [23.70973331911138]
We argue that usable reasoning systems must be trustworthy, characterized by three properties: interpretability, faithfulness, and reliability. We propose ReFIne, a new training framework that integrates supervised fine-tuning with GRPO to encourage models to improve interpretability. Our experimental results show that ReFIne models generate clearer and better-structured reasoning traces.
arXiv Detail & Related papers (2025-10-10T07:08:44Z) - Mind the Generation Process: Fine-Grained Confidence Estimation During LLM Generation [63.49409574310576]
Large language models (LLMs) exhibit overconfidence, assigning high confidence scores to incorrect predictions. We introduce FineCE, a novel confidence estimation method that delivers accurate, fine-grained confidence scores during text generation. Our code and all baselines used in the paper are available on GitHub.
arXiv Detail & Related papers (2025-08-16T13:29:35Z) - Aligning Large Language Models for Faithful Integrity Against Opposing Argument [71.33552795870544]
Large Language Models (LLMs) have demonstrated impressive capabilities in complex reasoning tasks. They can be easily misled by unfaithful arguments during conversations, even when their original statements are correct. We propose a novel framework, named Alignment for Faithful Integrity with Confidence Estimation.
arXiv Detail & Related papers (2025-01-02T16:38:21Z) - More RLHF, More Trust? On The Impact of Preference Alignment On Trustworthiness [24.843692458375436]
This study investigates how models aligned with general-purpose preference data perform across five trustworthiness verticals. Our results demonstrate that RLHF on human preferences does not automatically guarantee trustworthiness, and reverse effects are often observed. We propose to adapt efficient influence function-based data attribution methods to the RLHF setting to better understand the influence of fine-tuning data on individual trustworthiness benchmarks.
arXiv Detail & Related papers (2024-04-29T17:00:53Z)
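Two of the related methods are concrete enough to sketch under stated assumptions. First, the Neighbor-Consistency Belief (NCB) measure above evaluates response coherence across a conceptual neighborhood; a minimal reading is to query the model on a fact and on related prompts, then score agreement with the majority answer. The `query_model` callable and the neighborhood construction below are assumptions, not the paper's implementation.

```python
from collections import Counter
from typing import Callable

def neighbor_consistency_belief(query_model: Callable[[str], str],
                                question: str,
                                neighbors: list[str]) -> float:
    """Sketch of an NCB-style score: how coherent are answers across a
    conceptual neighborhood of related prompts? query_model is assumed
    to map a prompt string to a normalized answer string."""
    answers = [query_model(q) for q in [question] + neighbors]
    # Fraction of neighborhood answers agreeing with the majority answer.
    _, count = Counter(answers).most_common(1)[0]
    return count / len(answers)
```

For instance, with neighbors built as paraphrases of one question, a model that gives the same answer on 4 of 5 prompts receives an NCB-style score of 0.8, while a self-consistent answer that flips under mild rephrasing scores much lower.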
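Second, the Confidence-as-a-Reward (CRew) entry describes using token-level confidence in the final answer as a training-free reward. A minimal sketch, assuming the log-probabilities of the final-answer tokens have already been extracted from the model's output:

```python
import math

def confidence_reward(answer_token_logprobs: list[float]) -> float:
    """Sketch of a CRew-style reward: mean token probability of the final
    answer as a training-free proxy for answer quality. The extraction of
    the final-answer span and its log-probs is assumed to happen upstream."""
    if not answer_token_logprobs:
        return 0.0
    mean_logprob = sum(answer_token_logprobs) / len(answer_token_logprobs)
    # Geometric-mean token probability, in (0, 1].
    return math.exp(mean_logprob)
```

Because this reward is the geometric mean of token probabilities, it is length-normalized, so answers of different lengths remain comparable when ranking candidates.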