The Epistemic Suite: A Post-Foundational Diagnostic Methodology for Assessing AI Knowledge Claims
- URL: http://arxiv.org/abs/2510.24721v1
- Date: Sat, 20 Sep 2025 00:29:38 GMT
- Title: The Epistemic Suite: A Post-Foundational Diagnostic Methodology for Assessing AI Knowledge Claims
- Authors: Matthew Kelly
- Abstract summary: This paper introduces the Epistemic Suite, a diagnostic methodology for surfacing the conditions under which AI outputs are produced and received. Rather than determining truth or falsity, the Suite operates through twenty diagnostic lenses to reveal patterns such as confidence laundering, narrative compression, displaced authority, and temporal drift.
- Score: 0.7233897166339268
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) generate fluent, plausible text that can mislead users into mistaking simulated coherence for genuine understanding. This paper introduces the Epistemic Suite, a post-foundational diagnostic methodology for surfacing the epistemic conditions under which AI outputs are produced and received. Rather than determining truth or falsity, the Suite operates through twenty diagnostic lenses, applied by practitioners as context warrants, to reveal patterns such as confidence laundering, narrative compression, displaced authority, and temporal drift. It is grounded in three design principles: diagnosing production before evaluating claims, preferring diagnostic traction over foundational settlement, and embedding reflexivity as a structural requirement rather than an ethical ornament. When enacted, the Suite shifts language models into a diagnostic stance, producing inspectable artifacts (flags, annotations, contradiction maps, and suspension logs, the FACS bundle) that create an intermediary layer between AI output and human judgment. A key innovation is epistemic suspension, a practitioner-enacted circuit breaker that halts continuation when warrant is exceeded, with resumption based on judgment rather than rule. The methodology also includes an Epistemic Triage Protocol and a Meta-Governance Layer to manage proportionality and link activation to relational accountability, consent, historical context, and pluralism safeguards. Unlike internalist approaches that embed alignment into model architectures (e.g., RLHF or epistemic-integrity proposals), the Suite operates externally as scaffolding, preserving expendability and refusal as safeguards rather than failures. It preserves the distinction between performance and understanding, enabling accountable deliberation while maintaining epistemic modesty.
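To make the FACS artifacts and the suspension mechanism described above more concrete, here is a minimal sketch in Python under stated assumptions: the names (Flag, FACSBundle, suspend_if_warrant_exceeded), the field layout, and the boolean warrant check are illustrative inventions, not the paper's implementation, and the practitioner's judgment is reduced to a single flag for brevity.

```python
# Minimal, hypothetical sketch of the FACS bundle (flags, annotations,
# contradiction maps, suspension logs) and a practitioner-enacted
# "epistemic suspension" circuit breaker. Names, fields, and checks are
# assumptions for illustration, not the authors' implementation.
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Flag:
    lens: str      # e.g. "confidence laundering", "temporal drift"
    excerpt: str   # the span of model output the flag points at
    note: str      # practitioner annotation explaining the concern


@dataclass
class FACSBundle:
    flags: list = field(default_factory=list)
    annotations: list = field(default_factory=list)
    contradiction_map: dict = field(default_factory=dict)   # claim -> conflicting claims
    suspension_log: list = field(default_factory=list)


def suspend_if_warrant_exceeded(bundle: FACSBundle, claim: str,
                                warrant_ok: bool,
                                reason: Optional[str] = None) -> bool:
    """Hypothetical circuit breaker: when the practitioner judges that a claim
    exceeds its warrant, record a suspension and signal that continuation halts.
    Resumption is a judgment made outside this function, not a rule."""
    if warrant_ok:
        return False
    bundle.suspension_log.append(
        f"SUSPENDED: {claim!r} ({reason or 'warrant exceeded'})")
    return True


if __name__ == "__main__":
    bundle = FACSBundle()
    claim = "Studies universally agree that ..."
    bundle.flags.append(Flag(lens="displaced authority", excerpt=claim,
                             note="No source given; authority attributed to an unnamed consensus."))
    halted = suspend_if_warrant_exceeded(bundle, claim, warrant_ok=False,
                                         reason="unsupported universal claim")
    print(halted)                  # True -> continuation halts pending review
    print(bundle.suspension_log)   # inspectable record of the suspension
```

In this reading, each of the twenty lenses would contribute its own flags, and the bundle, rather than the raw model output, is what the human reviewer deliberates over, matching the abstract's description of an intermediary layer between AI output and human judgment.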
Related papers
- Causality is Key for Interpretability Claims to Generalise [35.833847356014154]
Interpretability research on large language models (LLMs) has yielded important insights into model behaviour. However, recurring pitfalls persist: findings that do not generalise, and causal interpretations that outrun the evidence. Pearl's causal hierarchy clarifies what an interpretability study can justify.
arXiv Detail & Related papers (2026-02-18T18:45:04Z)
- Benchmarking Egocentric Clinical Intent Understanding Capability for Medical Multimodal Large Language Models [48.95516224614331]
We introduce MedGaze-Bench, the first benchmark leveraging clinician gaze as a Cognitive Cursor to assess intent understanding across surgery, emergency simulation, and diagnostic interpretation. Our benchmark addresses three fundamental challenges: visual homogeneity of anatomical structures, strict temporal-causal dependencies in clinical workflows, and implicit adherence to safety protocols.
arXiv Detail & Related papers (2026-01-11T02:20:40Z)
- EHRSummarizer: A Privacy-Aware, FHIR-Native Architecture for Structured Clinical Summarization of Electronic Health Records [0.0]
EHRSummarizer produces structured summaries to support structured chart review. The system can be configured for data minimization, stateless processing, and flexible deployment.
arXiv Detail & Related papers (2026-01-04T21:10:42Z)
- Generative Human-Object Interaction Detection via Differentiable Cognitive Steering of Multi-modal LLMs [85.69785384599827]
Human-object interaction (HOI) detection aims to localize human-object pairs and the interactions between them. Existing methods operate under a closed-world assumption, treating the task as a classification problem over a small, predefined verb set. We propose GRASP-HO, a novel Generative Reasoning And Steerable Perception framework that reformulates HOI detection from a closed-set classification task into an open-vocabulary generation problem.
arXiv Detail & Related papers (2025-12-19T14:41:50Z)
- CARE-RAG - Clinical Assessment and Reasoning in RAG [43.1450755645803]
We study the gap between retrieval and reasoning in large language models (LLMs). We propose an evaluation framework that measures accuracy, consistency, and fidelity of reasoning.
arXiv Detail & Related papers (2025-11-20T02:44:55Z)
- Ensemble Deep Learning and LLM-Assisted Reporting for Automated Skin Lesion Diagnosis [2.9307254086347427]
We introduce a unified framework that reimagines AI integration for dermatological diagnostics. First, a purposefully heterogeneous ensemble of architecturally diverse convolutional neural networks provides complementary diagnostic perspectives. Second, we embed large language model capabilities directly into the diagnostic workflow, transforming classification outputs into clinically meaningful assessments.
arXiv Detail & Related papers (2025-10-05T08:07:33Z)
- RAD: Towards Trustworthy Retrieval-Augmented Multi-modal Clinical Diagnosis [56.373297358647655]
Retrieval-Augmented Diagnosis (RAD) is a novel framework that injects external knowledge into multimodal models directly on downstream tasks. RAD operates through three key mechanisms: retrieval and refinement of disease-centered knowledge from multiple medical sources, a guideline-enhanced contrastive loss transformer, and a dual decoder.
arXiv Detail & Related papers (2025-09-24T10:36:14Z)
- Are All Prompt Components Value-Neutral? Understanding the Heterogeneous Adversarial Robustness of Dissected Prompt in Large Language Models [11.625319498017733]
We introduce PromptAnatomy, an automated framework that dissects prompts into functional components. We generate adversarial examples by selectively perturbing each component using our proposed method, ComPerturb. As a complementary resource, we annotate four public instruction-tuning datasets using the PromptAnatomy framework.
arXiv Detail & Related papers (2025-08-03T02:46:30Z)
- Retrieval is Not Enough: Enhancing RAG Reasoning through Test-Time Critique and Optimization [58.390885294401066]
Retrieval-augmented generation (RAG) has become a widely adopted paradigm for enabling knowledge-grounded large language models (LLMs). However, RAG pipelines often fail to ensure that model reasoning remains consistent with the evidence retrieved, leading to factual inconsistencies or unsupported conclusions. We propose AlignRAG, a novel iterative framework grounded in Critique-Driven Alignment (CDA). We also introduce AlignRAG-auto, an autonomous variant that dynamically terminates refinement, removing the need to pre-specify the number of critique iterations.
arXiv Detail & Related papers (2025-04-21T04:56:47Z)
- TrustLoRA: Low-Rank Adaptation for Failure Detection under Out-of-distribution Data [62.22804234013273]
We propose a simple failure detection framework to unify and facilitate classification with rejection under both covariate and semantic shifts. Our key insight is that by separating and consolidating failure-specific reliability knowledge with low-rank adapters, we can enhance the failure detection ability effectively and flexibly.
arXiv Detail & Related papers (2025-04-20T09:20:55Z)
- Secure Diagnostics: Adversarial Robustness Meets Clinical Interpretability [9.522045116604358]
Deep neural networks for medical image classification often fail to generalize consistently in clinical practice. This paper examines interpretability in deep neural networks fine-tuned for fracture detection by evaluating model performance against adversarial attack.
arXiv Detail & Related papers (2025-04-07T20:26:02Z)
- Epistemic Closure and the Irreversibility of Misalignment: Modeling Systemic Barriers to Alignment Innovation [0.0]
Efforts to ensure the safe development of artificial general intelligence often rely on consensus-based alignment approaches. This paper introduces a functional model of epistemic closure, in which cognitive, institutional, social, and infrastructural filters combine to make alignment proposals illegible. We present a weighted closure model supported by both theoretical and empirical sources, including a meta-analysis performed by an AI system on patterns of rejection and non-engagement.
arXiv Detail & Related papers (2025-04-02T18:35:15Z)
- Uncertainty-aware Medical Diagnostic Phrase Identification and Grounding [72.18719355481052]
We introduce a novel task called Medical Report Grounding (MRG). MRG aims to directly identify diagnostic phrases and their corresponding grounding boxes from medical reports in an end-to-end manner. We propose uMedGround, a robust and reliable framework that leverages a multimodal large language model to predict diagnostic phrases.
arXiv Detail & Related papers (2024-04-10T07:41:35Z)
- A Semantic Approach to Decidability in Epistemic Planning (Extended Version) [72.77805489645604]
We use a novel semantic approach to achieve decidability.
Specifically, we augment the logic of knowledge S5$_n$ with an interaction axiom called (knowledge) commutativity (see the illustrative sketch after this list).
We prove that our framework admits a finitary non-fixpoint characterization of common knowledge, which is of independent interest.
arXiv Detail & Related papers (2023-07-28T11:26:26Z)
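For the last entry above, the summary mentions augmenting S5$_n$ with an interaction axiom called (knowledge) commutativity. As a rough, non-authoritative illustration, a commutativity-style interaction axiom for multi-agent epistemic logic is commonly written as follows; the exact formulation used in that paper may differ.

```latex
% Illustrative only: a typical "knowledge commutativity" interaction axiom
% over the multi-agent epistemic logic S5_n; not the paper's exact axiom.
\[
  K_i K_j \varphi \;\rightarrow\; K_j K_i \varphi \qquad \text{for all agents } i, j
\]
% Read: if agent i knows that agent j knows \varphi, then agent j knows that
% agent i knows \varphi, i.e., nested knowledge operators commute.
```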