Measuring Uncertainty in Transformer Circuits with Effective Information Consistency
- URL: http://arxiv.org/abs/2509.07149v1
- Date: Mon, 08 Sep 2025 18:54:56 GMT
- Title: Measuring Uncertainty in Transformer Circuits with Effective Information Consistency
- Authors: Anatoly A. Krasnovsky
- Abstract summary: We develop a sheaf/cohomology and causal emergence perspective on Transformer Circuits. EICS combines (i) a normalized sheaf inconsistency computed from local Jacobians and activations with (ii) a Gaussian EI proxy for circuit-level causal emergence. We provide practical guidance on score interpretation, computational overhead (with fast and exact modes), and a toy sanity-check analysis.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Mechanistic interpretability has identified functional subgraphs within large language models (LLMs), known as Transformer Circuits (TCs), that appear to implement specific algorithms. Yet we lack a formal, single-pass way to quantify when an active circuit is behaving coherently and thus likely trustworthy. Building on prior systems-theoretic proposals, we specialize a sheaf/cohomology and causal emergence perspective to TCs and introduce the Effective-Information Consistency Score (EICS). EICS combines (i) a normalized sheaf inconsistency computed from local Jacobians and activations, with (ii) a Gaussian EI proxy for circuit-level causal emergence derived from the same forward state. The construction is white-box, single-pass, and makes units explicit so that the score is dimensionless. We further provide practical guidance on score interpretation, computational overhead (with fast and exact modes), and a toy sanity-check analysis. Empirical validation on LLM tasks is deferred.
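The abstract describes EICS as combining a normalized sheaf inconsistency (from local Jacobians and activations) with a Gaussian EI proxy computed from the same forward state. The paper's exact formulas are not given here, so the following is only a minimal toy sketch of that style of score: two circuit nodes carry local linear maps on a shared activation, their normalized disagreement stands in for sheaf inconsistency, and a log-determinant of the Jacobian's effective covariance stands in for the Gaussian EI proxy. All names and the combination rule are illustrative assumptions, not the authors' definitions.

```python
import numpy as np

# Illustrative sketch only: these formulas are assumptions standing in for the
# paper's sheaf inconsistency and Gaussian EI proxy, not its actual definitions.
rng = np.random.default_rng(0)
h = rng.normal(size=3)                   # shared activation (toy forward state)
J1 = rng.normal(size=(3, 3))             # local Jacobian at circuit node 1
J2 = J1 + 0.1 * rng.normal(size=(3, 3))  # node 2: a nearly consistent copy

# (i) normalized "sheaf inconsistency": relative disagreement of the two
# local sections on the shared activation; dimensionless, in [0, 1]
num = np.linalg.norm(J1 @ h - J2 @ h)
den = np.linalg.norm(J1 @ h) + np.linalg.norm(J2 @ h) + 1e-12
inconsistency = num / den

# (ii) Gaussian EI proxy: for a linear-Gaussian map y = J x + noise,
# use a log-det of the effective output covariance J J^T + sigma^2 I
sigma2 = 1e-3
S = J1 @ J1.T + sigma2 * np.eye(3)
ei_proxy = 0.5 * np.linalg.slogdet(S)[1]

# one possible dimensionless combination: discount emergence by inconsistency
eics = ei_proxy * (1.0 - inconsistency)
print(inconsistency, ei_proxy, eics)
```

Because both Jacobians are evaluated from a single forward state, a score of this shape can be computed white-box in one pass, which is the property the abstract emphasizes.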
Related papers
- TorchLean: Formalizing Neural Networks in Lean [71.68907600404513]
We introduce TorchLean, a framework that treats learned models as first-class mathematical objects with a single, precise semantics shared by execution and verification. We validate TorchLean end-to-end on certified robustness, physics-informed residual bounds for PINNs, and Lyapunov-style neural controller verification.
arXiv Detail & Related papers (2026-02-26T05:11:44Z) - CodeCircuit: Toward Inferring LLM-Generated Code Correctness via Attribution Graphs [13.488544043942495]
We aim to investigate whether the model's neural dynamics encode internally decodable signals that are predictive of logical validity during code generation. By decomposing complex residual flows, we aim to identify the structural signatures that distinguish sound reasoning from logical failure. Analysis across Python, C++, and Java confirms that intrinsic correctness signals are robust across diverse syntaxes.
arXiv Detail & Related papers (2026-02-06T03:49:15Z) - Explaining the Explainer: Understanding the Inner Workings of Transformer-based Symbolic Regression Models [3.7957452405531265]
We introduce PATCHES, an evolutionary circuit discovery algorithm that identifies compact and correct circuits for symbolic regression. Using PATCHES, we isolate 28 circuits, providing the first circuit-level characterisation of an SR transformer.
arXiv Detail & Related papers (2026-02-03T13:27:10Z) - AQER: a scalable and efficient data loader for digital quantum computers [62.40228216126285]
We develop AQER, a scalable AQL method that constructs the loading circuit by systematically reducing entanglement in target states. We conduct systematic experiments to evaluate the effectiveness of AQER, using synthetic datasets, classical image and language datasets, and quantum many-body state datasets with up to 50 qubits.
arXiv Detail & Related papers (2026-02-02T14:39:42Z) - Weights to Code: Extracting Interpretable Algorithms from the Discrete Transformer [65.38883376379812]
We propose the Discrete Transformer, an architecture engineered to bridge the gap between continuous representations and discrete symbolic logic. Empirically, the Discrete Transformer not only achieves performance comparable to RNN-based baselines but crucially extends interpretability to continuous variable domains.
arXiv Detail & Related papers (2026-01-09T12:49:41Z) - Accelerate Speculative Decoding with Sparse Computation in Verification [49.74839681322316]
Speculative decoding accelerates autoregressive language model inference by verifying multiple draft tokens in parallel. Existing sparsification methods are designed primarily for standard token-by-token autoregressive decoding. We propose a sparse verification framework that jointly sparsifies attention, FFN, and MoE components during the verification stage to reduce the dominant computation cost.
arXiv Detail & Related papers (2025-12-26T07:53:41Z) - Logical accreditation: a framework for efficient certification of fault-tolerant computations [1.1068280788997429]
We introduce logical accreditation, a framework for efficiently certifying quantum computations performed on logical qubits. Our protocol is robust against general noise models, far beyond those typically considered in performance analyses of quantum error-correcting codes.
arXiv Detail & Related papers (2025-08-07T15:53:05Z) - Provable In-Context Learning of Nonlinear Regression with Transformers [58.018629320233174]
In-context learning (ICL) is the ability to perform unseen tasks using task-specific prompts without updating parameters. Recent research has actively explored the training dynamics behind ICL. This paper investigates more complex nonlinear regression tasks, aiming to uncover how transformers acquire in-context learning capabilities.
arXiv Detail & Related papers (2025-07-28T00:09:28Z) - A Smooth Transition Between Induction and Deduction: Fast Abductive Learning Based on Probabilistic Symbol Perception [81.30687085692576]
We introduce an optimization algorithm named Probabilistic Symbol Perception (PSP), which makes a smooth transition between induction and deduction. Experiments demonstrate promising results.
arXiv Detail & Related papers (2025-02-18T14:59:54Z) - Stabilizer circuit verification [0.0]
We propose a set of efficient classical algorithms to fully characterize and exhaustively verify stabilizer circuits.
We provide an algorithm for checking the equivalence of stabilizer circuits.
All of our algorithms provide relations of measurement outcomes among corresponding circuit representations.
arXiv Detail & Related papers (2023-09-15T18:06:17Z) - Gaining confidence on the correct realization of arbitrary quantum computations [0.0]
We present verification protocols to gain confidence in the realization of an arbitrary universal quantum computation. The derivation of the protocols is based on the fact that matchgate computations, which are classically efficiently simulable, become universal if supplemented with additional resources.
arXiv Detail & Related papers (2023-08-22T11:47:12Z) - Efficient Computation of Counterfactual Bounds [44.4263314637532]
We compute exact counterfactual bounds via algorithms for credal nets on a subclass of structural causal models.
We evaluate their accuracy by providing credible intervals on the quality of the approximation.
arXiv Detail & Related papers (2023-07-17T07:59:47Z) - Transformers as Statisticians: Provable In-Context Learning with In-Context Algorithm Selection [88.23337313766353]
This work first provides a comprehensive statistical theory for transformers to perform ICL.
We show that transformers can implement a broad class of standard machine learning algorithms in context.
A single transformer can adaptively select different base ICL algorithms.
arXiv Detail & Related papers (2023-06-07T17:59:31Z) - Task-Oriented Sensing, Computation, and Communication Integration for Multi-Device Edge AI [108.08079323459822]
This paper studies a new multi-intelligent-device edge artificial intelligence (AI) system, which jointly exploits AI model split inference and integrated sensing and communication (ISAC).
We measure the inference accuracy by adopting an approximate but tractable metric, namely discriminant gain.
arXiv Detail & Related papers (2022-07-03T06:57:07Z)