IBCircuit: Towards Holistic Circuit Discovery with Information Bottleneck
- URL: http://arxiv.org/abs/2602.22581v1
- Date: Thu, 26 Feb 2026 03:33:35 GMT
- Title: IBCircuit: Towards Holistic Circuit Discovery with Information Bottleneck
- Authors: Tian Bian, Yifan Niu, Chaohao Yuan, Chengzhi Piao, Bingzhe Wu, Long-Kai Huang, Yu Rong, Tingyang Xu, Hong Cheng, Jia Li
- Abstract summary: We propose an end-to-end approach based on the principle of Information Bottleneck, called IBCircuit, to identify informative circuits holistically. IBCircuit is an optimization framework for holistic circuit discovery and can be applied to any given task without the tedious design of corrupted activations.
- Score: 39.572087058128645
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Circuit discovery has recently attracted attention as a potential research direction to explain the non-trivial behaviors of language models. It aims to find the computational subgraphs, also known as circuits, within the model that are responsible for solving specific tasks. However, most existing studies overlook the holistic nature of these circuits and require designing specific corrupted activations for different tasks, which is inaccurate and inefficient. In this work, we propose an end-to-end approach based on the principle of Information Bottleneck, called IBCircuit, to identify informative circuits holistically. IBCircuit is an optimization framework for holistic circuit discovery and can be applied to any given task without the tedious design of corrupted activations. In both the Indirect Object Identification (IOI) and Greater-Than tasks, IBCircuit identifies more faithful and minimal circuits in terms of critical node components and edge components compared to recent related work.
Related papers
- Certified Circuits: Stability Guarantees for Mechanistic Circuits [80.30622018787835]
Certified Circuits provides provable stability guarantees for circuit discovery. On ImageNet and OOD datasets, certified circuits achieve up to 91% higher accuracy.
arXiv Detail & Related papers (2026-02-26T13:07:31Z)
- Adapting, Fast and Slow: Transportable Circuits for Few-Shot Learning [54.930879235929204]
Generalization across the domains is not possible without asserting a structure that constrains the unseen target domain w.r.t. We design an algorithm for zero-shot compositional generalization which relies on access to qualitative domain knowledge. Our theoretical results characterize classes of few-shot learnable tasks in terms of graphical circuit transportability criteria.
arXiv Detail & Related papers (2025-12-28T04:38:43Z)
- Rethinking Circuit Completeness in Language Models: AND, OR, and ADDER Gates [35.90665719234101]
We introduce three types of logic gates: AND, OR, and ADDER gates, and decompose the circuit into combinations of these logical gates. We propose a framework that combines noising-based and denoising-based interventions, which can be easily integrated into existing circuit discovery methods.
arXiv Detail & Related papers (2025-05-15T07:35:14Z)
- Position-aware Automatic Circuit Discovery [59.64762573617173]
We identify a gap in existing circuit discovery methods, treating model components as equally relevant across input positions. We propose two improvements to incorporate positionality into circuits, even on tasks containing variable-length examples. Our approach enables fully automated discovery of position-sensitive circuits, yielding better trade-offs between circuit size and faithfulness compared to prior work.
arXiv Detail & Related papers (2025-02-07T00:18:20Z)
- Circuit Compositions: Exploring Modular Structures in Transformer-Based Language Models [22.89563355840371]
We study the modularity of neural networks by analyzing circuits for highly compositional subtasks within a language model. Our results indicate that functionally similar circuits exhibit both notable node overlap and cross-task faithfulness.
arXiv Detail & Related papers (2024-10-02T11:36:45Z)
- Transformer Circuit Faithfulness Metrics are not Robust [0.04260910081285213]
We measure circuit 'faithfulness' by ablating portions of the model's computation.
We conclude that existing circuit faithfulness scores reflect both the methodological choices of researchers as well as the actual components of the circuit.
The ultimate goal of mechanistic interpretability work is to understand neural networks, so we emphasize the need for more clarity in the precise claims being made about circuits.
arXiv Detail & Related papers (2024-07-11T17:59:00Z)
- Sheaf Discovery with Joint Computation Graph Pruning and Flexible Granularity [18.71252449465396]
We introduce DiscoGP, a framework for extracting self-contained modular units from neural language models (LMs). Our framework identifies sheaves through a gradient-based pruning algorithm that operates on both of these in such a way that reduces the original LM to a sparse skeleton that preserves certain core capabilities.
arXiv Detail & Related papers (2024-07-04T09:42:25Z)
- CktGNN: Circuit Graph Neural Network for Electronic Design Automation [67.29634073660239]
This paper presents a Circuit Graph Neural Network (CktGNN) that simultaneously automates the circuit topology generation and device sizing.
We introduce Open Circuit Benchmark (OCB), an open-sourced dataset that contains 10K distinct operational amplifiers.
Our work paves the way toward a learning-based open-sourced design automation for analog circuits.
arXiv Detail & Related papers (2023-08-31T02:20:25Z)
- Adaptive Planning Search Algorithm for Analog Circuit Verification [53.97809573610992]
We propose a machine learning (ML) approach, which uses fewer simulations.
We show that the proposed approach is able to provide OCCs closer to the specifications for all circuits.
arXiv Detail & Related papers (2023-06-23T12:57:46Z)
- Circuit Routing Using Monte Carlo Tree Search and Deep Neural Networks [1.987599364275123]
We model the circuit routing as a sequential decision-making problem, and solve it by Monte Carlo tree search (MCTS) with deep neural network (DNN) guided rollout.
Experiments on randomly generated single-layer circuits show the potential to route complex circuits.
The proposed approach can solve the problems that benchmark methods such as sequential A* method and Lee's algorithm cannot solve, and can also outperform the vanilla MCTS approach.
arXiv Detail & Related papers (2020-06-24T10:34:57Z)