A Compositional Atlas of Tractable Circuit Operations: From Simple
Transformations to Complex Information-Theoretic Queries
- URL: http://arxiv.org/abs/2102.06137v1
- Date: Thu, 11 Feb 2021 17:26:32 GMT
- Title: A Compositional Atlas of Tractable Circuit Operations: From Simple
Transformations to Complex Information-Theoretic Queries
- Authors: Antonio Vergari, YooJung Choi, Anji Liu, Stefano Teso, Guy Van den
Broeck
- Abstract summary: We show how complex inference scenarios for machine learning can be represented in terms of tractable modular operations over circuits.
We derive a unified framework for reasoning about tractable models that generalizes several results in the literature and opens up novel tractable inference scenarios.
- Score: 44.36335714431731
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Circuit representations are becoming the lingua franca to express and reason
about tractable generative and discriminative models. In this paper, we show
how complex inference scenarios for these models that commonly arise in machine
learning -- from computing the expectations of decision tree ensembles to
information-theoretic divergences of deep mixture models -- can be represented
in terms of tractable modular operations over circuits. Specifically, we
characterize the tractability of a vocabulary of simple transformations --
sums, products, quotients, powers, logarithms, and exponentials -- in terms of
sufficient structural constraints of the circuits they operate on, and present
novel hardness results for the cases in which these properties are not
satisfied. Building on these operations, we derive a unified framework for
reasoning about tractable models that generalizes several results in the
literature and opens up novel tractable inference scenarios.
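To make the flavor of these modular operations concrete, the sketch below shows a minimal smooth and decomposable circuit over two binary variables and evaluates its marginals in a single bottom-up pass, the basic tractable primitive that the paper's transformations compose. This is an illustrative Python sketch under assumed, hypothetical node classes (Leaf, Sum, Product) and a toy distribution, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class Leaf:
    var: int    # variable index
    value: int  # the assignment (0 or 1) this indicator leaf tracks


@dataclass
class Sum:
    weights: List[float]
    children: List["Node"]


@dataclass
class Product:
    children: List["Node"]


Node = Union[Leaf, Sum, Product]


def evaluate(node: "Node", evidence: Dict[int, int]) -> float:
    """Bottom-up evaluation under partial evidence; unobserved variables are
    marginalized out, which is tractable when sum nodes are smooth and
    product nodes are decomposable (assumed for this toy circuit)."""
    if isinstance(node, Leaf):
        if node.var not in evidence:  # marginalized indicator sums to 1
            return 1.0
        return 1.0 if evidence[node.var] == node.value else 0.0
    if isinstance(node, Sum):
        return sum(w * evaluate(c, evidence)
                   for w, c in zip(node.weights, node.children))
    if isinstance(node, Product):
        out = 1.0
        for c in node.children:
            out *= evaluate(c, evidence)
        return out
    raise TypeError(f"unexpected node type: {type(node).__name__}")


# Toy distribution: p(X0, X1) = 0.3 * [X0=1][X1=1] + 0.7 * [X0=0][X1=0]
circuit = Sum(
    weights=[0.3, 0.7],
    children=[
        Product([Leaf(0, 1), Leaf(1, 1)]),
        Product([Leaf(0, 0), Leaf(1, 0)]),
    ],
)

print(evaluate(circuit, {}))      # normalization constant -> 1.0
print(evaluate(circuit, {0: 1}))  # marginal p(X0 = 1)     -> 0.3
```

The sums, products, quotients, powers, logarithms, and exponentials studied in the paper then operate on circuits of this kind, with tractability hinging on structural properties such as the smoothness and decomposability assumed here.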
Related papers
- Toward Understanding In-context vs. In-weight Learning [50.24035812301655]
We identify simplified distributional properties that give rise to the emergence and disappearance of in-context learning.
We then extend the study to a full large language model, showing how fine-tuning on various collections of natural language prompts can elicit similar in-context and in-weight learning behaviour.
arXiv Detail & Related papers (2024-10-30T14:09:00Z)
- Interpreting token compositionality in LLMs: A robustness analysis [10.777646083061395]
Constituent-Aware Pooling (CAP) is a methodology designed to analyse how large language models process linguistic structures.
CAP intervenes in model activations through constituent-based pooling at various model levels.
arXiv Detail & Related papers (2024-10-16T18:10:50Z)
- Circuit Compositions: Exploring Modular Structures in Transformer-Based Language Models [22.89563355840371]
We identify and compare circuits responsible for ten modular string-edit operations within a language model.
Our results indicate that functionally similar circuits exhibit both notable node overlap and cross-task faithfulness.
arXiv Detail & Related papers (2024-10-02T11:36:45Z)
- What is the Relationship between Tensor Factorizations and Circuits (and How Can We Exploit it)? [12.53042167016897]
We introduce a modular "Lego block" approach to build tensorized circuit architectures.
This connection not only clarifies similarities and differences in existing models, but also enables the development of a comprehensive pipeline.
arXiv Detail & Related papers (2024-09-12T11:32:01Z)
- Explaining Text Similarity in Transformer Models [52.571158418102584]
Recent advances in explainable AI make it possible to mitigate the limited interpretability of Transformer-based similarity models by leveraging improved explanations.
We use BiLRP, an extension developed for computing second-order explanations in bilinear similarity models, to investigate which feature interactions drive similarity in NLP models.
Our findings contribute to a deeper understanding of different semantic similarity tasks and models, highlighting how novel explainable AI methods enable in-depth analyses and corpus-level insights.
arXiv Detail & Related papers (2024-05-10T17:11:31Z)
- Shape Arithmetic Expressions: Advancing Scientific Discovery Beyond Closed-Form Equations [56.78271181959529]
Generalized Additive Models (GAMs) can capture non-linear relationships between variables and targets, but they cannot capture intricate feature interactions.
We propose Shape Arithmetic Expressions (SHAREs), which fuse GAMs' flexible shape functions with the complex feature interactions found in mathematical expressions.
We also design a set of rules for constructing SHAREs that guarantee transparency of the found expressions beyond the standard constraints.
arXiv Detail & Related papers (2024-04-15T13:44:01Z)
- Towards Empirical Interpretation of Internal Circuits and Properties in Grokked Transformers on Modular Polynomials [29.09237503747052]
Grokking on modular addition is known to produce Fourier representations and calculation circuits based on trigonometric identities in Transformers.
We show that transferability among models grokked on different operations is limited to specific combinations.
Some multi-task mixtures lead to co-grokking, where grokking happens simultaneously for all tasks.
arXiv Detail & Related papers (2024-02-26T16:48:12Z)
- Structured World Representations in Maze-Solving Transformers [3.75591091941815]
This work focuses on the abstractions formed by small transformer models.
We find evidence for the consistent emergence of structured internal representations of maze topology and valid paths.
We also take steps towards deciphering the circuitry of path-following by identifying attention heads.
arXiv Detail & Related papers (2023-12-05T08:24:26Z)
- Query Structure Modeling for Inductive Logical Reasoning Over Knowledge Graphs [67.043747188954]
We propose a structure-modeled textual encoding framework for inductive logical reasoning over KGs.
It encodes linearized query structures and entities using pre-trained language models to find answers.
We conduct experiments on two inductive logical reasoning datasets and three transductive datasets.
arXiv Detail & Related papers (2023-05-23T01:25:29Z)
- On Neural Architecture Inductive Biases for Relational Tasks [76.18938462270503]
We introduce a simple architecture based on similarity-distribution scores, which we name the Compositional Relational Network (CoRelNet).
We find that simple architectural choices can outperform existing models in out-of-distribution generalization.
arXiv Detail & Related papers (2022-06-09T16:24:01Z)