A Compositional Atlas of Tractable Circuit Operations: From Simple
Transformations to Complex Information-Theoretic Queries
- URL: http://arxiv.org/abs/2102.06137v1
- Date: Thu, 11 Feb 2021 17:26:32 GMT
- Title: A Compositional Atlas of Tractable Circuit Operations: From Simple
Transformations to Complex Information-Theoretic Queries
- Authors: Antonio Vergari, YooJung Choi, Anji Liu, Stefano Teso, Guy Van den
Broeck
- Abstract summary: We show how complex inference scenarios for machine learning can be represented in terms of tractable modular operations over circuits.
We derive a unified framework for reasoning about tractable models that generalizes several results in the literature and opens up novel tractable inference scenarios.
- Score: 44.36335714431731
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Circuit representations are becoming the lingua franca to express and reason
about tractable generative and discriminative models. In this paper, we show
how complex inference scenarios for these models that commonly arise in machine
learning -- from computing the expectations of decision tree ensembles to
information-theoretic divergences of deep mixture models -- can be represented
in terms of tractable modular operations over circuits. Specifically, we
characterize the tractability of a vocabulary of simple transformations --
sums, products, quotients, powers, logarithms, and exponentials -- in terms of
sufficient structural constraints of the circuits they operate on, and present
novel hardness results for the cases in which these properties are not
satisfied. Building on these operations, we derive a unified framework for
reasoning about tractable models that generalizes several results in the
literature and opens up novel tractable inference scenarios.
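The tractable circuit operations the abstract refers to can be illustrated with a minimal sketch (the class names and example distribution below are my own, not from the paper): a smooth, decomposable probabilistic circuit built from sum and product nodes over input leaves, where a marginal query is answered in a single feed-forward pass by letting marginalized-out leaves evaluate to 1.

```python
# Illustrative sketch, not the paper's implementation: a tiny smooth,
# decomposable probabilistic circuit over two binary variables X1, X2.

class Leaf:
    """Input unit for one binary variable with P(var = True) = p."""
    def __init__(self, var, p):
        self.var, self.p = var, p
    def value(self, assignment):
        # A variable absent from `assignment` is marginalized out:
        # its leaf sums to 1, which is what makes marginals tractable.
        if self.var not in assignment:
            return 1.0
        return self.p if assignment[self.var] else 1.0 - self.p

class Sum:
    """Weighted mixture over children (smoothness: same variable scope)."""
    def __init__(self, weights, children):
        self.weights, self.children = weights, children
    def value(self, assignment):
        return sum(w * c.value(assignment)
                   for w, c in zip(self.weights, self.children))

class Product:
    """Factorization over children (decomposability: disjoint scopes)."""
    def __init__(self, children):
        self.children = children
    def value(self, assignment):
        out = 1.0
        for c in self.children:
            out *= c.value(assignment)
        return out

# p(X1, X2) as a mixture of two fully factorized components.
circuit = Sum(
    [0.3, 0.7],
    [Product([Leaf("X1", 0.9), Leaf("X2", 0.2)]),
     Product([Leaf("X1", 0.1), Leaf("X2", 0.6)])],
)

full = circuit.value({"X1": True, "X2": False})  # p(X1=1, X2=0) = 0.244
marginal = circuit.value({"X1": True})           # p(X1=1) = 0.34
```

Both queries cost one pass over the circuit, linear in its size; the paper's contribution is characterizing which further operations (products of circuits, quotients, powers, logarithms, exponentials) stay tractable under which structural properties, which this toy example does not cover.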
Related papers
- Hidden Holes: topological aspects of language models [1.1172147007388977]
We study the evolution of topological structure in GPT-based large language models across depth and time during training.
We show that the latter exhibit more topological complexity, with a distinct pattern of changes common to all natural languages but absent from synthetically generated data.
arXiv Detail & Related papers (2024-06-09T14:25:09Z)
- Towards Understanding How Transformer Perform Multi-step Reasoning with Matching Operation [52.77133661679439]
Investigating internal reasoning mechanisms of large language models can help us design better model architectures and training strategies.
We investigate the matching mechanism employed by Transformers for multi-step reasoning on a constructed dataset.
We propose a conjecture on the upper bound of the model's reasoning ability based on this phenomenon.
arXiv Detail & Related papers (2024-05-24T07:41:26Z)
- Explaining Text Similarity in Transformer Models [52.571158418102584]
Recent advances in explainable AI have made it possible to mitigate limitations by leveraging improved explanations for Transformers.
We use BiLRP, an extension developed for computing second-order explanations in bilinear similarity models, to investigate which feature interactions drive similarity in NLP models.
Our findings contribute to a deeper understanding of different semantic similarity tasks and models, highlighting how novel explainable AI methods enable in-depth analyses and corpus-level insights.
arXiv Detail & Related papers (2024-05-10T17:11:31Z)
- Shape Arithmetic Expressions: Advancing Scientific Discovery Beyond Closed-Form Equations [56.78271181959529]
Generalized Additive Models (GAMs) can capture non-linear relationships between variables and targets, but they cannot capture intricate feature interactions.
We propose Shape Expressions Arithmetic (SHAREs), which fuses GAMs' flexible shape functions with the complex feature interactions found in mathematical expressions.
We also design a set of rules for constructing SHAREs that guarantee transparency of the found expressions beyond the standard constraints.
arXiv Detail & Related papers (2024-04-15T13:44:01Z)
- Structured World Representations in Maze-Solving Transformers [3.75591091941815]
This work focuses on the abstractions formed by small transformer models.
We find evidence for the consistent emergence of structured internal representations of maze topology and valid paths.
We also take steps towards deciphering the circuitry of path-following by identifying attention heads.
arXiv Detail & Related papers (2023-12-05T08:24:26Z)
- Towards Interpretable Sequence Continuation: Analyzing Shared Circuits in Large Language Models [9.56229382432426]
This research aims to reverse engineer transformer models into human-readable representations that implement algorithmic functions.
By applying circuit interpretability analysis, we identify a key sub-circuit in both GPT-2 Small and Llama-2-7B.
We show that this sub-circuit affects various math-related prompts, such as intervaled circuits, Spanish number word and months continuation, and natural language word problems.
arXiv Detail & Related papers (2023-11-07T16:58:51Z)
- Query Structure Modeling for Inductive Logical Reasoning Over Knowledge Graphs [67.043747188954]
We propose a structure-modeled textual encoding framework for inductive logical reasoning over KGs.
It encodes linearized query structures and entities using pre-trained language models to find answers.
We conduct experiments on two inductive logical reasoning datasets and three transductive datasets.
arXiv Detail & Related papers (2023-05-23T01:25:29Z)
- On Neural Architecture Inductive Biases for Relational Tasks [76.18938462270503]
We introduce a simple architecture based on similarity-distribution scores, which we name Compositional Relational Network (CoRelNet).
We find that simple architectural choices can outperform existing models in out-of-distribution generalization.
arXiv Detail & Related papers (2022-06-09T16:24:01Z)
- Transformer Grammars: Augmenting Transformer Language Models with Syntactic Inductive Biases at Scale [31.293175512404172]
We introduce Transformer Grammars -- a class of Transformer language models that combine the expressive power, scalability, and strong performance of Transformers.
We find that Transformer Grammars outperform various strong baselines on multiple syntax-sensitive language modeling evaluation metrics.
arXiv Detail & Related papers (2022-03-01T17:22:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.