A Compositional Atlas of Tractable Circuit Operations: From Simple
Transformations to Complex Information-Theoretic Queries
- URL: http://arxiv.org/abs/2102.06137v1
- Date: Thu, 11 Feb 2021 17:26:32 GMT
- Title: A Compositional Atlas of Tractable Circuit Operations: From Simple
Transformations to Complex Information-Theoretic Queries
- Authors: Antonio Vergari, YooJung Choi, Anji Liu, Stefano Teso, Guy Van den
Broeck
- Abstract summary: We show how complex inference scenarios for machine learning can be represented in terms of tractable modular operations over circuits.
We derive a unified framework for reasoning about tractable models that generalizes several results in the literature and opens up novel tractable inference scenarios.
- Score: 44.36335714431731
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Circuit representations are becoming the lingua franca to express and reason
about tractable generative and discriminative models. In this paper, we show
how complex inference scenarios for these models that commonly arise in machine
learning -- from computing the expectations of decision tree ensembles to
information-theoretic divergences of deep mixture models -- can be represented
in terms of tractable modular operations over circuits. Specifically, we
characterize the tractability of a vocabulary of simple transformations --
sums, products, quotients, powers, logarithms, and exponentials -- in terms of
sufficient structural constraints of the circuits they operate on, and present
novel hardness results for the cases in which these properties are not
satisfied. Building on these operations, we derive a unified framework for
reasoning about tractable models that generalizes several results in the
literature and opens up novel tractable inference scenarios.
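To make this vocabulary concrete, below is a minimal Python sketch, written for this summary rather than taken from the paper, of circuits as computation graphs of weighted sum units, product units, and input leaves over discrete variables. It illustrates why the product of two circuits is tractable when they are compatible (smooth, decomposable, and decomposing their scopes in the same way): the recursion only ever pairs nodes with matching scopes, so the result has at most |A|·|B| nodes. All names are illustrative choices, not the paper's API.

```python
# Illustrative sketch (not the paper's code): circuits as DAGs of weighted
# sums, products, and input leaves, plus a tractable product construction.
import math
from dataclasses import dataclass
from typing import Dict, Tuple, Union

@dataclass(frozen=True)
class Leaf:
    var: str                      # input unit over one discrete variable
    table: Tuple[float, ...]      # f(var=0), f(var=1), ...

@dataclass(frozen=True)
class Sum:
    weights: Tuple[float, ...]    # smoothness: children share the same scope
    children: Tuple["Node", ...]

@dataclass(frozen=True)
class Product:
    children: Tuple["Node", ...]  # decomposability: children have disjoint scopes

Node = Union[Leaf, Sum, Product]

def evaluate(n: Node, x: Dict[str, int]) -> float:
    """Evaluate the circuit bottom-up on a complete assignment x."""
    if isinstance(n, Leaf):
        return n.table[x[n.var]]
    if isinstance(n, Sum):
        return sum(w * evaluate(c, x) for w, c in zip(n.weights, n.children))
    return math.prod(evaluate(c, x) for c in n.children)

def multiply(a: Node, b: Node) -> Node:
    """Build a circuit computing a(x) * b(x).

    Assumes a and b are compatible: leaves pair with leaves over the same
    variable, sums with sums, and products with scope-aligned products.
    The output then has at most |a| * |b| nodes, which is what makes the
    product tractable; without compatibility it is hard in general.
    """
    if isinstance(a, Leaf) and isinstance(b, Leaf) and a.var == b.var:
        return Leaf(a.var, tuple(u * v for u, v in zip(a.table, b.table)))
    if isinstance(a, Sum) and isinstance(b, Sum):
        # cross product of mixtures:
        # (sum_i w_i c_i) * (sum_j v_j d_j) = sum_ij w_i v_j (c_i * d_j)
        ws, cs = [], []
        for wa, ca in zip(a.weights, a.children):
            for wb, cb in zip(b.weights, b.children):
                ws.append(wa * wb)
                cs.append(multiply(ca, cb))
        return Sum(tuple(ws), tuple(cs))
    if isinstance(a, Product) and isinstance(b, Product):
        return Product(tuple(multiply(ca, cb)
                             for ca, cb in zip(a.children, b.children)))
    raise ValueError("circuits are not compatible")

# Sanity check on two tiny compatible mixtures over binary variables X, Y.
def mixture(px0, py0, qx0, qy0, w):
    return Sum((w, 1 - w),
               (Product((Leaf("X", (px0, 1 - px0)), Leaf("Y", (py0, 1 - py0)))),
                Product((Leaf("X", (qx0, 1 - qx0)), Leaf("Y", (qy0, 1 - qy0))))))

a = mixture(0.2, 0.7, 0.9, 0.4, 0.5)
b = mixture(0.6, 0.1, 0.3, 0.8, 0.25)
ab = multiply(a, b)
for xv in range(2):
    for yv in range(2):
        pt = {"X": xv, "Y": yv}
        assert abs(evaluate(ab, pt) - evaluate(a, pt) * evaluate(b, pt)) < 1e-12
```

Sums of circuits are trivially tractable (wrap the inputs in a new sum unit); the paper characterizes the remaining operations -- quotients, powers, logarithms, and exponentials -- in the same style, with further structural properties such as determinism entering the sufficient conditions.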
Related papers
- Complexity Control Facilitates Reasoning-Based Compositional Generalization in Transformers [10.206921909332006]
This study investigates the internal mechanisms underlying Transformers' behavior in compositional tasks.
We find that complexity control strategies influence whether the model learns primitive-level rules that generalize out-of-distribution (reasoning-based solutions) or relies solely on memorized mappings (memory-based solutions).
arXiv Detail & Related papers (2025-01-15T02:54:52Z)
- A Compositional Atlas for Algebraic Circuits [35.95450187283255]
We show that a large class of queries corresponds to combinations of basic operators over semirings: aggregation, product, and elementwise mapping.
Applying our analysis, we derive novel tractability conditions for many such compositional queries (a small sketch of this semiring view follows this list).
arXiv Detail & Related papers (2024-12-07T00:51:46Z)
- Toward Understanding In-context vs. In-weight Learning [50.24035812301655]
We identify simplified distributional properties that give rise to the emergence and disappearance of in-context learning.
We then extend the study to a full large language model, showing how fine-tuning on various collections of natural language prompts can elicit similar in-context and in-weight learning behaviour.
arXiv Detail & Related papers (2024-10-30T14:09:00Z)
- Circuit Compositions: Exploring Modular Structures in Transformer-Based Language Models [22.89563355840371]
We study the modularity of neural networks by analyzing circuits for highly compositional subtasks within a language model.
Our results indicate that functionally similar circuits exhibit both notable node overlap and cross-task faithfulness.
arXiv Detail & Related papers (2024-10-02T11:36:45Z)
- Explaining Text Similarity in Transformer Models [52.571158418102584]
Recent advances in explainable AI make it possible to mitigate the limited interpretability of similarity models by leveraging improved explanations for Transformers.
We use BiLRP, an extension developed for computing second-order explanations in bilinear similarity models, to investigate which feature interactions drive similarity in NLP models.
Our findings contribute to a deeper understanding of different semantic similarity tasks and models, highlighting how novel explainable AI methods enable in-depth analyses and corpus-level insights.
arXiv Detail & Related papers (2024-05-10T17:11:31Z)
- Shape Arithmetic Expressions: Advancing Scientific Discovery Beyond Closed-Form Equations [56.78271181959529]
Generalized Additive Models (GAMs) can capture non-linear relationships between variables and targets, but they cannot capture intricate feature interactions.
We propose Shape Expressions Arithmetic (SHAREs), which fuses GAMs' flexible shape functions with the complex feature interactions found in mathematical expressions.
We also design a set of rules for constructing SHAREs that guarantee transparency of the found expressions beyond the standard constraints.
arXiv Detail & Related papers (2024-04-15T13:44:01Z)
- Towards Empirical Interpretation of Internal Circuits and Properties in Grokked Transformers on Modular Polynomials [29.09237503747052]
Transformers that grok modular addition are known to implement a Fourier representation and calculation circuits based on trigonometric identities.
We show that transferability among models grokked on different operations is limited to specific combinations of operations.
Some multi-task mixtures lead to co-grokking, where grokking happens simultaneously for all tasks.
arXiv Detail & Related papers (2024-02-26T16:48:12Z)
- Uncovering Intermediate Variables in Transformers using Circuit Probing [28.81226181942661]
We propose a new analysis technique - circuit probing - that automatically uncovers low-level circuits that compute hypothesized intermediate variables.
We apply this method to models trained on simple arithmetic tasks, demonstrating its effectiveness at (1) deciphering the algorithms that models have learned, (2) revealing modular structure within a model, and (3) tracking the development of circuits over training.
arXiv Detail & Related papers (2023-11-07T21:27:17Z)
- Query Structure Modeling for Inductive Logical Reasoning Over Knowledge Graphs [67.043747188954]
We propose a structure-modeled textual encoding framework for inductive logical reasoning over KGs.
It encodes linearized query structures and entities using pre-trained language models to find answers.
We conduct experiments on two inductive logical reasoning datasets and three transductive datasets.
arXiv Detail & Related papers (2023-05-23T01:25:29Z)
- On Neural Architecture Inductive Biases for Relational Tasks [76.18938462270503]
We introduce a simple architecture based on similarity-distribution scores, which we name Compositional Relational Network (CoRelNet).
We find that simple architectural choices can outperform existing models in out-of-distribution generalization.
arXiv Detail & Related papers (2022-06-09T16:24:01Z)
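As a companion to the algebraic-circuits entry above, here is a small self-contained sketch (again our own illustration, not code from any of the cited papers) of the semiring view: one circuit traversal answers different queries when aggregation and product are reinterpreted over different semirings, while elementwise mappings would act on the leaf tables.

```python
# Illustrative sketch of the semiring view of circuit queries. A circuit is
# a nested tuple:
#   ("leaf", var, table) | ("sum", weights, children) | ("prod", children)
# Swapping the semiring turns one traversal into different queries.
from dataclasses import dataclass
from functools import reduce
from typing import Callable

@dataclass(frozen=True)
class Semiring:
    add: Callable[[float, float], float]  # aggregation
    mul: Callable[[float, float], float]  # combination
    zero: float
    one: float

SUM_PRODUCT = Semiring(lambda u, v: u + v, lambda u, v: u * v, 0.0, 1.0)
MAX_PRODUCT = Semiring(max, lambda u, v: u * v, 0.0, 1.0)

def eliminate(node, s: Semiring) -> float:
    """Aggregate out every variable under semiring s.

    Under SUM_PRODUCT this computes the circuit's total mass (the partition
    function); under MAX_PRODUCT it computes the MAP value, exact when the
    circuit's sum nodes are deterministic.
    """
    kind = node[0]
    if kind == "leaf":
        _, _, table = node
        return reduce(s.add, table, s.zero)
    if kind == "sum":
        _, weights, children = node
        vals = (s.mul(w, eliminate(c, s)) for w, c in zip(weights, children))
        return reduce(s.add, vals, s.zero)
    _, children = node
    return reduce(s.mul, (eliminate(c, s) for c in children), s.one)

# A deterministic mixture over one binary variable X: the two sum children
# put mass on disjoint states, so MAX_PRODUCT is exact here.
circuit = ("sum", (0.3, 0.7),
           (("leaf", "X", (1.0, 0.0)),   # supports X = 0
            ("leaf", "X", (0.0, 1.0))))  # supports X = 1
print(eliminate(circuit, SUM_PRODUCT))   # 1.0: total probability mass
print(eliminate(circuit, MAX_PRODUCT))   # 0.7: probability of the MAP state X=1
```

Swapping SUM_PRODUCT for MAX_PRODUCT changes the query from marginalization to maximization without touching the circuit itself, which is the kind of compositionality that entry refers to.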