Generating Interpretable Networks using Hypernetworks
- URL: http://arxiv.org/abs/2312.03051v1
- Date: Tue, 5 Dec 2023 18:55:32 GMT
- Title: Generating Interpretable Networks using Hypernetworks
- Authors: Isaac Liao, Ziming Liu, Max Tegmark
- Abstract summary: We explore the possibility of using hypernetworks to generate interpretable networks whose underlying algorithms are not yet known.
For the task of computing L1 norms, hypernetworks find three algorithms: (a) the double-sided algorithm, (b) the convexity algorithm, (c) the pudding algorithm.
We show that a trained hypernetwork can correctly construct models for input dimensions not seen in training, demonstrating systematic generalization.
- Score: 16.876961991785507
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: An essential goal in mechanistic interpretability is to decode a network, i.e.,
to convert a neural network's raw weights to an interpretable algorithm. Given
the difficulty of the decoding problem, progress has been made to understand
the easier encoding problem, i.e., to convert an interpretable algorithm into
network weights. Previous works focus on encoding existing algorithms into
networks, which are interpretable by definition. However, focusing on encoding
limits the possibility of discovering new algorithms that humans have never
stumbled upon, but that are nevertheless interpretable. In this work, we
explore the possibility of using hypernetworks to generate interpretable
networks whose underlying algorithms are not yet known. The hypernetwork is
carefully designed such that it can control network complexity, leading to a
diverse family of interpretable algorithms ranked by their complexity. All of
them are interpretable in hindsight, although some of them are less intuitive
to humans, hence providing new insights regarding how to "think" like a neural
network. For the task of computing L1 norms, hypernetworks find three
algorithms: (a) the double-sided algorithm, (b) the convexity algorithm, (c)
the pudding algorithm, although only the first algorithm was expected by the
authors before experiments. We automatically classify these algorithms and
analyze how these algorithmic phases develop during training, as well as how
they are affected by complexity control. Furthermore, we show that a trained
hypernetwork can correctly construct models for input dimensions not seen in
training, demonstrating systematic generalization.
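The encoding direction the abstract builds on can be illustrated with a minimal sketch: a hypernetwork that emits the weights of a small one-hidden-layer ReLU network trained to compute the L1 norm, with a weight penalty standing in for the paper's complexity control. The per-unit embedding scheme, layer sizes, and penalty coefficient below are illustrative assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class HyperNet(nn.Module):
    """Maps learned per-unit embeddings to the weights of a small target
    network f(x) = w2 . relu(W1 x) meant to compute the L1 norm."""

    def __init__(self, n_in, n_hidden, emb_dim=8):
        super().__init__()
        # one learned embedding per hidden unit of the generated network
        self.unit_emb = nn.Parameter(torch.randn(n_hidden, emb_dim))
        self.gen_w1 = nn.Linear(emb_dim, n_in)       # emits one row of W1
        self.gen_w2 = nn.Linear(emb_dim, 1)          # emits one entry of w2

    def forward(self):
        W1 = self.gen_w1(self.unit_emb)              # (n_hidden, n_in)
        w2 = self.gen_w2(self.unit_emb).squeeze(-1)  # (n_hidden,)
        return W1, w2

def target_forward(x, W1, w2):
    # the generated network: a one-hidden-layer ReLU net
    return torch.relu(x @ W1.T) @ w2

n_in, n_hidden = 4, 8
hyper = HyperNet(n_in, n_hidden)
opt = torch.optim.Adam(hyper.parameters(), lr=1e-2)

for step in range(2000):
    x = torch.randn(64, n_in)
    W1, w2 = hyper()
    loss = ((target_forward(x, W1, w2) - x.abs().sum(dim=1)) ** 2).mean()
    # an L2 penalty on the generated weights is a crude stand-in for the
    # paper's complexity control, not its actual mechanism
    loss = loss + 1e-4 * (W1.pow(2).sum() + w2.pow(2).sum())
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In the expected "double-sided" solution, the generated first layer contains paired rows +e_i and -e_i with unit output weights, since ReLU(x_i) + ReLU(-x_i) = |x_i|; the other algorithms the abstract names would show up as different weight patterns under stronger complexity control.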
Related papers
- The Deep Equilibrium Algorithmic Reasoner [20.375241527453447]
We show that graph neural networks (GNNs) can learn to execute classical algorithms.
We conjecture and empirically validate that one can train a network to solve algorithmic problems by directly finding the equilibrium.
arXiv Detail & Related papers (2024-02-09T14:46:50Z)
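A minimal sketch of the "directly finding the equilibrium" idea, assuming a toy vector-valued reasoner rather than the paper's graph processor: iterate a learned update to a fixed point instead of unrolling a preset number of steps, then read the answer off the equilibrium state. The naive forward iteration (no implicit differentiation) and the layer sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class EquilibriumReasoner(nn.Module):
    """Iterates h <- f(h, x) until convergence instead of running a
    fixed number of processing steps."""

    def __init__(self, dim):
        super().__init__()
        self.update = nn.Sequential(nn.Linear(2 * dim, dim), nn.Tanh())
        self.readout = nn.Linear(dim, 1)

    def forward(self, x, max_iters=50, tol=1e-4):
        h = torch.zeros_like(x)
        for _ in range(max_iters):
            h_next = self.update(torch.cat([h, x], dim=-1))
            if (h_next - h).abs().max() < tol:   # reached the equilibrium
                h = h_next
                break
            h = h_next
        return self.readout(h)

model = EquilibriumReasoner(dim=16)
out = model(torch.randn(32, 16))   # (32, 1) answers read off the fixed point
```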
- Convolutional neural network based decoders for surface codes [0.0]
This work reports a study of decoders based on convolutional neural networks, tested on different code distances and noise models.
The results show that decoders based on convolutional neural networks have good performance and can adapt to different noise models.
arXiv Detail & Related papers (2023-12-06T14:07:31Z)
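A minimal sketch of a CNN-based decoder in the spirit of that entry, assuming syndromes arrive as a d x d bit grid and the decoder predicts one of four logical error classes; the code distance, channel counts, and class set are illustrative, not the paper's configuration.

```python
import torch
import torch.nn as nn

class SyndromeDecoder(nn.Module):
    """Maps a d x d grid of syndrome bits to one of four logical error
    classes (I, X, Z, Y), a common CNN-decoder formulation."""

    def __init__(self, distance=5, n_classes=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.head = nn.Linear(32 * distance * distance, n_classes)

    def forward(self, syndrome):          # (batch, 1, d, d) of 0/1 bits
        z = self.features(syndrome)
        return self.head(z.flatten(1))    # logits over error classes

decoder = SyndromeDecoder(distance=5)
fake_syndromes = torch.randint(0, 2, (8, 1, 5, 5)).float()
logits = decoder(fake_syndromes)          # (8, 4)
```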
- The Clock and the Pizza: Two Stories in Mechanistic Explanation of Neural Networks [59.26515696183751]
We show that algorithm discovery in neural networks is sometimes more complex than recovering a single known algorithm.
We show that even simple learning problems can admit a surprising diversity of solutions.
arXiv Detail & Related papers (2023-06-30T17:59:13Z)
- Towards Better Out-of-Distribution Generalization of Neural Algorithmic Reasoning Tasks [51.8723187709964]
We study the OOD generalization of neural algorithmic reasoning tasks.
The goal is to learn an algorithm from input-output pairs using deep neural networks.
arXiv Detail & Related papers (2022-11-01T18:33:20Z)
- Learning with Differentiable Algorithms [6.47243430672461]
This thesis explores combining classic algorithms and machine learning systems like neural networks.
The thesis formalizes the idea of algorithmic supervision, which allows a neural network to learn from or in conjunction with an algorithm.
In addition, this thesis proposes differentiable algorithms, such as differentiable sorting networks, differentiable sorting gates, and differentiable logic gate networks.
arXiv Detail & Related papers (2022-09-01T17:30:00Z)
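One of the relaxations that thesis names, differentiable sorting networks, can be sketched as follows: each hard compare-and-swap in an odd-even sorting network is replaced by a sigmoid-weighted soft min/max, so gradients flow through the sort. The temperature beta and the network layout are illustrative choices, not the thesis's exact construction.

```python
import torch

def soft_cswap(a, b, beta=10.0):
    """Differentiable compare-and-swap: blends (min, max) with a sigmoid
    instead of a hard comparison, so gradients flow through the sort."""
    p = torch.sigmoid(beta * (b - a))      # ~1 when already ordered (a <= b)
    lo = p * a + (1 - p) * b
    hi = p * b + (1 - p) * a
    return lo, hi

def soft_sort(x, beta=10.0):
    """Odd-even transposition network built from soft compare-and-swaps."""
    vals = list(x.unbind(-1))
    n = len(vals)
    for step in range(n):
        for i in range(step % 2, n - 1, 2):
            vals[i], vals[i + 1] = soft_cswap(vals[i], vals[i + 1], beta)
    return torch.stack(vals, dim=-1)

x = torch.tensor([3.0, 1.0, 2.0], requires_grad=True)
y = soft_sort(x)           # approximately [1., 2., 3.]
y.sum().backward()         # gradients reach the unsorted input
```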
- Graph Neural Networks are Dynamic Programmers [0.0]
Graph neural networks (GNNs) are claimed to align with dynamic programming (DP).
Here we show, using methods from theory and abstract algebra, that there exists an intricate connection between GNNs and DP.
arXiv Detail & Related papers (2022-03-29T13:27:28Z)
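The alignment can be illustrated concretely: one Bellman-Ford relaxation is exactly one round of min-plus message passing, so a GNN whose messages add edge weights and whose aggregation is a minimum reproduces the DP update. The toy graph below is an arbitrary example, not one from the paper.

```python
import numpy as np

# toy weighted digraph as (source, target, weight) triples
edges = [(0, 1, 4.0), (0, 2, 1.0), (2, 1, 2.0), (1, 3, 5.0), (2, 3, 8.0)]
n = 4

def bellman_ford_step(dist):
    """One DP relaxation: dist'[v] = min(dist[v], min_u dist[u] + w(u, v))."""
    new = dist.copy()
    for u, v, w in edges:
        new[v] = min(new[v], dist[u] + w)
    return new

def minplus_gnn_layer(dist):
    """Same update as message passing: each edge sends dist[u] + w to its
    target node, and nodes aggregate incoming messages with a min."""
    inbox = {v: [dist[v]] for v in range(n)}
    for u, v, w in edges:
        inbox[v].append(dist[u] + w)
    return np.array([min(inbox[v]) for v in range(n)])

dist = np.array([0.0, np.inf, np.inf, np.inf])   # distances from node 0
for _ in range(n - 1):
    assert np.allclose(bellman_ford_step(dist), minplus_gnn_layer(dist))
    dist = minplus_gnn_layer(dist)
print(dist)   # [0. 3. 1. 8.]
```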
- Artificial Neural Networks generated by Low Discrepancy Sequences [59.51653996175648]
We generate artificial neural networks as random walks on a dense network graph.
Such networks can be trained sparse from scratch, avoiding the expensive procedure of training a dense network and compressing it afterwards.
We demonstrate that the artificial neural networks generated by low discrepancy sequences can achieve an accuracy within reach of their dense counterparts at a much lower computational complexity.
arXiv Detail & Related papers (2021-03-05T08:45:43Z)
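A minimal sketch of the generation idea: use a 2-D Halton low-discrepancy sequence to place a sparse layer's connections quasi-uniformly, so the layer can be trained sparse from scratch. The densities and layer sizes are arbitrary, and the paper's actual construction (random walks on a dense network graph) is more elaborate than this mask.

```python
import numpy as np

def halton(i, base):
    """i-th element of the 1-D Halton low-discrepancy sequence in [0, 1)."""
    f, r = 1.0, 0.0
    while i > 0:
        f /= base
        r += f * (i % base)
        i //= base
    return r

def low_discrepancy_mask(n_out, n_in, n_connections):
    """Sparse 0/1 connectivity mask whose nonzeros are placed by the
    2-D Halton sequence (bases 2 and 3), i.e. quasi-uniformly."""
    mask = np.zeros((n_out, n_in))
    for k in range(1, n_connections + 1):
        row = int(halton(k, 2) * n_out)
        col = int(halton(k, 3) * n_in)
        mask[row, col] = 1.0
    return mask

mask = low_discrepancy_mask(8, 16, 32)    # roughly 25%-dense layer
weights = np.random.randn(8, 16) * mask   # only masked positions carry weights
print(int(mask.sum()), "connections")
```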
- Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem, which estimates integer signals from linear models.
The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning.
We propose a general hyper-accelerated tree search (HATS) algorithm by employing a deep neural network to estimate the optimal heuristic for the underlying simplified memory-bounded A* algorithm.
arXiv Detail & Related papers (2021-01-07T08:00:02Z)
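The core idea, plugging a learned cost-to-go estimate into best-first tree search, can be sketched with a plain A* loop; the toy grid problem and the hand-written stand-in for the neural heuristic are illustrative, and the paper's HATS targets integer least-squares with a memory-bounded A* variant rather than this generic search.

```python
import heapq

def a_star(start, goal, neighbors, learned_h):
    """Best-first search where the cost-to-go estimate h comes from a model."""
    frontier = [(learned_h(start), 0.0, start, [start])]
    best_g = {start: 0.0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        for nxt, step_cost in neighbors(node):
            g_next = g + step_cost
            if g_next < best_g.get(nxt, float("inf")):
                best_g[nxt] = g_next
                heapq.heappush(frontier,
                               (g_next + learned_h(nxt), g_next, nxt, path + [nxt]))
    return None, float("inf")

def grid_neighbors(p):
    """4-connected moves on a 5x5 grid, unit step cost."""
    steps = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    return [((p[0] + dx, p[1] + dy), 1.0) for dx, dy in steps
            if 0 <= p[0] + dx < 5 and 0 <= p[1] + dy < 5]

def manhattan_to_goal(p):
    """Stand-in for a trained network's cost-to-go estimate."""
    return abs(4 - p[0]) + abs(4 - p[1])

path, cost = a_star((0, 0), (4, 4), grid_neighbors, manhattan_to_goal)
print(cost)   # 8.0
```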
- Learning to Accelerate Heuristic Searching for Large-Scale Maximum Weighted b-Matching Problems in Online Advertising [51.97494906131859]
Bipartite b-matching is fundamental in algorithm design, and has been widely applied in economic markets, labor markets, etc.
Existing exact and approximate algorithms usually fail in such settings because they require either an intolerable running time or too many computational resources.
We propose NeuSearcher, which leverages knowledge learned from previous instances to solve new problem instances.
arXiv Detail & Related papers (2020-05-09T02:48:23Z)
- Lagrangian Decomposition for Neural Network Verification [148.0448557991349]
A fundamental component of neural network verification is the computation of bounds on the values their outputs can take.
We propose a novel approach based on Lagrangian Decomposition.
We show that we obtain bounds comparable with off-the-shelf solvers in a fraction of their running time.
arXiv Detail & Related papers (2020-02-24T17:55:10Z)
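The bound computation that entry refers to can be illustrated with the simplest such procedure, interval bound propagation over an input box; the paper's Lagrangian decomposition yields tighter bounds by optimizing a dual problem, which this sketch does not attempt. The toy network below is arbitrary.

```python
import numpy as np

def interval_bounds(lower, upper, weights, biases):
    """Propagate elementwise input bounds through affine + ReLU layers.
    Splitting W into positive and negative parts gives the worst case of
    each output coordinate over the input box."""
    for i, (W, b) in enumerate(zip(weights, biases)):
        W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
        new_lower = W_pos @ lower + W_neg @ upper + b
        new_upper = W_pos @ upper + W_neg @ lower + b
        lower, upper = new_lower, new_upper
        if i < len(weights) - 1:          # ReLU on hidden layers only
            lower, upper = np.maximum(lower, 0.0), np.maximum(upper, 0.0)
    return lower, upper

# toy 2-layer network and an input box x in [x0 - eps, x0 + eps]
rng = np.random.default_rng(0)
weights = [rng.standard_normal((8, 4)), rng.standard_normal((2, 8))]
biases = [np.zeros(8), np.zeros(2)]
x0, eps = rng.standard_normal(4), 0.1
lo, hi = interval_bounds(x0 - eps, x0 + eps, weights, biases)
print(lo, hi)   # guaranteed (loose) bounds on the two network outputs
```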
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.