Folding over Neural Networks
- URL: http://arxiv.org/abs/2207.01090v1
- Date: Sun, 3 Jul 2022 18:20:05 GMT
- Title: Folding over Neural Networks
- Authors: Minh Nguyen and Nicolas Wu
- Abstract summary: This paper shows how structured recursion can be used to represent neural networks in Haskell.
In turn, we promote a coherent implementation of neural networks that delineates between their structure and semantics.
- Score: 1.7818230914983044
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural networks are typically represented as data structures that are
traversed either through iteration or by manual chaining of method calls.
However, a deeper analysis reveals that structured recursion can be used
instead, so that traversal is directed by the structure of the network itself.
This paper shows how such an approach can be realised in Haskell, by encoding
neural networks as recursive data types, and then their training as recursion
scheme patterns. In turn, we promote a coherent implementation of neural
networks that delineates between their structure and semantics, allowing for
compositionality in both how they are built and how they are trained.
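The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's code: the names `Layer`, `DenseLayer`, `forwardAlg`, and `paramCountAlg` are hypothetical, and a ReLU activation on plain lists is assumed for brevity. The network's structure is a recursive data type (the fixed point of a layer functor), and each semantics is a separate algebra passed to one generic fold.

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- Fixed point of a functor: ties the recursive knot of the layer functor.
newtype Fix f = In { out :: f (Fix f) }

-- Catamorphism: the traversal is directed by the structure itself.
cata :: Functor f => (f a -> a) -> Fix f -> a
cata alg = alg . fmap (cata alg) . out

-- Structure only (hypothetical): a network is either the input layer or a
-- dense layer stacked on the layers before it (the parameter k).
data Layer k
  = InputLayer
  | DenseLayer [[Double]] [Double] k  -- weight rows, biases, previous layers
  deriving Functor

type Network = Fix Layer

inputLayer :: Network
inputLayer = In InputLayer

dense :: [[Double]] -> [Double] -> Network -> Network
dense w b prev = In (DenseLayer w b prev)

relu :: Double -> Double
relu = max 0

-- Semantics 1: forward propagation. The carrier is a function from the
-- network input to the output of the current layer.
forwardAlg :: Layer ([Double] -> [Double]) -> ([Double] -> [Double])
forwardAlg InputLayer = id
forwardAlg (DenseLayer w b prev) = \x ->
  let h = prev x  -- output of the earlier layers
  in zipWith (\row bias -> relu (sum (zipWith (*) row h) + bias)) w b

forward :: Network -> [Double] -> [Double]
forward = cata forwardAlg

-- Semantics 2: counting parameters, reusing the same structure and fold.
paramCountAlg :: Layer Int -> Int
paramCountAlg InputLayer = 0
paramCountAlg (DenseLayer w b n) = sum (map length w) + length b + n

-- A two-layer example: an identity layer followed by one output neuron.
example :: Network
example = dense [[1, -1]] [0.5] (dense [[1, 0], [0, 1]] [0, 0] inputLayer)
```

Under these assumptions, `forward example [3, 1]` evaluates to `[2.5]` and `cata paramCountAlg example` to `9`. Only the algebra changes between the two interpretations, which is the separation of structure from semantics the abstract describes.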
Related papers
- Coding schemes in neural networks learning classification tasks [52.22978725954347]
We investigate fully-connected, wide neural networks learning classification tasks.
We show that the networks acquire strong, data-dependent features.
Surprisingly, the nature of the internal representations depends crucially on the neuronal nonlinearity.
arXiv Detail & Related papers (2024-06-24T14:50:05Z)
- Can Transformers Learn to Solve Problems Recursively? [9.5623664764386]
This paper examines the behavior of neural networks learning algorithms relevant to programs and formal verification.
By reconstructing these algorithms, we are able to correctly predict 91 percent of failure cases for one of the approximated functions.
arXiv Detail & Related papers (2023-05-24T04:08:37Z)
- Bayesian Detection of Mesoscale Structures in Pathway Data on Graphs [0.0]
Mesoscale structures are an integral part of the abstraction and analysis of complex systems.
They can represent communities in social or citation networks, roles in corporate interactions, or core-periphery structures in transportation networks.
We derive a Bayesian approach that simultaneously models the optimal partitioning of nodes in groups and the optimal higher-order network dynamics.
arXiv Detail & Related papers (2023-01-16T12:45:33Z) - Credit Assignment for Trained Neural Networks Based on Koopman Operator
Theory [3.130109807128472]
The credit assignment problem for neural networks refers to evaluating the contribution of each network component to the final outputs.
This paper presents an alternative, linear-dynamics perspective on the credit assignment problem for trained neural networks.
Experiments conducted on typical neural networks demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-12-02T06:34:27Z) - A Recursively Recurrent Neural Network (R2N2) Architecture for Learning
Iterative Algorithms [64.3064050603721]
We generalize the Runge-Kutta neural network to a recurrent neural network (R2N2) superstructure for the design of customized iterative algorithms.
We demonstrate that regular training of the weight parameters inside the proposed superstructure on input/output data of various computational problem classes yields similar iterations to Krylov solvers for linear equation systems, Newton-Krylov solvers for nonlinear equation systems, and Runge-Kutta solvers for ordinary differential equations.
arXiv Detail & Related papers (2022-11-22T16:30:33Z) - Modeling Structure with Undirected Neural Networks [20.506232306308977]
We propose undirected neural networks, a flexible framework for specifying computations that can be performed in any order.
We demonstrate the effectiveness of undirected neural architectures, both unstructured and structured, on a range of tasks.
arXiv Detail & Related papers (2022-02-08T10:06:51Z) - Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z) - Artificial Neural Networks generated by Low Discrepancy Sequences [59.51653996175648]
We generate artificial neural networks as random walks on a dense network graph.
Such networks can be trained sparse from scratch, avoiding the expensive procedure of training a dense network and compressing it afterwards.
We demonstrate that the artificial neural networks generated by low discrepancy sequences can achieve an accuracy within reach of their dense counterparts at a much lower computational complexity.
arXiv Detail & Related papers (2021-03-05T08:45:43Z) - Neural networks adapting to datasets: learning network size and topology [77.34726150561087]
We introduce a flexible setup that allows a neural network to learn both its size and topology during gradient-based training.
The resulting network has the structure of a graph tailored to the particular learning task and dataset.
arXiv Detail & Related papers (2020-06-22T12:46:44Z) - Progressive Graph Convolutional Networks for Semi-Supervised Node
Classification [97.14064057840089]
Graph convolutional networks have been successful in addressing graph-based tasks such as semi-supervised node classification.
We propose a method to automatically build compact and task-specific graph convolutional networks.
arXiv Detail & Related papers (2020-03-27T08:32:16Z) - Investigating the Compositional Structure Of Deep Neural Networks [1.8899300124593645]
We introduce a novel theoretical framework based on the compositional structure of piecewise linear activation functions.
It is possible to characterize the instances of the input data with respect to both the predicted label and the specific (linear) transformation used to perform predictions.
Preliminary tests on the MNIST dataset show that our method can group input instances with regard to their similarity in the internal representation of the neural network.
arXiv Detail & Related papers (2020-02-17T14:16:17Z)