Modeling Structure with Undirected Neural Networks
- URL: http://arxiv.org/abs/2202.03760v1
- Date: Tue, 8 Feb 2022 10:06:51 GMT
- Title: Modeling Structure with Undirected Neural Networks
- Authors: Tsvetomila Mihaylova, Vlad Niculae, André F. T. Martins
- Abstract summary: We propose undirected neural networks, a flexible framework for specifying computations that can be performed in any order.
We demonstrate the effectiveness of undirected neural architectures, both unstructured and structured, on a range of tasks.
- Score: 20.506232306308977
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural networks are powerful function estimators, leading to their status as
a paradigm of choice for modeling structured data. However, unlike other
structured representations that emphasize the modularity of the problem --
e.g., factor graphs -- neural networks are usually monolithic mappings from
inputs to outputs, with a fixed computation order. This limitation prevents
them from capturing different directions of computation and interaction between
the modeled variables.
In this paper, we combine the representational strengths of factor graphs and
of neural networks, proposing undirected neural networks (UNNs): a flexible
framework for specifying computations that can be performed in any order. For
particular choices, our proposed models subsume and extend many existing
architectures: feed-forward, recurrent, self-attention networks, auto-encoders,
and networks with implicit layers. We demonstrate the effectiveness of
undirected neural architectures, both unstructured and structured, on a range
of tasks: tree-constrained dependency parsing, convolutional image
classification, and sequence completion with attention. By varying the
computation order, we show how a single UNN can be used both as a classifier
and a prototype generator, and how it can fill in missing parts of an input
sequence, making UNNs a promising direction for further research.
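To make the framework concrete, the following is a minimal sketch, assuming a single quadratic energy over input, hidden, and output blocks; it only illustrates the coordinate-update idea described in the abstract, not the authors' released code, and the factor shapes, anchor terms, and update counts are invented here.

```python
# Assumed sketch of an undirected neural network as a factor-graph energy
# over blocks (x, h, y), minimized by block coordinate descent whose order
# can be chosen freely. Not the paper's implementation.
import numpy as np

rng = np.random.default_rng(0)
dx, dh, dy = 8, 16, 3
W1 = rng.normal(scale=0.1, size=(dx, dh))   # factor coupling x and h
W2 = rng.normal(scale=0.1, size=(dh, dy))   # factor coupling h and y

def energy(x, h, y):
    # pairwise factor scores plus quadratic anchors on each block
    return (-x @ W1 @ h - h @ W2 @ y
            + 0.5 * (x @ x) + 0.5 * (h @ h) + 0.5 * (y @ y))

# Closed-form block minimizers of this quadratic energy.
def update_h(x, y):
    return W1.T @ x + W2 @ y

def update_y(h):
    return W2.T @ h

def update_x(h):
    return W1 @ h

# Order 1: clamp x, sweep toward y (classifier-like direction).
x = rng.normal(size=dx)
h, y = np.zeros(dh), np.zeros(dy)
for _ in range(10):
    h = update_h(x, y)
    y = update_y(h)
print("energy, x -> y order:", energy(x, h, y))

# Order 2: clamp y, sweep toward x (prototype-generator direction).
y = np.eye(dy)[0]
h, x = np.zeros(dh), np.zeros(dx)
for _ in range(10):
    h = update_h(x, y)
    x = update_x(h)
print("energy, y -> x order:", energy(x, h, y))
```

The same parameters serve both directions: clamping x and sweeping toward y behaves like a classifier, while clamping y and sweeping toward x behaves like a prototype generator, mirroring the abstract's claim about varying the computation order.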
Related papers
- Semantic Loss Functions for Neuro-Symbolic Structured Prediction [74.18322585177832]
We discuss the semantic loss, which injects symbolically defined knowledge about output structure into training.
It is agnostic to the arrangement of the symbols, and depends only on the semantics expressed thereby.
It can be combined with both discriminative and generative neural models.
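As a hedged illustration of this recipe, the sketch below computes a semantic loss for an "exactly one of k" constraint by summing the probability mass of the satisfying assignments; the constraint, dimensions, and numerical tolerance are assumptions made here, not the paper's implementation.

```python
# Semantic-loss sketch: negative log of the probability mass the model
# places on assignments satisfying a symbolic constraint (here: exactly
# one of k binary variables is on). Illustrative assumption only.
import numpy as np

def semantic_loss_exactly_one(p):
    """p: independent Bernoulli probabilities for k binary variables."""
    p = np.asarray(p, dtype=float)
    mass = 0.0
    for i in range(len(p)):
        # assignment in which only variable i is on
        mass += p[i] * np.prod(np.delete(1.0 - p, i))
    return -np.log(mass + 1e-12)

print(semantic_loss_exactly_one([0.9, 0.05, 0.05]))  # small loss
print(semantic_loss_exactly_one([0.5, 0.5, 0.5]))    # larger loss
```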
arXiv Detail & Related papers (2024-05-12T22:18:25Z)
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
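A rough sketch of the "computational graph of parameters" view, under assumptions made here (nodes are neurons, biases become node features, weights become edge features); the paper's actual featurization and GNN are not reproduced.

```python
# Assumed conversion of an MLP's parameters into a graph a GNN could
# process: one node per neuron, one weighted edge per parameter.
import numpy as np

def mlp_to_graph(weights, biases):
    """weights: list of (d_in, d_out) arrays; biases: list of (d_out,) arrays."""
    sizes = [weights[0].shape[0]] + [W.shape[1] for W in weights]
    offsets = np.cumsum([0] + sizes)          # node index range per layer
    node_feat = np.zeros(offsets[-1])         # bias per neuron (0 for inputs)
    edges, edge_feat = [], []
    for l, (W, b) in enumerate(zip(weights, biases)):
        node_feat[offsets[l + 1]:offsets[l + 2]] = b
        for i in range(W.shape[0]):
            for j in range(W.shape[1]):
                edges.append((offsets[l] + i, offsets[l + 1] + j))
                edge_feat.append(W[i, j])
    return node_feat, np.array(edges), np.array(edge_feat)

rng = np.random.default_rng(0)
Ws = [rng.normal(size=(4, 8)), rng.normal(size=(8, 2))]
bs = [rng.normal(size=8), rng.normal(size=2)]
nodes, edges, ew = mlp_to_graph(Ws, bs)
print(nodes.shape, edges.shape, ew.shape)     # (14,) (48, 2) (48,)
```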
arXiv Detail & Related papers (2024-03-18T18:01:01Z)
- Structured Neural Networks for Density Estimation and Causal Inference [15.63518195860946]
Injecting structure into neural networks enables learning functions that satisfy invariances with respect to subsets of inputs.
We propose the Structured Neural Network (StrNN), which injects structure through masking pathways in a neural network.
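A minimal sketch of structure injection by weight masking, assuming a single layer and a hand-written adjacency matrix; the StrNN mask-factorization algorithm itself is not reproduced here.

```python
# Masking pathways so each output can only depend on the inputs its
# adjacency row allows. Illustrative assumption, not the StrNN code.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out = 4, 3
# adjacency[j, i] = 1 means output j may depend on input i
adjacency = np.array([[1, 1, 0, 0],
                      [0, 1, 1, 0],
                      [0, 0, 1, 1]])
W = rng.normal(size=(d_out, d_in))

def masked_layer(x):
    return np.tanh((W * adjacency) @ x)

x = rng.normal(size=d_in)
x_perturbed = x.copy()
x_perturbed[3] += 10.0               # input 3 may not reach output 0
y, y_pert = masked_layer(x), masked_layer(x_perturbed)
print(np.isclose(y[0], y_pert[0]))   # True: the structure is respected
```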
arXiv Detail & Related papers (2023-11-03T20:15:05Z)
- Set-based Neural Network Encoding Without Weight Tying [91.37161634310819]
We propose a neural network weight encoding method for network property prediction.
Our approach can encode neural networks from a model zoo of mixed architectures.
We introduce two new tasks for neural network property prediction: cross-dataset and cross-architecture.
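A hedged sketch of a set-based weight encoding: every parameter becomes a small token and a permutation-invariant, DeepSets-style pooling yields one fixed-size code regardless of architecture. The token features, dimensions, and random projection weights are assumptions made here for illustration.

```python
# Set-based encoding sketch: tokens = (value, layer index, relative
# position); sum pooling gives one code for any architecture.
import numpy as np

rng = np.random.default_rng(0)

def weights_to_tokens(weight_list):
    tokens = []
    for layer_idx, W in enumerate(weight_list):
        for flat_idx, w in enumerate(W.ravel()):
            tokens.append([w, layer_idx, flat_idx / W.size])
    return np.array(tokens)              # (num_params, 3)

# phi / rho of a DeepSets-style encoder, random weights for the sketch
Phi = rng.normal(scale=0.3, size=(3, 16))
Rho = rng.normal(scale=0.3, size=(16, 8))

def encode(weight_list):
    t = weights_to_tokens(weight_list)
    h = np.tanh(t @ Phi)                 # per-parameter embedding
    return np.tanh(h.sum(axis=0) @ Rho)  # permutation-invariant pooling

# Two networks with different architectures map to same-size codes.
net_a = [rng.normal(size=(4, 8)), rng.normal(size=(8, 2))]
net_b = [rng.normal(size=(10, 5))]
print(encode(net_a).shape, encode(net_b).shape)  # (8,) (8,)
```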
arXiv Detail & Related papers (2023-05-26T04:34:28Z)
- Permutation Equivariant Neural Functionals [92.0667671999604]
This work studies the design of neural networks that can process the weights or gradients of other neural networks.
We focus on the permutation symmetries that arise in the weights of deep feedforward networks because hidden layer neurons have no inherent order.
In our experiments, we find that permutation equivariant neural functionals are effective on a diverse set of tasks.
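A toy illustration, under assumptions made here, of the equivariance constraint such layers must satisfy: permuting the hidden-neuron rows of the input weight matrix permutes the output the same way. Real neural functionals use richer parameter-space layers than this.

```python
# Minimal permutation-equivariant map on a weight matrix: per-row term
# plus a row-invariant pooled term broadcast back to every row.
import numpy as np

rng = np.random.default_rng(0)
a, b = 1.3, -0.7   # the only free scalars of this tiny equivariant layer

def equivariant_layer(W):
    return a * W + b * W.mean(axis=0, keepdims=True)

W = rng.normal(size=(5, 3))              # e.g. hidden-to-output weights
perm = rng.permutation(5)
lhs = equivariant_layer(W[perm])         # permute, then apply layer
rhs = equivariant_layer(W)[perm]         # apply layer, then permute
print(np.allclose(lhs, rhs))             # True: the layer is equivariant
```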
arXiv Detail & Related papers (2023-02-27T18:52:38Z)
- NAR-Former: Neural Architecture Representation Learning towards Holistic Attributes Prediction [37.357949900603295]
We propose a neural architecture representation model that can be used to estimate attributes holistically.
Experiment results show that our proposed framework can be used to predict the latency and accuracy attributes of both cell architectures and whole deep neural networks.
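A loose sketch of the general setup, not NAR-Former's actual tokenization or transformer: an architecture is written as a sequence of operation tokens and a pooled embedding is regressed to a holistic attribute such as latency. The vocabulary, embedding sizes, and regressor below are invented for illustration.

```python
# Architecture-to-attribute sketch: tokenize operations, pool embeddings,
# regress a scalar attribute (here a stand-in "latency").
import numpy as np

rng = np.random.default_rng(0)
OPS = {"conv3x3": 0, "conv1x1": 1, "maxpool": 2, "skip": 3, "output": 4}
emb = rng.normal(scale=0.3, size=(len(OPS), 16))   # token embeddings
head = rng.normal(scale=0.3, size=16)              # attribute regressor

def predict_latency(op_sequence):
    tokens = np.array([OPS[o] for o in op_sequence])
    pooled = emb[tokens].mean(axis=0)               # stand-in for attention
    return float(pooled @ head)

cell = ["conv3x3", "conv1x1", "skip", "maxpool", "output"]
print(predict_latency(cell))
```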
arXiv Detail & Related papers (2022-11-15T10:15:21Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
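For intuition only, here is an assumed example of a data model with non-Gaussian, higher-order local structure: inputs are sparse sums of localized bumps with heavy-tailed amplitudes. The paper's precise data models and training setup are not reproduced.

```python
# Toy data model with localized, non-Gaussian statistics.
import numpy as np

rng = np.random.default_rng(0)
D, width = 64, 5
bump = np.hanning(width)

def sample_input():
    x = np.zeros(D)
    for _ in range(3):                               # a few sparse local events
        pos = rng.integers(0, D - width)
        x[pos:pos + width] += rng.laplace() * bump   # heavy-tailed amplitude
    return x

X = np.stack([sample_input() for _ in range(1000)])
# Gaussian inputs would have ~0 excess kurtosis; these inputs do not.
kurtosis = ((X - X.mean(0)) ** 4).mean(0) / (X.var(0) ** 2) - 3
print(kurtosis.mean())
```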
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Random Graph-Based Neuromorphic Learning with a Layer-Weaken Structure [4.477401614534202]
We transform random graph theory into an NN model with practical meaning, based on clarifying the input-output relationship of each neuron.
With this low-operation-cost approach, neurons are assigned to several groups whose connection relationships can be regarded as uniform representations of the random graphs they belong to.
We develop a joint classification mechanism involving information interaction between multiple RGNNs and realize significant performance improvements in supervised learning for three benchmark tasks.
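A hedged sketch of reading a random graph as a network with a clear input-output relationship: neurons are topologically ordered, each edge of a random DAG gets a weight, and a forward pass aggregates each neuron's parents. The grouping and joint-classification mechanism of the paper are not shown; all sizes and the edge probability are assumptions.

```python
# Random-graph network sketch: an upper-triangular random adjacency is a
# DAG consistent with the neuron ordering; forward pass follows that order.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out, p_edge = 4, 20, 2, 0.3
N = n_in + n_hidden + n_out

A = (rng.random((N, N)) < p_edge) & np.triu(np.ones((N, N), bool), k=1)
W = rng.normal(scale=0.5, size=(N, N)) * A

def forward(x):
    act = np.zeros(N)
    act[:n_in] = x
    for j in range(n_in, N):                    # topological order
        act[j] = np.tanh(W[:j, j] @ act[:j])    # aggregate from parents
    return act[-n_out:]

print(forward(rng.normal(size=n_in)))
```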
arXiv Detail & Related papers (2021-11-17T03:37:06Z)
- Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer while using fewer parameters, and transfer to new tasks in a sample-efficient manner.
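An assumed, simplified sketch of the routing idea: each token is softly assigned to a small set of "function" modules via a compatibility score against a learned signature. The scoring rule, module form, and dimensions here are illustrative, not the paper's architecture.

```python
# Soft routing of a token through learned function modules.
import numpy as np

rng = np.random.default_rng(0)
d, n_funcs = 8, 3
signatures = rng.normal(size=(n_funcs, d))                # one per function
func_weights = rng.normal(scale=0.3, size=(n_funcs, d, d))

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def interpreter_layer(x):
    scores = softmax(signatures @ x)                      # routing weights
    outputs = np.stack([np.tanh(Wf @ x) for Wf in func_weights])
    return scores @ outputs                               # soft mixture

token = rng.normal(size=d)
print(interpreter_layer(token))
```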
arXiv Detail & Related papers (2021-10-12T23:22:45Z)