On Logical Extrapolation for Mazes with Recurrent and Implicit Networks
- URL: http://arxiv.org/abs/2410.03020v1
- Date: Thu, 3 Oct 2024 22:07:51 GMT
- Title: On Logical Extrapolation for Mazes with Recurrent and Implicit Networks
- Authors: Brandon Knutson, Amandin Chyba Rabeendran, Michael Ivanitskiy, Jordan Pettyjohn, Cecilia Diniz-Behn, Samy Wu Fung, Daniel McKenzie,
- Abstract summary: We show that the capacity for extrapolation is less robust than previously suggested.
We show that while INNs are capable of generalizing to larger maze instances, they fail to generalize along axes of difficulty other than maze size.
- Score: 2.0037131645168396
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent work has suggested that certain neural network architectures-particularly recurrent neural networks (RNNs) and implicit neural networks (INNs) are capable of logical extrapolation. That is, one may train such a network on easy instances of a specific task and then apply it successfully to more difficult instances of the same task. In this paper, we revisit this idea and show that (i) The capacity for extrapolation is less robust than previously suggested. Specifically, in the context of a maze-solving task, we show that while INNs (and some RNNs) are capable of generalizing to larger maze instances, they fail to generalize along axes of difficulty other than maze size. (ii) Models that are explicitly trained to converge to a fixed point (e.g. the INN we test) are likely to do so when extrapolating, while models that are not (e.g. the RNN we test) may exhibit more exotic limiting behaviour such as limit cycles, even when they correctly solve the problem. Our results suggest that (i) further study into why such networks extrapolate easily along certain axes of difficulty yet struggle with others is necessary, and (ii) analyzing the dynamics of extrapolation may yield insights into designing more efficient and interpretable logical extrapolators.
Related papers
- GINN-KAN: Interpretability pipelining with applications in Physics Informed Neural Networks [5.2969467015867915]
We introduce the concept of interpretability pipelineing, to incorporate multiple interpretability techniques to outperform each individual technique.
We evaluate two recent models selected for their potential to incorporate interpretability into standard neural network architectures.
We introduce a novel interpretable neural network GINN-KAN that synthesizes the advantages of both models.
arXiv Detail & Related papers (2024-08-27T04:57:53Z) - Pursing the Sparse Limitation of Spiking Deep Learning Structures [42.334835610250714]
Spiking Neural Networks (SNNs) are garnering increased attention for their superior computation and energy efficiency.
We introduce an innovative algorithm capable of simultaneously identifying both weight and patch-level winning tickets.
We demonstrate that our spiking lottery ticket achieves comparable or superior performance even when the model structure is extremely sparse.
arXiv Detail & Related papers (2023-11-18T17:00:40Z) - How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z) - Neural Networks and the Chomsky Hierarchy [27.470857324448136]
We study whether insights from the theory of Chomsky can predict the limits of neural network generalization in practice.
We show negative results where even extensive amounts of data and training time never led to any non-trivial generalization.
Our results show that, for our subset of tasks, RNNs and Transformers fail to generalize on non-regular tasks, and only networks augmented with structured memory can successfully generalize on context-free and context-sensitive tasks.
arXiv Detail & Related papers (2022-07-05T15:06:11Z) - Comparative Analysis of Interval Reachability for Robust Implicit and
Feedforward Neural Networks [64.23331120621118]
We use interval reachability analysis to obtain robustness guarantees for implicit neural networks (INNs)
INNs are a class of implicit learning models that use implicit equations as layers.
We show that our approach performs at least as well as, and generally better than, applying state-of-the-art interval bound propagation methods to INNs.
arXiv Detail & Related papers (2022-04-01T03:31:27Z) - Mitigating Performance Saturation in Neural Marked Point Processes:
Architectures and Loss Functions [50.674773358075015]
We propose a simple graph-based network structure called GCHP, which utilizes only graph convolutional layers.
We show that GCHP can significantly reduce training time and the likelihood ratio loss with interarrival time probability assumptions can greatly improve the model performance.
arXiv Detail & Related papers (2021-07-07T16:59:14Z) - SpikeMS: Deep Spiking Neural Network for Motion Segmentation [7.491944503744111]
textitSpikeMS is the first deep encoder-decoder SNN architecture for the real-world large-scale problem of motion segmentation.
We show that textitSpikeMS is capable of textitincremental predictions, or predictions from smaller amounts of test data than it is trained on.
arXiv Detail & Related papers (2021-05-13T21:34:55Z) - Thinking Deeply with Recurrence: Generalizing from Easy to Hard
Sequential Reasoning Problems [51.132938969015825]
We observe that recurrent networks have the uncanny ability to closely emulate the behavior of non-recurrent deep models.
We show that recurrent networks that are trained to solve simple mazes with few recurrent steps can indeed solve much more complex problems simply by performing additional recurrences during inference.
arXiv Detail & Related papers (2021-02-22T14:09:20Z) - Encoding the latent posterior of Bayesian Neural Networks for
uncertainty quantification [10.727102755903616]
We aim for efficient deep BNNs amenable to complex computer vision architectures.
We achieve this by leveraging variational autoencoders (VAEs) to learn the interaction and the latent distribution of the parameters at each network layer.
Our approach, Latent-Posterior BNN (LP-BNN), is compatible with the recent BatchEnsemble method, leading to highly efficient (in terms of computation and memory during both training and testing) ensembles.
arXiv Detail & Related papers (2020-12-04T19:50:09Z) - How Neural Networks Extrapolate: From Feedforward to Graph Neural
Networks [80.55378250013496]
We study how neural networks trained by gradient descent extrapolate what they learn outside the support of the training distribution.
Graph Neural Networks (GNNs) have shown some success in more complex tasks.
arXiv Detail & Related papers (2020-09-24T17:48:59Z) - Efficient Probabilistic Logic Reasoning with Graph Neural Networks [63.099999467118245]
Markov Logic Networks (MLNs) can be used to address many knowledge graph problems.
Inference in MLN is computationally intensive, making the industrial-scale application of MLN very difficult.
We propose a graph neural network (GNN) variant, named ExpressGNN, which strikes a nice balance between the representation power and the simplicity of the model.
arXiv Detail & Related papers (2020-01-29T23:34:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.