Learning to learn generative programs with Memoised Wake-Sleep
- URL: http://arxiv.org/abs/2007.03132v2
- Date: Wed, 22 Jul 2020 22:36:04 GMT
- Title: Learning to learn generative programs with Memoised Wake-Sleep
- Authors: Luke B. Hewitt and Tuan Anh Le and Joshua B. Tenenbaum
- Abstract summary: We study a class of neuro-symbolic generative models in which neural networks are used both for inference and as priors over symbolic, data-generating programs.
We propose the Memoised Wake-Sleep (MWS) algorithm, which extends Wake-Sleep by explicitly storing and reusing the best programs discovered by the inference network throughout training.
- Score: 52.439550543743536
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study a class of neuro-symbolic generative models in which neural networks
are used both for inference and as priors over symbolic, data-generating
programs. As generative models, these programs capture compositional structures
in a naturally explainable form. To tackle the challenge of performing program
induction as an 'inner-loop' to learning, we propose the Memoised Wake-Sleep
(MWS) algorithm, which extends Wake-Sleep by explicitly storing and reusing the
best programs discovered by the inference network throughout training. We use
MWS to learn accurate, explainable models in three challenging domains:
stroke-based character modelling, cellular automata, and few-shot learning in a
novel dataset of real-world string concepts.
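To make the store-and-reuse idea from the abstract concrete, here is a minimal, hypothetical PyTorch-style sketch of one Memoised Wake-Sleep update. The interfaces (generative_model.log_joint, recognition_net.sample / log_prob, a per-datapoint memory of scored programs, and the softmax weighting over that memory) are illustrative assumptions based only on the abstract, not the authors' released implementation.

```python
# Illustrative sketch of one Memoised Wake-Sleep (MWS) update, under assumed
# model interfaces; this is not the authors' implementation.
import torch

def mws_step(x_batch, memory, generative_model, recognition_net,
             p_optimizer, q_optimizer, num_particles=5, memory_size=10):
    """memory[i] holds the best (log_joint, program) pairs found so far for x_batch[i]."""
    # Wake phase: propose programs with the inference network and memoise the best ones.
    for i, x in enumerate(x_batch):
        candidates = list(memory[i])
        for _ in range(num_particles):
            z = recognition_net.sample(x)                        # a discrete program
            if all(z != z_old for _, z_old in candidates):       # keep memory entries unique
                score = generative_model.log_joint(x, z).item()  # log p(x, z)
                candidates.append((score, z))
        candidates.sort(key=lambda pair: pair[0], reverse=True)
        memory[i] = candidates[:memory_size]                     # keep only the top programs

    # Generative update: maximise log p(x, z) over memoised programs, weighting
    # each program by its normalised joint probability within the memory.
    p_loss = 0.0
    for i, x in enumerate(x_batch):
        weights = torch.softmax(torch.tensor([s for s, _ in memory[i]]), dim=0)
        for w, (_, z) in zip(weights, memory[i]):
            p_loss = p_loss - w * generative_model.log_joint(x, z)
    p_optimizer.zero_grad()
    p_loss.backward()
    p_optimizer.step()

    # Recognition update: train q(z | x) to reproduce the memoised programs,
    # in place of the fantasy data used by the classic sleep phase.
    q_loss = 0.0
    for i, x in enumerate(x_batch):
        for _, z in memory[i]:
            q_loss = q_loss - recognition_net.log_prob(z, x)
    q_optimizer.zero_grad()
    q_loss.backward()
    q_optimizer.step()
```

The sketch only conveys the memoisation idea; details such as how memory scores are refreshed as the generative model changes, or how the memory enters the variational objective, would follow the paper itself.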
Related papers
- Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning.
Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
arXiv Detail & Related papers (2024-06-14T13:12:07Z)
- Continual Zero-Shot Learning through Semantically Guided Generative Random Walks [56.65465792750822]
We address the challenge of continual zero-shot learning where unseen information is not provided during training, by leveraging generative modeling.
We propose our learning algorithm that employs a novel semantically guided Generative Random Walk (GRW) loss.
Our algorithm achieves state-of-the-art performance on AWA1, AWA2, CUB, and SUN datasets, surpassing existing CZSL methods by 3-7%.
arXiv Detail & Related papers (2023-08-23T18:10:12Z)
- Understanding Activation Patterns in Artificial Neural Networks by Exploring Stochastic Processes [0.0]
We propose utilizing the framework of stochastic processes, which has been underutilized thus far.
We focus solely on activation frequency, leveraging neuroscience techniques used for real neuron spike trains.
We derive parameters describing activation patterns in each network, revealing consistent differences across architectures and training sets.
arXiv Detail & Related papers (2023-08-01T22:12:30Z)
- CodeGen2: Lessons for Training LLMs on Programming and Natural Languages [116.74407069443895]
We unify encoder and decoder-based models into a single prefix-LM.
For learning methods, we explore the claim of a "free lunch" hypothesis.
For data distributions, the effect of a mixture distribution and multi-epoch training of programming and natural languages on model performance is explored.
arXiv Detail & Related papers (2023-05-03T17:55:25Z)
- Transfer Learning with Deep Tabular Models [66.67017691983182]
We show that upstream data gives tabular neural networks a decisive advantage over GBDT models.
We propose a realistic medical diagnosis benchmark for tabular transfer learning.
We propose a pseudo-feature method for cases where the upstream and downstream feature sets differ.
arXiv Detail & Related papers (2022-06-30T14:24:32Z)
- Hebbian Continual Representation Learning [9.54473759331265]
Continual Learning aims to bring machine learning into a more realistic scenario.
We investigate whether biologically inspired Hebbian learning is useful for tackling continual challenges.
arXiv Detail & Related papers (2022-06-28T09:21:03Z)
- Assemble Foundation Models for Automatic Code Summarization [9.53949558569201]
We propose a flexible and robust approach for automatic code summarization based on neural networks.
We assemble available foundation models, such as CodeBERT and GPT-2, into a single model named AdaMo.
We introduce two adaptive schemes from the perspective of knowledge transfer, namely continuous pretraining and intermediate finetuning.
arXiv Detail & Related papers (2022-01-13T21:38:33Z)
- Embedding Symbolic Temporal Knowledge into Deep Sequential Models [21.45383857094518]
Sequences and time-series often arise in robot tasks, e.g., in activity recognition and imitation learning.
Deep neural networks (DNNs) have emerged as an effective data-driven methodology for processing sequences given sufficient training data and compute resources.
We construct semantic-based embeddings of automata generated from formulae via a Graph Neural Network. Experiments show that these learnt embeddings can lead to improvements in downstream robot tasks such as sequential action recognition and imitation learning.
arXiv Detail & Related papers (2021-01-28T13:17:46Z)
- On the Generalizability of Neural Program Models with respect to Semantic-Preserving Program Transformations [25.96895574298886]
We evaluate the generalizability of neural program models with respect to semantic-preserving transformations.
We use three Java datasets of different sizes and three state-of-the-art neural network models for code.
Our results suggest that neural program models based on data and control dependencies in programs generalize better than neural program models based only on abstract syntax trees.
arXiv Detail & Related papers (2020-07-31T20:39:20Z)
- Incremental Training of a Recurrent Neural Network Exploiting a Multi-Scale Dynamic Memory [79.42778415729475]
We propose a novel incrementally trained recurrent architecture targeting explicitly multi-scale learning.
We show how to extend the architecture of a simple RNN by separating its hidden state into different modules.
We discuss a training algorithm where new modules are iteratively added to the model to learn progressively longer dependencies (a toy sketch of this idea follows after this list).
arXiv Detail & Related papers (2020-06-29T08:35:49Z)
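For the incremental multi-scale RNN entry above, the following toy sketch illustrates the general idea of a hidden state split into modules that can be grown during training, with later modules updating less frequently to capture longer dependencies. The class name, the GRUCell modules, and the every-2**k-steps update rule are assumptions for illustration, not the paper's actual architecture or training procedure.

```python
import torch
import torch.nn as nn

class MultiScaleRNN(nn.Module):
    """Toy sketch: hidden state split into modules; new modules can be added
    incrementally, and module k updates every 2**k steps (an assumed rule)."""

    def __init__(self, input_size, module_size):
        super().__init__()
        self.input_size = input_size
        self.module_size = module_size
        self.cells = nn.ModuleList()
        self.grow()  # start with a single module

    def grow(self):
        # Add a new recurrent module; existing modules keep their weights,
        # so training can resume with a larger, multi-scale hidden state.
        self.cells.append(nn.GRUCell(self.input_size, self.module_size))

    def forward(self, x_seq):                    # x_seq: (time, batch, input_size)
        batch = x_seq.size(1)
        states = [x_seq.new_zeros(batch, self.module_size) for _ in self.cells]
        outputs = []
        for t, x_t in enumerate(x_seq):
            for k, cell in enumerate(self.cells):
                if t % (2 ** k) == 0:            # later modules update less often
                    states[k] = cell(x_t, states[k])
            outputs.append(torch.cat(states, dim=-1))
        return torch.stack(outputs)              # (time, batch, num_modules * module_size)

# Example usage: train with one module, then grow to target longer dependencies.
rnn = MultiScaleRNN(input_size=8, module_size=16)
y = rnn(torch.randn(20, 4, 8))                   # 20 time steps, batch of 4
rnn.grow()                                       # add a module before continuing training
```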