Improving Compositionality of Neural Networks by Decoding
Representations to Inputs
- URL: http://arxiv.org/abs/2106.00769v1
- Date: Tue, 1 Jun 2021 20:07:16 GMT
- Title: Improving Compositionality of Neural Networks by Decoding
Representations to Inputs
- Authors: Mike Wu, Noah Goodman, Stefano Ermon
- Abstract summary: We bridge the benefits of traditional and deep learning programs by jointly training a generative model to constrain neural network activations to "decode" back to inputs.
We demonstrate applications of decodable representations to out-of-distribution detection, adversarial examples, calibration, and fairness.
- Score: 83.97012077202882
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In traditional software programs, we take for granted how easy it is to debug
code by tracing program logic from variables back to input, apply unit tests
and assertion statements to block erroneous behavior, and compose programs
together. But as the programs we write grow more complex, it becomes hard to
apply traditional software to applications like computer vision or natural
language. Although deep learning programs have demonstrated strong performance
on these applications, they sacrifice many of the functionalities of
traditional software programs. In this paper, we work towards bridging the
benefits of traditional and deep learning programs by jointly training a
generative model to constrain neural network activations to "decode" back to
inputs. Doing so enables practitioners to probe and track information encoded
in activation(s), apply assertion-like constraints on what information is
encoded in an activation, and compose separate neural networks together in a
plug-and-play fashion. In our experiments, we demonstrate applications of
decodable representations to out-of-distribution detection, adversarial
examples, calibration, and fairness -- while matching standard neural networks
in accuracy.
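To make the decodability constraint concrete, here is a minimal sketch in which a classifier's intermediate activation is also fed to a decoder whose reconstruction loss is added to the task loss. The layer sizes, the weighting lambda_dec, and the use of a plain MLP decoder trained with an MSE loss (standing in for the paper's jointly trained generative model) are illustrative assumptions rather than the authors' exact setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DecodableClassifier(nn.Module):
    """Classifier whose hidden activation is jointly trained to 'decode' back to the input."""
    def __init__(self, in_dim=784, hidden_dim=128, num_classes=10):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                     nn.Linear(256, hidden_dim), nn.ReLU())
        self.head = nn.Linear(hidden_dim, num_classes)
        # Simple MLP decoder standing in for the jointly trained generative model:
        # it reconstructs the input from the intermediate activation.
        self.decoder = nn.Sequential(nn.Linear(hidden_dim, 256), nn.ReLU(),
                                     nn.Linear(256, in_dim))

    def forward(self, x):
        h = self.encoder(x)          # activation we constrain to be decodable
        logits = self.head(h)
        x_hat = self.decoder(h)      # "decode" the activation back to the input
        return logits, x_hat, h

def loss_fn(logits, x_hat, x, y, lambda_dec=1.0):
    # Task loss plus a reconstruction constraint on the activation.
    return F.cross_entropy(logits, y) + lambda_dec * F.mse_loss(x_hat, x)

# Toy training step on random data (stand-in for a real dataset).
model = DecodableClassifier()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
logits, x_hat, h = model(x)
loss = loss_fn(logits, x_hat, x, y)
opt.zero_grad(); loss.backward(); opt.step()

# Once h is decodable, one can probe what it encodes by inspecting model.decoder(h),
# and flag out-of-distribution or adversarial inputs when the reconstruction error
# F.mse_loss(model.decoder(h), x) is unusually large.
```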
Related papers
- COOL: A Constraint Object-Oriented Logic Programming Language and its
Neural-Symbolic Compilation System [0.0]
We introduce the COOL programming language, which seamlessly combines logical reasoning with neural network technologies.
COOL is engineered to autonomously handle data collection, mitigating the need for user-supplied initial data.
It incorporates user prompts into the coding process to reduce the risk of undertraining and to enhance interaction among models throughout their lifecycle.
arXiv Detail & Related papers (2023-11-07T06:29:59Z)
- Enhancing Network Management Using Code Generated by Large Language Models [15.557254786007325]
We introduce a novel approach to facilitate a natural-language-based network management experience, utilizing large language models (LLMs) to generate task-specific code from natural language queries.
This method tackles the challenges of explainability, scalability, and privacy by allowing network operators to inspect the generated code.
We design and evaluate a prototype system using benchmark applications, showcasing high accuracy, cost-effectiveness, and the potential for further enhancements.
arXiv Detail & Related papers (2023-08-11T17:49:15Z)
- CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning [92.36705236706678]
"CodeRL" is a new framework for program synthesis tasks through pretrained LMs and deep reinforcement learning.
During inference, we introduce a new generation procedure with a critical sampling strategy.
For the model backbones, we extended the encoder-decoder architecture of CodeT5 with enhanced learning objectives.
arXiv Detail & Related papers (2022-07-05T02:42:15Z)
- Predictive Coding: Towards a Future of Deep Learning beyond Backpropagation? [41.58529335439799]
The backpropagation of error algorithm used to train deep neural networks has been fundamental to the successes of deep learning.
Recent work has developed predictive coding into a general-purpose algorithm able to train neural networks using only local computations.
We show that predictive coding networks offer substantially greater flexibility than equivalent deep neural networks; a minimal sketch of such a local update rule appears after this list.
arXiv Detail & Related papers (2022-02-18T22:57:03Z)
- Adversarial Neural Networks for Error Correcting Codes [76.70040964453638]
We introduce a general framework to boost the performance and applicability of machine learning (ML) models.
We propose to combine ML decoders with a competing discriminator network that tries to distinguish between codewords and noisy words.
Our framework is game-theoretic, motivated by generative adversarial networks (GANs); a small decoder-versus-discriminator training sketch appears after this list.
arXiv Detail & Related papers (2021-12-21T19:14:44Z)
- Unsupervised Learning of Neurosymbolic Encoders [40.3575054882791]
We present a framework for the unsupervised learning of neurosymbolic encoders, i.e., encoders obtained by composing neural networks with symbolic programs from a domain-specific language.
Such a framework can naturally incorporate symbolic expert knowledge into the learning process and lead to more interpretable and factorized latent representations than fully neural encoders.
arXiv Detail & Related papers (2021-07-28T02:16:14Z)
- Enforcing Consistency in Weakly Supervised Semantic Parsing [68.2211621631765]
We explore the use of consistency between the output programs for related inputs to reduce the impact of spurious programs.
We find that a more consistent formalism leads to improved model performance even without consistency-based training.
arXiv Detail & Related papers (2021-07-13T03:48:04Z)
- Neurocoder: Learning General-Purpose Computation Using Stored Neural Programs [64.56890245622822]
Neurocoder is an entirely new class of general-purpose conditional computational machines.
It "codes" itself in a data-responsive way by composing relevant programs from a set of shareable, modular programs.
We show new capacity to learn modular programs, handle severe pattern shifts and remember old programs as new ones are learnt.
arXiv Detail & Related papers (2020-09-24T01:39:16Z)
- Synthetic Datasets for Neural Program Synthesis [66.20924952964117]
We propose a new methodology for controlling and evaluating the bias of synthetic data distributions over both programs and specifications.
We demonstrate, using the Karel DSL and a small Calculator DSL, that training deep networks on these distributions leads to improved cross-distribution generalization performance.
arXiv Detail & Related papers (2019-12-27T21:28:10Z)
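As referenced in the predictive-coding entry above, below is a minimal sketch of the kind of local-computation training such networks use, in the spirit of common predictive-coding formulations. Layer sizes, learning rates, the tanh activation, and the number of inference iterations are illustrative assumptions and are not taken from the surveyed paper.

```python
import numpy as np

# Minimal predictive-coding training step for a two-hidden-layer network.
# Each layer keeps value nodes x[l]; errors e[l] = x[l+1] - W[l] f(x[l]) are local,
# and both the inference and weight updates use only locally available quantities.

rng = np.random.default_rng(0)
sizes = [10, 32, 32, 2]                                   # input, hidden, hidden, output
W = [rng.normal(0, 0.1, (sizes[i + 1], sizes[i])) for i in range(3)]
f, df = np.tanh, lambda a: 1.0 - np.tanh(a) ** 2

def pc_step(x_in, y_target, W, lr_x=0.1, lr_w=0.01, n_iters=20):
    # Clamp the input and (during learning) the output to the target.
    x = [x_in] + [np.zeros(s) for s in sizes[1:-1]] + [y_target]
    # Inference phase: relax hidden nodes to reduce local prediction errors.
    for _ in range(n_iters):
        e = [x[l + 1] - W[l] @ f(x[l]) for l in range(3)]
        for l in range(1, 3):                             # hidden layers only
            x[l] += lr_x * (-e[l - 1] + df(x[l]) * (W[l].T @ e[l]))
    # Learning phase: purely local, Hebbian-like weight updates.
    e = [x[l + 1] - W[l] @ f(x[l]) for l in range(3)]
    for l in range(3):
        W[l] += lr_w * np.outer(e[l], f(x[l]))
    return W

W = pc_step(rng.normal(size=10), np.array([1.0, 0.0]), W)
```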
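For the adversarial error-correcting-codes entry, here is a small sketch of the GAN-style game it describes: a neural decoder cleans up noisy words while a discriminator tries to tell true codewords from decoder outputs. The toy repetition code, network architectures, and loss weighting are assumptions made only for illustration and do not reproduce that paper's decoders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

n = 16                                               # toy codeword length
decoder = nn.Sequential(nn.Linear(n, 64), nn.ReLU(), nn.Linear(64, n))
disc = nn.Sequential(nn.Linear(n, 64), nn.ReLU(), nn.Linear(64, 1))
opt_dec = torch.optim.Adam(decoder.parameters(), lr=1e-3)
opt_disc = torch.optim.Adam(disc.parameters(), lr=1e-3)

def sample_batch(bsz=64, noise=0.5):
    bits = torch.randint(0, 2, (bsz, n // 2)).float()
    codewords = bits.repeat_interleave(2, dim=1) * 2 - 1     # toy repetition code, BPSK
    noisy = codewords + noise * torch.randn_like(codewords)  # simulated channel noise
    return codewords, noisy

for step in range(200):
    codewords, noisy = sample_batch()
    # Discriminator: distinguish true codewords from decoder outputs on noisy words.
    decoded = decoder(noisy).detach()
    d_loss = (F.binary_cross_entropy_with_logits(disc(codewords), torch.ones(64, 1)) +
              F.binary_cross_entropy_with_logits(disc(decoded), torch.zeros(64, 1)))
    opt_disc.zero_grad(); d_loss.backward(); opt_disc.step()
    # Decoder: reconstruct the codeword and fool the discriminator.
    decoded = decoder(noisy)
    g_loss = (F.mse_loss(decoded, codewords) +
              0.1 * F.binary_cross_entropy_with_logits(disc(decoded), torch.ones(64, 1)))
    opt_dec.zero_grad(); g_loss.backward(); opt_dec.step()
```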