Functional Indirection Neural Estimator for Better Out-of-distribution
Generalization
- URL: http://arxiv.org/abs/2210.12739v1
- Date: Sun, 23 Oct 2022 14:43:02 GMT
- Title: Functional Indirection Neural Estimator for Better Out-of-distribution
Generalization
- Authors: Kha Pham, Hung Le, Man Ngo, and Truyen Tran
- Abstract summary: FINE (Functional Indirection Neural Estimator) learns to compose functions that map data input to output on-the-fly.
We train FINE and competing models on IQ tasks using images from the MNIST, Omniglot and CIFAR100 datasets.
FINE not only achieves the best performance on all tasks but also is able to adapt to small-scale data scenarios.
- Score: 27.291114360472243
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The capacity to achieve out-of-distribution (OOD) generalization is a
hallmark of human intelligence and yet remains out of reach for machines. This
remarkable capability has been attributed to our abilities to make conceptual
abstraction and analogy, and to a mechanism known as indirection, which binds
two representations and uses one representation to refer to the other. Inspired
by these mechanisms, we hypothesize that OOD generalization may be achieved by
performing analogy-making and indirection in the functional space instead of
the data space as in current methods. To realize this, we design FINE
(Functional Indirection Neural Estimator), a neural framework that learns to
compose functions that map data input to output on-the-fly. FINE consists of a
backbone network and a trainable semantic memory of basis weight matrices. Upon
seeing a new input-output data pair, FINE dynamically constructs the backbone
weights by mixing the basis weights. The mixing coefficients are indirectly
computed through querying a separate corresponding semantic memory using the
data pair. We demonstrate empirically that FINE can strongly improve
out-of-distribution generalization on IQ tasks that involve geometric
transformations. In particular, we train FINE and competing models on IQ tasks
using images from the MNIST, Omniglot and CIFAR100 datasets and test on tasks
with unseen image classes from one or different datasets and unseen
transformation rules. FINE not only achieves the best performance on all tasks
but also is able to adapt to small-scale data scenarios.
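As a rough illustration of the weight-mixing mechanism the abstract describes, here is a minimal sketch (not the authors' implementation; the layer sizes, the single support pair, and the assumption that input and output dimensions match are all illustrative choices): a trainable memory holds basis weight matrices and matching keys, an input-output pair is embedded into a query, attention over the keys yields mixing coefficients, and the backbone layer uses the mixed weights.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightMixingLayer(nn.Module):
    def __init__(self, d_in, d_out, n_basis=8, d_key=32):
        super().__init__()
        # Trainable semantic memory: basis weight matrices and their keys.
        self.basis = nn.Parameter(torch.randn(n_basis, d_out, d_in) * 0.02)
        self.keys = nn.Parameter(torch.randn(n_basis, d_key))
        # Embeds an (input, output) pair into a query vector.
        self.query_net = nn.Linear(2 * d_in, d_key)

    def forward(self, x, x_pair, y_pair):
        # Indirection: the data pair queries the key memory, and the
        # resulting coefficients select a mixture of basis weights.
        q = self.query_net(torch.cat([x_pair, y_pair], dim=-1))  # (d_key,)
        coeffs = F.softmax(q @ self.keys.T, dim=-1)              # (n_basis,)
        W = torch.einsum("k,koi->oi", coeffs, self.basis)        # (d_out, d_in)
        return x @ W.T

layer = WeightMixingLayer(d_in=16, d_out=16)
x = torch.randn(4, 16)                              # batch of new inputs
x_pair, y_pair = torch.randn(16), torch.randn(16)   # one support pair
print(layer(x, x_pair, y_pair).shape)               # torch.Size([4, 16])
```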
Related papers
- MaD-Scientist: AI-based Scientist solving Convection-Diffusion-Reaction Equations Using Massive PINN-Based Prior Data [22.262191225577244]
We explore whether a similar approach can be applied to scientific foundation models (SFMs).
We collect low-cost physics-informed neural network (PINN)-based approximated prior data in the form of solutions to partial differential equations (PDEs) constructed through an arbitrary linear combination of mathematical dictionaries.
We provide experimental evidence on the one-dimensional convection-diffusion-reaction equation, which demonstrates that pre-training remains robust even with approximated prior data.
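A generic sketch of the data-construction recipe (the dictionary below is an illustrative guess, not the paper's): sample random linear combinations of simple space-time basis functions to get candidate u(x, t) fields for pre-training.
```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 64)
t = np.linspace(0.0, 1.0, 32)
X, T = np.meshgrid(x, t, indexing="ij")

# Dictionary of simple space-time basis functions (illustrative choice).
dictionary = [
    lambda X, T: np.sin(np.pi * X) * np.exp(-T),
    lambda X, T: np.cos(2 * np.pi * X) * T,
    lambda X, T: X * (1 - X) * np.exp(-2 * T),
    lambda X, T: np.sin(2 * np.pi * (X - T)),
]

def sample_prior_solution():
    # Arbitrary linear combination with random coefficients.
    coeffs = rng.normal(size=len(dictionary))
    return sum(c * phi(X, T) for c, phi in zip(coeffs, dictionary))

prior_dataset = [sample_prior_solution() for _ in range(100)]
print(prior_dataset[0].shape)  # (64, 32)
```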
arXiv Detail & Related papers (2024-10-09T00:52:00Z)
- Universal Neural Functionals [67.80283995795985]
A challenging problem in many modern machine learning tasks is to process weight-space features.
Recent works have developed promising weight-space models that are equivariant to the permutation symmetries of simple feedforward networks.
This work proposes an algorithm that automatically constructs permutation equivariant models for any weight space.
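To make the permutation-equivariance constraint concrete, here is a minimal DeepSets-style layer (a far simpler construction than the paper's algorithm, shown only to illustrate the property): y_i = a * x_i + b * mean(x), which commutes with any permutation of the rows.
```python
import torch
import torch.nn as nn

class PermEquivariantLinear(nn.Module):
    def __init__(self):
        super().__init__()
        self.a = nn.Parameter(torch.tensor(1.0))
        self.b = nn.Parameter(torch.tensor(0.5))

    def forward(self, x):  # x: (n, d)
        return self.a * x + self.b * x.mean(dim=0, keepdim=True)

layer = PermEquivariantLinear()
x = torch.randn(5, 3)
perm = torch.randperm(5)
# Permuting inputs then applying the layer equals applying then permuting.
print(torch.allclose(layer(x[perm]), layer(x)[perm]))  # True
```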
arXiv Detail & Related papers (2024-02-07T20:12:27Z)
- Heterogenous Memory Augmented Neural Networks [84.29338268789684]
We introduce a novel heterogeneous memory augmentation approach for neural networks.
By introducing learnable memory tokens with an attention mechanism, we can effectively boost performance without huge computational overhead.
We evaluate our approach on various image- and graph-based tasks under both in-distribution (ID) and out-of-distribution (OOD) conditions.
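A rough sketch of the learnable-memory-token idea (the details below are assumptions, not the paper's architecture): trainable tokens are concatenated to the input sequence so self-attention can read from and write to them.
```python
import torch
import torch.nn as nn

class MemoryAugmentedBlock(nn.Module):
    def __init__(self, d_model=64, n_mem=8, n_heads=4):
        super().__init__()
        self.memory = nn.Parameter(torch.randn(n_mem, d_model) * 0.02)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x):                      # x: (batch, seq, d_model)
        mem = self.memory.unsqueeze(0).expand(x.size(0), -1, -1)
        h = torch.cat([mem, x], dim=1)         # prepend memory tokens
        out, _ = self.attn(h, h, h)
        return out[:, mem.size(1):]            # drop memory slots again

block = MemoryAugmentedBlock()
x = torch.randn(2, 10, 64)
print(block(x).shape)                          # torch.Size([2, 10, 64])
```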
arXiv Detail & Related papers (2023-10-17T01:05:28Z)
- Equivariance with Learned Canonicalization Functions [77.32483958400282]
We show that learning a small neural network to perform canonicalization is better than using predefined heuristics.
Our experiments show that learning the canonicalization function is competitive with existing techniques for learning equivariant functions across many tasks.
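A toy sketch of learned canonicalization (an assumption-laden example, not the paper's setup): a small network predicts a rotation angle for a 2D point cloud, the input is rotated into that canonical pose, and only then is the task network applied. For true invariance the angle network must be trained appropriately; this only shows the structure.
```python
import torch
import torch.nn as nn

class CanonicalizedClassifier(nn.Module):
    def __init__(self, n_points=16, n_classes=4):
        super().__init__()
        self.angle_net = nn.Sequential(
            nn.Linear(2 * n_points, 32), nn.ReLU(), nn.Linear(32, 1))
        self.task_net = nn.Linear(2 * n_points, n_classes)

    def forward(self, pts):                    # pts: (batch, n_points, 2)
        flat = pts.flatten(1)
        theta = self.angle_net(flat).squeeze(-1)         # predicted pose
        c, s = torch.cos(-theta), torch.sin(-theta)      # undo the pose
        rot = torch.stack([torch.stack([c, -s], -1),
                           torch.stack([s,  c], -1)], -2)  # (batch, 2, 2)
        canon = pts @ rot.transpose(-1, -2)
        return self.task_net(canon.flatten(1))

model = CanonicalizedClassifier()
print(model(torch.randn(3, 16, 2)).shape)      # torch.Size([3, 4])
```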
arXiv Detail & Related papers (2022-11-11T21:58:15Z)
- Dynamic Graph Message Passing Networks for Visual Recognition [112.49513303433606]
Modelling long-range dependencies is critical for scene understanding tasks in computer vision.
A fully-connected graph is beneficial for such modelling, but its computational overhead is prohibitive.
We propose a dynamic graph message passing network that significantly reduces the computational complexity.
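A simplified sketch of dynamic, sparse message passing (my own simplification, not the paper's operator): each node aggregates messages only from its top-k most similar nodes instead of the full quadratic graph.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMessagePassing(nn.Module):
    def __init__(self, d=32, k=4):
        super().__init__()
        self.k = k
        self.msg = nn.Linear(d, d)

    def forward(self, x):                          # x: (n, d) node features
        sim = x @ x.T                              # pairwise affinities
        sim.fill_diagonal_(float("-inf"))          # no self-loops
        vals, idx = sim.topk(self.k, dim=-1)       # dynamic k neighbors
        w = F.softmax(vals, dim=-1)                # (n, k)
        neighbors = self.msg(x)[idx]               # (n, k, d)
        return x + (w.unsqueeze(-1) * neighbors).sum(dim=1)

mp = TopKMessagePassing()
print(mp(torch.randn(100, 32)).shape)              # torch.Size([100, 32])
```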
arXiv Detail & Related papers (2022-09-20T14:41:37Z)
- SPINE: Soft Piecewise Interpretable Neural Equations [0.0]
Fully connected networks are ubiquitous but uninterpretable.
This paper takes a novel approach to piecewise fits by using set operations on individual pieces (parts).
It can find a variety of applications where fully connected layers must be replaced by interpretable layers.
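One generic way to compose individual affine pieces with set-style min/max operations into a single interpretable piecewise-linear function (a guess at the flavor of the approach, not SPINE itself):
```python
import torch
import torch.nn as nn

class MinMaxPiecewise(nn.Module):
    # f(x) = min_g max_{i in group g} (a_i * x + b_i); any continuous
    # piecewise-linear function of one variable can be written this way.
    def __init__(self, n_groups=3, pieces_per_group=4):
        super().__init__()
        self.a = nn.Parameter(torch.randn(n_groups, pieces_per_group))
        self.b = nn.Parameter(torch.randn(n_groups, pieces_per_group))

    def forward(self, x):                            # x: (batch, 1)
        pieces = self.a * x.unsqueeze(-1) + self.b   # (batch, groups, pieces)
        return pieces.max(dim=-1).values.min(dim=-1).values

f = MinMaxPiecewise()
print(f(torch.randn(8, 1)).shape)                    # torch.Size([8])
```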
arXiv Detail & Related papers (2021-11-20T16:18:00Z)
- Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
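A bare-bones sketch of learned routing through function modules (an illustration of the idea, far simpler than Neural Interpreters): a scorer produces per-module weights and the output mixes the module outputs.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftRoutedFunctions(nn.Module):
    def __init__(self, d=32, n_functions=4):
        super().__init__()
        self.functions = nn.ModuleList(
            [nn.Sequential(nn.Linear(d, d), nn.ReLU(), nn.Linear(d, d))
             for _ in range(n_functions)])
        self.router = nn.Linear(d, n_functions)

    def forward(self, x):                           # x: (batch, d)
        scores = F.softmax(self.router(x), dim=-1)  # learned routing
        outs = torch.stack([f(x) for f in self.functions], dim=1)
        return (scores.unsqueeze(-1) * outs).sum(dim=1)

net = SoftRoutedFunctions()
print(net(torch.randn(5, 32)).shape)                # torch.Size([5, 32])
```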
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
- An Insect-Inspired Randomly, Weighted Neural Network with Random Fourier Features For Neuro-Symbolic Relational Learning [2.28438857884398]
We propose a Randomly Weighted Feature Network (RWFN) that incorporates randomly drawn, untrained weights in an encoder that uses an adapted linear model as a decoder.
Because of this special representation, RWFNs can effectively learn the degree of relationship among inputs by training only a linear decoder model.
We demonstrate that, compared to Logic Tensor Networks (LTNs), RWFNs can achieve better or similar performance for both object classification and detection of part-of relations between objects in Semantic Image Interpretation (SII) tasks.
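A compact sketch of the general recipe (frozen random Fourier features plus a trainable linear decoder; the RWFN encoder itself is more elaborate than this):
```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_feat, n = 10, 256, 500
X = rng.normal(size=(n, d_in))
y = (X[:, 0] + X[:, 1] > 0).astype(float)      # toy binary target

# Untrained, randomly drawn encoder weights (never updated).
W = rng.normal(size=(d_in, d_feat))
b = rng.uniform(0, 2 * np.pi, size=d_feat)
Z = np.cos(X @ W + b)                          # random Fourier features

# Only the linear decoder is fit (here via least squares).
w, *_ = np.linalg.lstsq(Z, y, rcond=None)
print(((Z @ w > 0.5) == y).mean())             # training accuracy
```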
arXiv Detail & Related papers (2021-09-11T22:45:08Z)
- Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks [133.93803565077337]
Retrieval-augmented generation (RAG) models combine pre-trained parametric and non-parametric memory for language generation.
We show that RAG models generate more specific, diverse and factual language than a state-of-the-art parametric-only seq2seq baseline.
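A minimal retrieval-augmented sketch (generic, not the RAG model): a query embedding selects the top-k documents from a non-parametric memory, and the retrieved text conditions a generator (both the encoder and the generator below are stand-in stubs, not real library APIs).
```python
import numpy as np

rng = np.random.default_rng(0)
docs = ["doc about indirection", "doc about memory", "doc about graphs"]
doc_emb = rng.normal(size=(len(docs), 16))     # stand-in dense index

def embed(text: str) -> np.ndarray:            # stand-in encoder
    rng_t = np.random.default_rng(abs(hash(text)) % (2 ** 32))
    return rng_t.normal(size=16)

def retrieve(query: str, k: int = 2):
    q = embed(query)
    scores = doc_emb @ q / (np.linalg.norm(doc_emb, axis=1) * np.linalg.norm(q))
    return [docs[i] for i in np.argsort(-scores)[:k]]

def generate(prompt: str) -> str:              # placeholder for a seq2seq LM
    return f"<answer conditioned on: {prompt!r}>"

query = "how does indirection work?"
context = " ".join(retrieve(query))
print(generate(f"{context} [SEP] {query}"))
```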
arXiv Detail & Related papers (2020-05-22T21:34:34Z)