DeepOSets: Non-Autoregressive In-Context Learning of Supervised Learning Operators
- URL: http://arxiv.org/abs/2410.09298v3
- Date: Mon, 03 Feb 2025 21:24:30 GMT
- Title: DeepOSets: Non-Autoregressive In-Context Learning of Supervised Learning Operators
- Authors: Shao-Ting Chiu, Junyuan Hong, Ulisses Braga-Neto
- Abstract summary: We introduce DeepSets Operator Networks (DeepOSets), an efficient, non-autoregressive neural network architecture for in-context learning of permutation-invariant operators.
DeepOSets combines the operator learning capabilities of Deep Operator Networks (DeepONets) with the set learning capabilities of DeepSets.
- Score: 11.913853433712855
- License:
- Abstract: We introduce DeepSets Operator Networks (DeepOSets), an efficient, non-autoregressive neural network architecture for in-context learning of permutation-invariant operators. DeepOSets combines the operator learning capabilities of Deep Operator Networks (DeepONets) with the set learning capabilities of DeepSets. Here, we present the application of DeepOSets to the problem of learning supervised learning algorithms, which are continuous permutation-invariant operators. We show that DeepOSets are universal approximators for this class of operators. In an empirical comparison with a popular autoregressive (transformer-based) model for in-context learning of linear regression, DeepOSets reduced the number of model weights by several orders of magnitude and required a fraction of training and inference time, in addition to significantly outperforming the transformer model in noisy settings. We also demonstrate the multiple operator learning capabilities of DeepOSets with a polynomial regression experiment where the order of the polynomial is learned in-context from the prompt.
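The abstract's architecture can be illustrated with a minimal, untrained numerical sketch: a DeepSets encoder embeds each (x, y) prompt example and mean-pools the embeddings (making the prompt representation permutation-invariant), and the pooled vector feeds the branch net of a DeepONet, whose output is combined with a trunk-net encoding of the query point by a dot product. All layer sizes, function names, and the use of mean pooling here are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes, rng):
    # random (W, b) pairs for a small feedforward network (untrained)
    return [(rng.normal(0.0, 0.5, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(x, weights):
    # tanh hidden layers, linear output layer
    for W, b in weights[:-1]:
        x = np.tanh(x @ W + b)
    W, b = weights[-1]
    return x @ W + b

# DeepSets encoder (hypothetical dims): per-example net phi, post-pooling net rho
phi = init_mlp([2, 32, 16], rng)
rho = init_mlp([16, 32, 16], rng)

def encode_prompt(xs, ys):
    pairs = np.stack([xs, ys], axis=1)     # (n, 2) prompt examples
    embedded = mlp(pairs, phi)             # (n, 16) per-example embeddings
    pooled = embedded.mean(axis=0)         # (16,)  order-independent pooling
    return mlp(pooled[None, :], rho)[0]    # (16,)  prompt representation

# DeepONet head: branch net encodes the prompt, trunk net encodes the query
branch = init_mlp([16, 32, 8], rng)
trunk = init_mlp([1, 32, 8], rng)

def deeposets_predict(xs, ys, x_query):
    b = mlp(encode_prompt(xs, ys)[None, :], branch)[0]  # (8,)
    t = mlp(np.array([[x_query]]), trunk)[0]            # (8,)
    return float(b @ t)                                  # dot-product readout

# Toy prompt: noisy linear data, as in the paper's regression experiments
xs = rng.normal(size=10)
ys = 2.0 * xs + 0.1 * rng.normal(size=10)
pred = deeposets_predict(xs, ys, 0.5)

# Shuffling the prompt examples leaves the prediction unchanged,
# since mean pooling is permutation-invariant.
perm = rng.permutation(10)
assert np.isclose(pred, deeposets_predict(xs[perm], ys[perm], 0.5))
```

Because the whole prompt is processed in one forward pass rather than token by token, inference cost does not grow with autoregressive decoding steps, which is the source of the efficiency claim in the abstract.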
Related papers
- A Library for Learning Neural Operators [77.16483961863808]
We present NeuralOperator, an open-source Python library for operator learning.
Neural operators generalize neural networks to maps between function spaces instead of finite-dimensional Euclidean spaces.
Built on top of PyTorch, NeuralOperator provides all the tools for training and deploying neural operator models.
arXiv Detail & Related papers (2024-12-13T18:49:37Z) - On the training and generalization of deep operator networks [11.159056906971983]
We present a novel training method for deep operator networks (DeepONets).
DeepONets are constructed by two sub-networks.
We establish the width error estimate in terms of input data.
arXiv Detail & Related papers (2023-09-02T21:10:45Z) - Transfer Learning Enhanced DeepONet for Long-Time Prediction of Evolution Equations [9.748550197032785]
Deep operator network (DeepONet) has demonstrated great success in various learning tasks.
This paper proposes a transfer-learning aided DeepONet to enhance stability.
arXiv Detail & Related papers (2022-12-09T04:37:08Z) - Multifidelity deep neural operators for efficient learning of partial differential equations with application to fast inverse design of nanoscale heat transport [2.512625172084287]
We develop a multifidelity neural operator based on a deep operator network (DeepONet).
A multifidelity DeepONet significantly reduces the required amount of high-fidelity data and achieves one order of magnitude smaller error when using the same amount of high-fidelity data.
We apply a multifidelity DeepONet to learn the phonon Boltzmann transport equation (BTE), a framework to compute nanoscale heat transport.
arXiv Detail & Related papers (2022-04-14T01:01:24Z) - MultiAuto-DeepONet: A Multi-resolution Autoencoder DeepONet for Nonlinear Dimension Reduction, Uncertainty Quantification and Operator Learning of Forward and Inverse Stochastic Problems [12.826754199680474]
A new data-driven method for operator learning of stochastic differential equations (SDEs) is proposed in this paper.
The central goal is to solve forward and inverse problems more effectively using limited data.
arXiv Detail & Related papers (2022-04-07T03:53:49Z) - Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is end-to-end learned.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z) - Improved architectures and training algorithms for deep operator networks [0.0]
Operator learning techniques have emerged as a powerful tool for learning maps between infinite-dimensional Banach spaces.
We analyze the training dynamics of deep operator networks (DeepONets) through the lens of Neural Tangent Kernel (NTK) theory.
arXiv Detail & Related papers (2021-10-04T18:34:41Z) - Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z) - Understanding Self-supervised Learning with Dual Deep Networks [74.92916579635336]
We propose a novel framework to understand contrastive self-supervised learning (SSL) methods that employ dual pairs of deep ReLU networks.
We prove that in each SGD update of SimCLR with various loss functions, the weights at each layer are updated by a covariance operator.
To further study what role the covariance operator plays and which features are learned in such a process, we model data generation and augmentation processes through a hierarchical latent tree model (HLTM).
arXiv Detail & Related papers (2020-10-01T17:51:49Z) - Incremental Training of a Recurrent Neural Network Exploiting a Multi-Scale Dynamic Memory [79.42778415729475]
We propose a novel incrementally trained recurrent architecture explicitly targeting multi-scale learning.
We show how to extend the architecture of a simple RNN by separating its hidden state into different modules.
We discuss a training algorithm where new modules are iteratively added to the model to learn progressively longer dependencies.
arXiv Detail & Related papers (2020-06-29T08:35:49Z) - Large-Scale Gradient-Free Deep Learning with Recursive Local
Representation Alignment [84.57874289554839]
Training deep neural networks on large-scale datasets requires significant hardware resources.
Backpropagation, the workhorse for training these networks, is an inherently sequential process that is difficult to parallelize.
We propose a neuro-biologically-plausible alternative to backprop that can be used to train deep networks.
arXiv Detail & Related papers (2020-02-10T16:20:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.