Semi-Parametric Inducing Point Networks and Neural Processes
- URL: http://arxiv.org/abs/2205.11718v2
- Date: Thu, 30 Mar 2023 06:13:59 GMT
- Title: Semi-Parametric Inducing Point Networks and Neural Processes
- Authors: Richa Rastogi, Yair Schiff, Alon Hacohen, Zhaozhi Li, Ian Lee, Yuntian
Deng, Mert R. Sabuncu, Volodymyr Kuleshov
- Abstract summary: Semi-parametric inducing point networks (SPIN) can query the training set at inference time in a compute-efficient manner.
SPIN attains linear complexity via a cross-attention mechanism between datapoints inspired by inducing point methods.
In our experiments, SPIN reduces memory requirements, improves accuracy across a range of meta-learning tasks, and improves state-of-the-art performance on an important practical problem, genotype imputation.
- Score: 15.948270454686197
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We introduce semi-parametric inducing point networks (SPIN), a
general-purpose architecture that can query the training set at inference time
in a compute-efficient manner. Semi-parametric architectures are typically more
compact than parametric models, but their computational complexity is often
quadratic. In contrast, SPIN attains linear complexity via a cross-attention
mechanism between datapoints inspired by inducing point methods. Querying large
training sets can be particularly useful in meta-learning, as it unlocks
additional training signal, but often exceeds the scaling limits of existing
models. We use SPIN as the basis of the Inducing Point Neural Process, a
probabilistic model which supports large contexts in meta-learning and achieves
high accuracy where existing models fail. In our experiments, SPIN reduces
memory requirements, improves accuracy across a range of meta-learning tasks,
and improves state-of-the-art performance on an important practical problem,
genotype imputation.
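The linear-vs-quadratic distinction is easy to see in code. Below is a minimal, hypothetical sketch of inducing-point cross-attention in PyTorch, in the spirit of the mechanism the abstract describes: m learned inducing points attend to the n context datapoints and the datapoints attend back, so each attention costs O(n*m) rather than O(n^2). The names and sizes are our own illustration, not the authors' exact SPIN layer.
```python
# A minimal sketch of inducing-point cross-attention (our illustration, not the
# exact SPIN architecture): m learned inducing points summarize n datapoints,
# so each attention costs O(n*m) instead of O(n^2).
import torch
import torch.nn as nn

class InducedCrossAttention(nn.Module):
    def __init__(self, dim: int, num_inducing: int, num_heads: int = 4):
        super().__init__()
        self.inducing = nn.Parameter(torch.randn(num_inducing, dim))
        self.attn_in = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.attn_out = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n, dim) embedded context datapoints
        h = self.inducing.unsqueeze(0).expand(x.size(0), -1, -1)
        h, _ = self.attn_in(h, x, x)     # compress n points into m slots: O(n*m)
        out, _ = self.attn_out(x, h, h)  # broadcast the summary back: O(n*m)
        return out

layer = InducedCrossAttention(dim=64, num_inducing=16)
print(layer(torch.randn(2, 1000, 64)).shape)  # torch.Size([2, 1000, 64])
```
Because the number of inducing points is a fixed hyperparameter, the context set can grow without the quadratic blow-up of full self-attention.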
Related papers
- Heterogenous Memory Augmented Neural Networks [84.29338268789684]
We introduce a novel heterogeneous memory augmentation approach for neural networks.
By introducing learnable memory tokens with an attention mechanism, we can effectively boost performance without huge computational overhead.
We evaluate our approach on various image- and graph-based tasks under both in-distribution (ID) and out-of-distribution (OOD) conditions.
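One common way to realize learnable memory tokens, sketched below as our own illustration (the module and names are hypothetical, not this paper's implementation), is to concatenate a bank of trainable embeddings to the keys and values that the input attends over:
```python
# Illustrative memory-token attention (a generic pattern, not this paper's
# exact method): trainable memory tokens join the keys/values of attention.
import torch
import torch.nn as nn

class MemoryAugmentedAttention(nn.Module):
    def __init__(self, dim: int, num_memory: int, num_heads: int = 4):
        super().__init__()
        self.memory = nn.Parameter(torch.randn(num_memory, dim))
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        mem = self.memory.unsqueeze(0).expand(x.size(0), -1, -1)
        kv = torch.cat([mem, x], dim=1)  # memory adds num_memory extra keys/values
        out, _ = self.attn(x, kv, kv)
        return out
```
The extra cost scales only with the (small) number of memory tokens, which is why such augmentation avoids large overhead.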
arXiv Detail & Related papers (2023-10-17T01:05:28Z)
- Slow Invariant Manifolds of Singularly Perturbed Systems via Physics-Informed Machine Learning [0.0]
We present a physics-informed machine-learning (PIML) approach for the approximation of slow invariant manifolds (SIMs) of singularly perturbed systems.
We show that the proposed PIML scheme provides approximations of equivalent or even higher accuracy than those provided by other traditional GSPT-based methods.
A comparison of the computational costs between symbolic, automatic and numerical approximation of the required derivatives in the learning process is also provided.
arXiv Detail & Related papers (2023-09-14T14:10:22Z)
- Precision Machine Learning [5.15188009671301]
We compare various function approximation methods and study how they scale with increasing parameters and data.
We find that neural networks can often outperform classical approximation methods on high-dimensional examples.
We develop training tricks which enable us to train neural networks to extremely low loss, close to the limits allowed by numerical precision.
arXiv Detail & Related papers (2022-10-24T17:58:30Z)
- MARS: Meta-Learning as Score Matching in the Function Space [79.73213540203389]
We present a novel approach to extracting inductive biases from a set of related datasets.
We use functional Bayesian neural network inference, which views the prior as a process and performs inference in the function space.
Our approach can seamlessly acquire and represent complex prior knowledge by meta-learning the score function of the data-generating process.
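For readers unfamiliar with score matching, the snippet below shows the generic denoising score-matching objective for estimating the score of a data distribution; it is a standard building block, not the MARS function-space procedure itself, and the network and noise scale are illustrative assumptions.
```python
# Generic denoising score matching (illustrative; not the MARS method): perturb
# data with Gaussian noise and regress onto the score of the corruption kernel.
import torch
import torch.nn as nn

def dsm_loss(score_net: nn.Module, x: torch.Tensor, sigma: float = 0.1) -> torch.Tensor:
    noise = torch.randn_like(x)
    x_noisy = x + sigma * noise
    target = -noise / sigma  # score of N(x, sigma^2 I) evaluated at x_noisy
    return (score_net(x_noisy) - target).pow(2).mean()

score_net = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 2))
loss = dsm_loss(score_net, torch.randn(256, 2))
```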
arXiv Detail & Related papers (2022-10-24T15:14:26Z)
- Inducing Gaussian Process Networks [80.40892394020797]
We propose inducing Gaussian process networks (IGN), a simple framework for simultaneously learning the feature space as well as the inducing points.
The inducing points, in particular, are learned directly in the feature space, enabling a seamless representation of complex structured domains.
We report on experimental results for real-world data sets showing that IGNs provide significant advances over state-of-the-art methods.
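As a rough, hypothetical sketch of learning a feature map and inducing points jointly (our illustration; IGN's actual objective and parameterization are in the paper), one can place trainable inducing locations in the feature space and build differentiable Nyström features from them:
```python
# Hypothetical sketch: a neural feature map and inducing points learned jointly.
# Nystrom-style features K_xz @ L^{-T} (with K_zz = L L^T) are differentiable in
# both the feature map and the inducing locations (not the IGN objective itself).
import torch
import torch.nn as nn

class InducingFeatures(nn.Module):
    def __init__(self, in_dim: int, feat_dim: int, num_inducing: int):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                                 nn.Linear(64, feat_dim))           # learned feature map
        self.z = nn.Parameter(torch.randn(num_inducing, feat_dim))  # inducing points in feature space

    def rbf(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        return torch.exp(-0.5 * torch.cdist(a, b).pow(2))  # unit-lengthscale RBF kernel

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        f = self.phi(x)
        k_xz = self.rbf(f, self.z)                                       # (n, m)
        k_zz = self.rbf(self.z, self.z) + 1e-4 * torch.eye(self.z.size(0))
        chol = torch.linalg.cholesky(k_zz)
        # Features whose Gram matrix is the Nystrom approximation K_xz K_zz^{-1} K_zx
        return torch.linalg.solve_triangular(chol, k_xz.T, upper=False).T
```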
arXiv Detail & Related papers (2022-04-21T05:27:09Z)
- Mitigating Performance Saturation in Neural Marked Point Processes: Architectures and Loss Functions [50.674773358075015]
We propose a simple graph-based network structure called GCHP, which utilizes only graph convolutional layers.
We show that GCHP can significantly reduce training time, and that a likelihood-ratio loss with interarrival-time probability assumptions can greatly improve model performance.
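For context, the graph-convolutional building block the summary refers to is typically the standard symmetrically normalized layer shown below (a generic sketch; GCHP's exact layer and its point-process likelihood are not reproduced here):
```python
# A standard graph-convolution layer (the generic building block; GCHP's exact
# architecture and likelihood-ratio loss are not reproduced here).
import torch
import torch.nn as nn

class GraphConv(nn.Module):
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # Symmetrically normalized adjacency with self-loops: D^{-1/2}(A+I)D^{-1/2}
        a = adj + torch.eye(adj.size(0))
        d_inv_sqrt = a.sum(dim=1).pow(-0.5)
        a_norm = d_inv_sqrt.unsqueeze(1) * a * d_inv_sqrt.unsqueeze(0)
        return torch.relu(self.lin(a_norm @ h))
```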
arXiv Detail & Related papers (2021-07-07T16:59:14Z)
- Balancing Accuracy and Latency in Multipath Neural Networks [0.09668407688201358]
We use a one-shot neural architecture search model to implicitly evaluate the performance of an intractably large number of neural networks.
We show that our method can accurately model the relative performance between models with different latencies and predict the performance of unseen models with good precision across different datasets.
arXiv Detail & Related papers (2021-04-25T00:05:48Z)
- Neural Complexity Measures [96.06344259626127]
We propose Neural Complexity (NC), a meta-learning framework for predicting generalization.
Our model learns a scalar complexity measure through interactions with many heterogeneous tasks in a data-driven way.
arXiv Detail & Related papers (2020-08-07T02:12:10Z)
- Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, truncated max-product belief propagation, and add what is necessary to make it a proper component of a deep learning model.
This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs).
The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
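To make "truncated max-product belief propagation" concrete, here is a toy log-domain max-product pass over a chain, written as our simplified illustration; the paper's BP-Layer generalizes this idea to dense 2D labeling problems inside CNNs.
```python
# Toy max-product belief propagation on a chain, in log-space (illustrative;
# the BP-Layer extends this kind of message passing to 2D labeling problems).
import torch

def max_product_chain(unary: torch.Tensor, pairwise: torch.Tensor) -> torch.Tensor:
    """unary: (T, K) log-potentials per step; pairwise: (K, K) transition log-potentials.
    Returns max-marginals over the last variable's K labels."""
    msg = unary[0]
    for t in range(1, unary.size(0)):
        # max over the previous label; differentiable almost everywhere
        msg = unary[t] + (msg.unsqueeze(1) + pairwise).max(dim=0).values
    return msg

beliefs = max_product_chain(torch.randn(5, 3), torch.randn(3, 3))
```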
arXiv Detail & Related papers (2020-03-13T13:11:35Z)
- A deep learning framework for solution and discovery in solid mechanics [1.4699455652461721]
We present the application of a class of deep learning methods, known as Physics-Informed Neural Networks (PINN), to learning and discovery in solid mechanics.
We explain how to incorporate the momentum balance and elasticity relations into PINN, and explore in detail the application to linear elasticity.
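A hedged, minimal PINN example for the 1D static momentum balance E*u'' + f = 0 with fixed ends is sketched below (E, f, and the network are our illustrative assumptions; the paper handles full elasticity in higher dimensions):
```python
# Minimal PINN sketch for 1D static momentum balance E*u'' + f = 0, u(0)=u(1)=0
# (a toy example; E, f, and the network are illustrative assumptions).
import torch
import torch.nn as nn

E = 1.0                                 # assumed Young's modulus
f = lambda x: torch.sin(torch.pi * x)   # assumed body-force term

net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(1000):
    x = torch.rand(128, 1, requires_grad=True)  # collocation points in (0, 1)
    u = net(x)
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    d2u = torch.autograd.grad(du.sum(), x, create_graph=True)[0]
    pde = (E * d2u + f(x)).pow(2).mean()        # momentum-balance residual
    bc = net(torch.zeros(1, 1)).pow(2).sum() + net(torch.ones(1, 1)).pow(2).sum()
    loss = pde + bc                             # physics loss + boundary loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```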
arXiv Detail & Related papers (2020-02-14T08:24:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.