Learning to Extrapolate: A Transductive Approach
- URL: http://arxiv.org/abs/2304.14329v1
- Date: Thu, 27 Apr 2023 17:00:51 GMT
- Title: Learning to Extrapolate: A Transductive Approach
- Authors: Aviv Netanyahu, Abhishek Gupta, Max Simchowitz, Kaiqing Zhang, Pulkit
Agrawal
- Abstract summary: We tackle the problem of developing machine learning systems that retain the power of overparameterized function approximators.
We propose a simple strategy based on bilinear embeddings to enable this type of generalization.
We instantiate a simple, practical algorithm applicable to various supervised learning and imitation learning tasks.
- Score: 44.74850954809099
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning systems, especially with overparameterized deep neural
networks, can generalize to novel test instances drawn from the same
distribution as the training data. However, they fare poorly when evaluated on
out-of-support test points. In this work, we tackle the problem of developing
machine learning systems that retain the power of overparameterized function
approximators while enabling extrapolation to out-of-support test points when
possible. This is accomplished by noting that under certain conditions, a
"transductive" reparameterization can convert an out-of-support extrapolation
problem into a problem of within-support combinatorial generalization. We
propose a simple strategy based on bilinear embeddings to enable this type of
combinatorial generalization, thereby addressing the out-of-support
extrapolation problem under certain conditions. We instantiate a simple,
practical algorithm applicable to various supervised learning and imitation
learning tasks.
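
The bilinear-embedding strategy described above lends itself to a short illustration. The following is a minimal PyTorch-style sketch, not the authors' implementation: the class and function names (BilinearTransducer, phi, psi, train_step), the network sizes, and the random-anchor pairing rule are illustrative assumptions. One natural reading of the transductive reparameterization, assumed here, is to represent each query relative to a training anchor so that the (difference, anchor) pair stays within the support seen during training.

```python
import torch
import torch.nn as nn

class BilinearTransducer(nn.Module):
    """Predict f(x) from a (difference, anchor) pair via a bilinear form."""
    def __init__(self, x_dim, embed_dim=32, hidden=64):
        super().__init__()
        # phi embeds the difference x - x_anchor; psi embeds the anchor itself.
        self.phi = nn.Sequential(nn.Linear(x_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, embed_dim))
        self.psi = nn.Sequential(nn.Linear(x_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, embed_dim))

    def forward(self, x, x_anchor):
        # Transductive reparameterization: describe the query relative to an
        # anchor, then combine the two embeddings bilinearly.
        delta = x - x_anchor
        return (self.phi(delta) * self.psi(x_anchor)).sum(dim=-1, keepdim=True)

def train_step(model, optimizer, x_batch, y_batch):
    # Pairs of training points supply both anchors and differences, so each
    # factor of the bilinear form is evaluated within the training support.
    anchors = x_batch[torch.randperm(x_batch.shape[0])]
    pred = model(x_batch, anchors)
    loss = nn.functional.mse_loss(pred, y_batch)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# At test time, an out-of-support query is paired with a training anchor whose
# difference from the query resembles differences already seen during training,
# turning extrapolation into within-support combinatorial generalization.
```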
Related papers
- On Distributional Dependent Performance of Classical and Neural Routing Solvers [5.359176539960004]
NCO aims to learn to solve a class of problems by learning the underlying distribution of problem instances.
This work explores a novel approach to formulating the distribution of problem instances to learn from and, more importantly, planting a structure in the sampled problem instances.
We evaluate representative NCO methods and specialized Operations Research meta-heuristics on this novel task and demonstrate that the performance gap between neural routing solvers and highly specialized meta-heuristics decreases when learning from sub-samples drawn from a fixed base node distribution.
arXiv Detail & Related papers (2025-08-04T15:17:08Z) - Regularization, early-stopping and dreaming: a Hopfield-like setup to
address generalization and overfitting [0.0]
We look for optimal network parameters by applying a gradient descent over a regularized loss function.
Within this framework, the optimal neuron-interaction matrices correspond to Hebbian kernels revised by a reiterated unlearning protocol.
arXiv Detail & Related papers (2023-08-01T15:04:30Z) - Learning Functional Transduction [9.926231893220063]
We show that transductive regression principles can be meta-learned through gradient descent to form efficient in-context neural approximators.
We demonstrate the benefit of our meta-learned transductive approach to model complex physical systems influenced by varying external factors with little data.
arXiv Detail & Related papers (2023-02-01T09:14:28Z) - HyperImpute: Generalized Iterative Imputation with Automatic Model
Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z) - Debiased Machine Learning without Sample-Splitting for Stable Estimators [21.502538698559825]
Recent work on debiased machine learning shows how one can use generic machine learning estimators for auxiliary problems.
We show that when these auxiliary estimation algorithms satisfy natural leave-one-out stability properties, then sample splitting is not required.
arXiv Detail & Related papers (2022-06-03T21:31:28Z) - Transformer for Partial Differential Equations' Operator Learning [0.0]
We present an attention-based framework for data-driven operator learning, which we term Operator Transformer (OFormer).
Our framework is built upon self-attention, cross-attention, and a set of point-wise multilayer perceptrons (MLPs).
arXiv Detail & Related papers (2022-05-26T23:17:53Z) - Generalization of Neural Combinatorial Solvers Through the Lens of
Adversarial Robustness [68.97830259849086]
Most datasets only capture a simpler subproblem and likely suffer from spurious features.
We study adversarial robustness - a local generalization property - to reveal hard, model-specific instances and spurious features.
Unlike in other applications, where perturbation models are designed around subjective notions of imperceptibility, our perturbation models are efficient and sound.
Surprisingly, with such perturbations, a sufficiently expressive neural solver does not suffer from the limitations of the accuracy-robustness trade-off common in supervised learning.
arXiv Detail & Related papers (2021-10-21T07:28:11Z) - A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z) - Model-Aware Regularization For Learning Approaches To Inverse Problems [11.314492463814817]
We provide an analysis of the generalisation error of deep learning methods applicable to inverse problems.
We propose a 'plug-and-play' regulariser that leverages the knowledge of the forward map to improve the generalization of the network.
We demonstrate the efficacy of our model-aware regularised deep learning algorithms against other state-of-the-art approaches.
arXiv Detail & Related papers (2020-06-18T21:59:03Z) - Total Deep Variation: A Stable Regularizer for Inverse Problems [71.90933869570914]
We introduce the data-driven general-purpose total deep variation regularizer.
In its core, a convolutional neural network extracts local features on multiple scales and in successive blocks.
We achieve state-of-the-art results for numerous imaging tasks.
arXiv Detail & Related papers (2020-06-15T21:54:15Z) - Learning What Makes a Difference from Counterfactual Examples and
Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally-different examples with different labels, a.k.a. counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets (a hedged sketch of such an auxiliary objective appears after this list).
arXiv Detail & Related papers (2020-04-20T02:47:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.