Function Encoders: A Principled Approach to Transfer Learning in Hilbert Spaces
- URL: http://arxiv.org/abs/2501.18373v1
- Date: Thu, 30 Jan 2025 14:26:23 GMT
- Title: Function Encoders: A Principled Approach to Transfer Learning in Hilbert Spaces
- Authors: Tyler Ingebrand, Adam J. Thorpe, Ufuk Topcu
- Abstract summary: We introduce a geometric characterization of transfer in Hilbert spaces and define three types of inductive transfer.
We propose a method grounded in the theory of function encoders to achieve all three types of transfer.
Our experiments demonstrate that the function encoder outperforms state-of-the-art methods on four benchmark tasks and on all three types of transfer.
- Score: 16.849614123426772
- Abstract: A central challenge in transfer learning is designing algorithms that can quickly adapt and generalize to new tasks without retraining. Yet the conditions under which algorithms can effectively transfer to new tasks are poorly characterized. We introduce a geometric characterization of transfer in Hilbert spaces and define three types of inductive transfer: interpolation within the convex hull, extrapolation to the linear span, and extrapolation outside the span. We propose a method grounded in the theory of function encoders to achieve all three types of transfer. Specifically, we introduce a novel training scheme for function encoders using least-squares optimization, prove a universal approximation theorem for function encoders, and provide a comprehensive comparison with existing approaches such as transformers and meta-learning on four diverse benchmarks. Our experiments demonstrate that the function encoder outperforms state-of-the-art methods on four benchmark tasks and on all three types of transfer.
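To make the mechanism concrete: a function encoder represents functions as linear combinations of k learned basis functions, and adapting to a new task reduces to solving a least-squares problem for the combination coefficients, with no gradient updates. Below is a minimal numpy sketch of that coefficient-recovery step; the random Fourier features standing in for the learned neural bases, and the toy task sin(2x), are illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for k learned basis functions g_1, ..., g_k : R -> R.
# Random Fourier features here; the paper trains neural network bases.
k = 32
freqs = 2.0 * rng.normal(size=k)
phases = rng.uniform(0, 2 * np.pi, size=k)

def basis(x):
    """Evaluate all k basis functions at points x; returns (len(x), k)."""
    return np.cos(np.outer(x, freqs) + phases)

# Example data from a new, unseen task: f(x) = sin(2x) plus noise.
x_ex = rng.uniform(-3, 3, size=50)
y_ex = np.sin(2 * x_ex) + 0.05 * rng.normal(size=50)

# Transfer step: recover coefficients c by least squares -- no retraining.
G = basis(x_ex)                                # (50, k) design matrix
c, *_ = np.linalg.lstsq(G, y_ex, rcond=None)

# Predict at query points as the linear combination sum_j c_j g_j(x).
x_q = np.linspace(-3, 3, 5)
print(basis(x_q) @ c)                          # close to sin(2 * x_q)
```

In the paper's geometric terms, tasks in the span of the bases are recovered up to noise, while tasks outside the span receive their best projection onto the learned subspace.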
Related papers
- State-Free Inference of State-Space Models: The Transfer Function Approach [132.83348321603205]
State-free inference does not incur any significant memory or computational cost with an increase in state size.
We achieve this using properties of the proposed frequency domain transfer function parametrization.
We report improved perplexity in language modeling over a long convolutional Hyena baseline.
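The state-free idea can be sketched directly: for a linear time-invariant SSM, the output is a convolution of the input with the impulse response, so one can evaluate the transfer function H(z) = C(zI - A)^{-1}B on the FFT grid and filter by pointwise multiplication in the frequency domain, never materializing the state. A minimal numpy sketch with a toy diagonal SSM (the diagonal parametrization and all dimensions are illustrative assumptions, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)
L, n = 256, 16                         # sequence length, state size

# Toy stable diagonal SSM: x_{t+1} = A x_t + B u_t, y_t = C x_t.
a = rng.uniform(0.5, 0.9, n)           # diagonal of A, |a| < 1
B = rng.normal(size=n)
C = rng.normal(size=n)
u = rng.normal(size=L)                 # input sequence

# State-free path: sample H(z) = C (zI - A)^{-1} B at FFT nodes, then
# filter in the frequency domain -- the state x never appears.
z = np.exp(2j * np.pi * np.fft.fftfreq(2 * L))          # unit-circle nodes
H = np.array([(C / (zi - a)) @ B for zi in z])          # diagonal resolvent
y_fft = np.fft.ifft(np.fft.fft(u, 2 * L) * H)[:L].real

# Reference path: explicit state recursion, memory grows with n.
x, y_rec = np.zeros(n), np.zeros(L)
for t in range(L):
    y_rec[t] = C @ x                   # read out before update (no D term)
    x = a * x + B * u[t]

print(np.allclose(y_fft, y_rec, atol=1e-6))             # True
```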
arXiv Detail & Related papers (2024-05-10T00:06:02Z)
- EulerFormer: Sequential User Behavior Modeling with Complex Vector Attention [88.45459681677369]
We propose a novel transformer variant with complex vector attention, named EulerFormer.
It provides a unified theoretical framework to formulate both semantic difference and positional difference.
It is more robust to semantic variations and possesses superior theoretical properties in principle.
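The complex-vector mechanism is closely related to rotary position embeddings: adjacent feature pairs are viewed as complex numbers in Euler (polar) form and rotated by a position-dependent phase, so one complex inner product carries both semantic difference (vector content) and positional difference (phase gap). A simplified numpy sketch of that rotation step (an assumption-level simplification; EulerFormer's adaptive phase treatment is omitted):

```python
import numpy as np

rng = np.random.default_rng(0)
L, d = 6, 8                           # sequence length, head dim (even)
q, k = rng.normal(size=(L, d)), rng.normal(size=(L, d))

def to_complex(x):
    """Pair up dimensions: (L, d) real -> (L, d//2) complex."""
    return x[:, 0::2] + 1j * x[:, 1::2]

# One rotation frequency per complex dimension, phase grows with position.
freqs = 1.0 / (10_000 ** (np.arange(d // 2) / (d // 2)))
phase = np.exp(1j * np.outer(np.arange(L), freqs))    # e^{i * pos * freq}

qc, kc = to_complex(q) * phase, to_complex(k) * phase

# Re<q_m e^{imf}, k_n e^{inf}> depends on content and on the gap (m - n),
# unifying semantic and positional difference in one score.
scores = np.real(qc @ kc.conj().T) / np.sqrt(d // 2)
attn = np.exp(scores) / np.exp(scores).sum(-1, keepdims=True)
print(attn.shape)                                     # (6, 6) attention map
```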
arXiv Detail & Related papers (2024-03-26T14:18:43Z)
- Invertible Fourier Neural Operators for Tackling Both Forward and Inverse Problems [18.48295539583625]
We propose an invertible Fourier Neural Operator (iFNO) that tackles both the forward and inverse problems.
We integrate a variational auto-encoder to capture the intrinsic structures within the input space and to enable posterior inference.
The evaluations on five benchmark problems have demonstrated the effectiveness of our approach.
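The structural point is invertibility: a bijective operator answers forward queries by evaluation and inverse queries by running the map backwards. A generic numpy sketch of the affine-coupling construction that yields such bijections (the standard normalizing-flow trick, shown as an assumption-level illustration; iFNO builds its invertible blocks inside a Fourier neural operator and adds the VAE, both omitted here):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8                                    # feature dimension, split in half
W = rng.normal(size=(d // 2, d // 2))    # toy stand-ins for the scale and
V = rng.normal(size=(d // 2, d // 2))    # shift networks

def s(x): return np.tanh(x @ W)          # log-scale network
def t(x): return np.tanh(x @ V)          # shift network

def forward(z):
    """Affine coupling: transform half of z conditioned on the other half."""
    z1, z2 = z[: d // 2], z[d // 2 :]
    return np.concatenate([z1, z2 * np.exp(s(z1)) + t(z1)])

def inverse(y):
    """Exact inverse: y1 passes through unchanged, so s, t are recomputable."""
    y1, y2 = y[: d // 2], y[d // 2 :]
    return np.concatenate([y1, (y2 - t(y1)) * np.exp(-s(y1))])

z = rng.normal(size=d)
print(np.allclose(inverse(forward(z)), z))   # True: inversion comes for free
```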
arXiv Detail & Related papers (2024-02-18T22:16:43Z)
- In-Context Convergence of Transformers [63.04956160537308]
We study the learning dynamics of a one-layer transformer with softmax attention trained via gradient descent.
For data with imbalanced features, we show that the learning dynamics take a stage-wise convergence process.
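The studied object is small enough to write out: a single softmax-attention head reading an in-context prompt, trained by gradient descent. The sketch below shrinks the trainable weights to one attention-temperature parameter w with its analytic gradient (a toy reduction for illustration; the paper analyzes the full one-layer weight dynamics):

```python
import numpy as np

rng = np.random.default_rng(0)

def attention_predict(w, X, y, x_q):
    """Softmax attention: the query attends to context points by similarity
    and outputs the attention-weighted average of their labels."""
    s = X @ x_q                             # similarity logits
    a = np.exp(w * s - (w * s).max())       # stable softmax
    a /= a.sum()
    return a @ y, a, s

w, lr = 0.1, 0.01
for step in range(300):                     # gradient descent over prompts
    X = rng.normal(size=(20, 5))            # fresh task: y = <beta, x>
    beta = rng.normal(size=5)
    y, x_q = X @ beta, rng.normal(size=5)
    pred, a, s = attention_predict(w, X, y, x_q)
    # Analytic gradient of (pred - y_q)^2 through the softmax, w.r.t. w.
    dpred_dw = a @ (y * s) - (a @ y) * (a @ s)
    w -= lr * 2 * (pred - beta @ x_q) * dpred_dw

print(w)   # learned attention temperature after training
```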
arXiv Detail & Related papers (2023-10-08T17:55:33Z)
- Transformers as Statisticians: Provable In-Context Learning with In-Context Algorithm Selection [88.23337313766353]
This work first provides a comprehensive statistical theory for transformers to perform ICL.
We show that transformers can implement a broad class of standard machine learning algorithms in context.
A single transformer can adaptively select different base ICL algorithms.
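Algorithm selection has a simple explicit form, which the paper shows a transformer can emulate internally: run several base predictors on part of the in-context examples, score them on the held-out remainder of the same prompt, and answer with the winner. A numpy sketch of that selection loop (ridge regression at a few regularization levels is an illustrative choice of base ICL algorithms):

```python
import numpy as np

rng = np.random.default_rng(0)

def ridge(X, y, lam):
    """Base ICL algorithm: ridge regression with regularization lam."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# An in-context prompt: examples from one noisy linear task.
X = rng.normal(size=(40, 8))
y = X @ rng.normal(size=8) + 0.5 * rng.normal(size=40)

# In-context algorithm selection: validate each candidate on held-out
# examples from the same prompt, then predict with the best one.
(Xt, yt), (Xv, yv) = (X[:30], y[:30]), (X[30:], y[30:])
lams = [1e-3, 1e-1, 1.0, 10.0]
errs = [np.mean((Xv @ ridge(Xt, yt, lam) - yv) ** 2) for lam in lams]
best = lams[int(np.argmin(errs))]

x_query = rng.normal(size=8)
print(best, x_query @ ridge(X, y, best))   # selected lam, query prediction
```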
arXiv Detail & Related papers (2023-06-07T17:59:31Z)
- Learning Functional Transduction [9.926231893220063]
We show that transductive regression principles can be meta-learned through gradient descent to form efficient in-context neural approximators.
We demonstrate the benefit of our meta-learned transductive approach to model complex physical systems influenced by varying external factors with little data.
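Transduction here means predicting directly from the provided examples rather than from separately fitted global parameters; the textbook instance is Nadaraya-Watson kernel regression, sketched below. (The paper meta-learns such an in-context mapping with a neural model; the fixed Gaussian kernel and the damped-oscillation toy system are assumptions for illustration.)

```python
import numpy as np

rng = np.random.default_rng(0)

def transduce(X, y, X_query, bandwidth=0.5):
    """Nadaraya-Watson: predictions are kernel-weighted averages of the
    given examples -- nothing is fit to this particular dataset."""
    d2 = ((X_query[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2 * bandwidth**2))
    return (w @ y) / w.sum(axis=1)

# Few-shot data from a toy "physical system": a damped oscillation.
X = rng.uniform(0, 4, size=(30, 1))
y = np.exp(-X[:, 0]) * np.sin(5 * X[:, 0])

X_q = np.linspace(0, 4, 5)[:, None]
print(transduce(X, y, X_q))   # predictions straight from the context set
```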
arXiv Detail & Related papers (2023-02-01T09:14:28Z)
- Quantifying and Improving Transferability in Domain Generalization [53.16289325326505]
Out-of-distribution generalization is one of the key challenges when transferring a model from the lab to the real world.
We formally define a notion of transferability that can be quantified and computed in domain generalization.
We propose a new algorithm for learning transferable features and test it over various benchmark datasets.
arXiv Detail & Related papers (2021-06-07T14:04:32Z)
- Choose a Transformer: Fourier or Galerkin [0.0]
We apply the self-attention mechanism of the Transformer from "Attention Is All You Need" to a data-driven operator learning problem.
We show that softmax normalization in the scaled dot-product attention is sufficient but not necessary, and prove the approximation capacity of a linear variant as a Petrov-Galerkin projection.
We present three operator learning experiments, including the viscid Burgers' equation, an interface Darcy flow, and an inverse interface coefficient identification problem.
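The linear variant is easy to state: drop the softmax, layer-normalize keys and values, and use associativity to compute Q(K^T V)/n, which is linear in sequence length and reads as a learned Petrov-Galerkin projection onto the basis spanned by the value columns. A numpy sketch contrasting the two attention forms (shapes and the bare layer norm are simplifying assumptions; the paper's layers include further projections):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 512, 32                        # sequence length (grid points), width
Q, K, V = rng.normal(size=(3, n, d))

def layer_norm(X):
    return (X - X.mean(-1, keepdims=True)) / X.std(-1, keepdims=True)

# Standard scaled dot-product attention: O(n^2) score matrix plus softmax.
S = Q @ K.T / np.sqrt(d)
softmax_attn = (np.exp(S) / np.exp(S).sum(-1, keepdims=True)) @ V

# Galerkin-type attention: no softmax; normalize K, V and reassociate to
# Q @ (K^T V) / n, costing O(n d^2) -- linear in sequence length.
galerkin_attn = Q @ (layer_norm(K).T @ layer_norm(V)) / n

print(softmax_attn.shape, galerkin_attn.shape)   # both (512, 32)
```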
arXiv Detail & Related papers (2021-05-31T14:30:53Z)
- Real-Time Topology Optimization in 3D via Deep Transfer Learning [0.0]
We introduce a transfer learning method based on a convolutional neural network.
We show it can handle high-resolution 3D design domains of various shapes and topologies.
Our experiments achieved an average binary accuracy of around 95% at real-time prediction rates.
arXiv Detail & Related papers (2021-02-11T21:09:58Z)
- Uniform Priors for Data-Efficient Transfer [65.086680950871]
We show that features that are most transferable have high uniformity in the embedding space.
We evaluate the regularization on its ability to facilitate adaptation to unseen tasks and data.
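One standard way to turn "high uniformity in the embedding space" into a trainable regularizer is a pairwise-repulsion loss on the unit sphere, such as the log of the average Gaussian potential between embeddings (the Wang and Isola uniformity loss). Whether this matches the paper's exact regularizer is an assumption; the sketch is meant to show the mechanism:

```python
import numpy as np

rng = np.random.default_rng(0)

def uniformity_loss(Z, t=2.0):
    """Log mean Gaussian potential between L2-normalized embeddings.
    Lower values = embeddings spread more uniformly over the sphere."""
    Z = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    d2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    off_diag = d2[~np.eye(len(Z), dtype=bool)]
    return np.log(np.mean(np.exp(-t * off_diag)))

collapsed = 1.0 + 0.01 * rng.normal(size=(128, 16))   # clustered embeddings
spread = rng.normal(size=(128, 16))                   # near-uniform on sphere

print(uniformity_loss(collapsed) > uniformity_loss(spread))   # True
```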
arXiv Detail & Related papers (2020-06-30T04:39:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences arising from its use.