Related papers: FAME: Adaptive Functional Attention with Expert Routing for Function-on-Function Regression

URL: http://arxiv.org/abs/2510.00621v1
Date: Wed, 01 Oct 2025 07:53:55 GMT
Title: FAME: Adaptive Functional Attention with Expert Routing for Function-on-Function Regression
Authors: Yifei Gao, Yong Chen, Chen Zhang
Abstract summary: Functional Attention with a Mixture-of-Experts (FAME) is an end-to-end, fully data-driven framework for function-on-function regression. FAME forms continuous attention by coupling a neural controlled differential equation with MoE-driven vector fields to capture intra-functional continuity. Experiments on synthetic and real-world functional-regression benchmarks show that FAME achieves state-of-the-art accuracy and strong robustness to arbitrarily sampled discrete observations.
Score: 15.00767095565706
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Functional data play a pivotal role across science and engineering, yet their infinite-dimensional nature makes representation learning challenging. Conventional statistical models depend on pre-chosen basis expansions or kernels, limiting the flexibility of data-driven discovery, while many deep-learning pipelines treat functions as fixed-grid vectors, ignoring inherent continuity. In this paper, we introduce Functional Attention with a Mixture-of-Experts (FAME), an end-to-end, fully data-driven framework for function-on-function regression. FAME forms continuous attention by coupling a bidirectional neural controlled differential equation with MoE-driven vector fields to capture intra-functional continuity, and further fuses change to inter-functional dependencies via multi-head cross attention. Extensive experiments on synthetic and real-world functional-regression benchmarks show that FAME achieves state-of-the-art accuracy and strong robustness to arbitrarily sampled discrete observations of functions.
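Illustration: the abstract describes a controlled differential equation whose vector field is produced by a mixture of experts. The following is a minimal sketch of that general idea only, not the authors' implementation. It assumes an explicit Euler discretization of dz = f(z) dX, linear experts with a tanh nonlinearity, and a softmax gate; all names (`moe_vector_field`, `euler_cde`, `experts`, `gate_w`) and sizes are hypothetical and chosen for illustration.

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

def moe_vector_field(z, experts, gate_w):
    """Gate-weighted mix of expert fields; returns a (hidden, channels) matrix."""
    weights = softmax(gate_w @ z)                            # (n_experts,)
    fields = np.tanh(np.einsum("ehcd,d->ehc", experts, z))   # (n_experts, hidden, channels)
    return np.einsum("e,ehc->hc", weights, fields)

def euler_cde(ts, xs, z0, experts, gate_w):
    """Explicit Euler steps for dz = f(z) dX on an irregular time grid."""
    # Time is appended as a channel so uneven spacing enters the control path.
    path = np.column_stack([ts, xs])                         # (n_obs, channels)
    z = z0.copy()
    states = [z.copy()]
    for k in range(len(ts) - 1):
        dX = path[k + 1] - path[k]                           # increment of the control path
        z = z + moe_vector_field(z, experts, gate_w) @ dX
        states.append(z.copy())
    return np.stack(states)

# Hypothetical sizes and random parameters, for illustration only.
rng = np.random.default_rng(0)
hidden, channels, n_experts = 4, 2, 3
experts = rng.normal(scale=0.5, size=(n_experts, hidden, channels, hidden))
gate_w = rng.normal(size=(n_experts, hidden))

ts = np.sort(rng.uniform(0.0, 1.0, size=20))                 # arbitrarily sampled times
xs = np.sin(2 * np.pi * ts)                                  # observed function values
states = euler_cde(ts, xs, 0.1 * np.ones(hidden), experts, gate_w)
print(states.shape)
```

Because the hidden state is driven by increments of the observed path rather than by grid indices, the same code runs unchanged on any set of sample times. A bidirectional variant, as mentioned in the abstract, could run the same recursion on the time-reversed path; that step is omitted here.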

Related papers

A Doubly Robust Machine Learning Approach for Disentangling Treatment Effect Heterogeneity with Functional Outcomes [0.7646713951724009]
We introduce FOCaL (Functional Outcome Causal Learning), a doubly robust meta-learner specifically engineered to estimate a functional heterogeneous treatment effect (F-CATE). FOCaL integrates advanced functional regression techniques for both outcome modeling and functional pseudo-outcome reconstruction, thereby enabling the direct and robust estimation of F-CATE. FOCaL advances the capabilities of machine intelligence to infer nuanced, individualized causal effects from complex data.
arXiv Detail & Related papers (2026-02-11T18:31:59Z)
When Bayesian Tensor Completion Meets Multioutput Gaussian Processes: Functional Universality and Rank Learning [53.17227599983122]
Functional tensor decomposition can analyze multi-dimensional data with real-valued indices. We propose a rank-revealing functional low-rank tensor completion (RR-F) method. We establish the universal approximation property of the model for continuous multi-dimensional signals.
arXiv Detail & Related papers (2025-12-25T03:15:52Z)
Shape-Informed Clustering of Multi-Dimensional Functional Data via Deep Functional Autoencoders [3.899824115379245]
FAEclust is a novel functional autoencoder framework for cluster analysis of multi-dimensional functional data. We introduce a universal-approximator encoder that captures complex nonlinear interdependencies among component functions, and a universal-approximator decoder capable of accurately reconstructing both Euclidean and manifold-valued functional data.
arXiv Detail & Related papers (2025-09-26T22:10:23Z)
Provable In-Context Learning of Nonlinear Regression with Transformers [66.99048542127768]
In-context learning (ICL) is the ability to perform unseen tasks using task-specific prompts without updating parameters. Recent research has actively explored the training dynamics behind ICL, with much of the focus on relatively simple tasks. This paper investigates more complex nonlinear regression tasks, aiming to uncover how transformers acquire in-context learning capabilities.
arXiv Detail & Related papers (2025-07-28T00:09:28Z)
Function Forms of Simple ReLU Networks with Random Hidden Weights [1.2289361708127877]
We investigate the function space dynamics of a two-layer ReLU neural network in the infinite-width limit. We highlight the Fisher information matrix's role in steering learning. This work offers a robust foundation for understanding wide neural networks.
arXiv Detail & Related papers (2025-05-23T13:53:02Z)
Q-function Decomposition with Intervention Semantics with Factored Action Spaces [51.01244229483353]
We consider Q-functions defined over a lower dimensional projected subspace of the original action space, and study the condition for the unbiasedness of decomposed Q-functions. This leads to a general scheme which we call action decomposed reinforcement learning that uses the projected Q-functions to approximate the Q-function in standard model-free reinforcement learning algorithms.
arXiv Detail & Related papers (2025-04-30T05:26:51Z)
Bayesian Kernel Regression for Functional Data [1.4501446815590895]
In supervised learning, the output variable to be predicted is often represented as a function. We propose a novel functional output regression model based on kernel methods.
arXiv Detail & Related papers (2025-03-17T19:28:27Z)
Offline Reinforcement Learning with Differentiable Function Approximation is Provably Efficient [65.08966446962845]
Offline reinforcement learning, which aims at optimizing decision-making strategies with historical data, has been extensively applied in real-life applications. We take a step by considering offline reinforcement learning with differentiable function class approximation (DFA). Most importantly, we show offline differentiable function approximation is provably efficient by analyzing the pessimistic fitted Q-learning algorithm.
arXiv Detail & Related papers (2022-10-03T07:59:42Z)
Modern Non-Linear Function-on-Function Regression [8.231050911072755]
We introduce a new class of non-linear function-on-function regression models for functional data using neural networks. We give two model fitting strategies, Functional Direct Neural Network (FDNN) and Functional Basis Neural Network (FBNN).
arXiv Detail & Related papers (2021-07-29T16:19:59Z)
A New Representation of Successor Features for Transfer across Dissimilar Environments [60.813074750879615]
Many real-world RL problems require transfer among environments with different dynamics. We propose an approach based on successor features in which we model successor feature functions with Gaussian Processes. Our theoretical analysis proves the convergence of this approach as well as the bounded error on modelling successor feature functions.
arXiv Detail & Related papers (2021-07-18T12:37:05Z)
Estimating Structural Target Functions using Machine Learning and Influence Functions [103.47897241856603]
We propose a new framework for statistical machine learning of target functions arising as identifiable functionals from statistical models. This framework is problem- and model-agnostic and can be used to estimate a broad variety of target parameters of interest in applied statistics. We put particular focus on so-called coarsening at random/doubly robust problems with partially unobserved information.
arXiv Detail & Related papers (2020-08-14T16:48:29Z)
UNIPoint: Universally Approximating Point Processes Intensities [125.08205865536577]
We provide a proof that a class of learnable functions can universally approximate any valid intensity function. We implement UNIPoint, a novel neural point process model, using recurrent neural networks to parameterise sums of basis functions upon each event.
arXiv Detail & Related papers (2020-07-28T09:31:56Z)

This list is automatically generated from the titles and abstracts of the papers on this site.