The Galerkin method beats Graph-Based Approaches for Spectral Algorithms
- URL: http://arxiv.org/abs/2306.00742v3
- Date: Mon, 26 Feb 2024 09:02:54 GMT
- Title: The Galerkin method beats Graph-Based Approaches for Spectral Algorithms
- Authors: Vivien Cabannes, Francis Bach
- Abstract summary: We break with graph-based approaches and prove the statistical and computational superiority of the Galerkin method.
We introduce implementation tricks to deal with differential operators in large dimensions with structured kernels.
We extend the core principles of our approach to non-linear spaces of functions, such as the ones parameterized by deep neural networks.
- Score: 3.5897534810405403
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Historically, the machine learning community has derived spectral
decompositions from graph-based approaches. We break with this approach and
prove the statistical and computational superiority of the Galerkin method,
which consists in restricting the study to a small set of test functions. In
particular, we introduce implementation tricks to deal with differential
operators in large dimensions with structured kernels. Finally, we extend the
core principles of our approach to non-linear spaces of functions, such as the
ones parameterized by deep neural networks, through loss-based optimization
procedures.
Related papers
- Stochastic Gradient Descent for Gaussian Processes Done Right [86.83678041846971]
We show that when done right -- by which we mean using specific insights from the optimisation and kernel communities -- gradient descent is highly effective.
We introduce a stochastic dual descent algorithm, explain its design in an intuitive manner and illustrate the design choices.
Our method places Gaussian process regression on par with state-of-the-art graph neural networks for molecular binding affinity prediction.
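As a rough illustration of the dual approach (a sketch under simplifying assumptions, not the paper's exact algorithm; the Gershgorin step-size rule is an addition for stability), one can run randomized block-coordinate descent with iterate averaging on the dual objective of kernel ridge regression, whose minimizer is also the Gaussian process posterior mean coefficient vector:

```python
# A sketch of stochastic descent on the dual of kernel ridge regression:
#   g(a) = 0.5 a^T (K + lam I) a - a^T y,  minimizer a* = (K + lam I)^{-1} y,
# using randomized block-coordinate updates and Polyak iterate averaging.
import numpy as np

def stochastic_dual_descent(K, y, lam=0.1, batch=32, steps=5000, seed=0):
    rng = np.random.default_rng(seed)
    n = len(y)
    a, a_avg = np.zeros(n), np.zeros(n)
    for t in range(steps):
        idx = rng.choice(n, size=batch, replace=False)
        grad = K[idx] @ a + lam * a[idx] - y[idx]      # exact gradient on the block
        Kb = K[np.ix_(idx, idx)]
        step = 1.0 / (np.abs(Kb).sum(1).max() + lam)   # Gershgorin-safe step size
        a[idx] -= step * grad
        a_avg += (a - a_avg) / (t + 1)                 # running iterate average
    return a_avg

# toy 1-D regression with an RBF Gram matrix
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(500)
K = np.exp(-0.5 * (X - X.T) ** 2)
a = stochastic_dual_descent(K, y)
print("train RMSE of the approximate GP mean:", np.sqrt(np.mean((K @ a - y) ** 2)))
```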
arXiv Detail & Related papers (2023-10-31T16:15:13Z)
- Random feature approximation for general spectral methods [0.0]
We analyze generalization properties for a large class of spectral regularization methods combined with random features.
For our estimators we obtain optimal learning rates over gradient regularity classes.
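For context, here is a minimal sketch of the setting being analyzed (illustrative only; the helper names are invented): random Fourier features approximate an RBF kernel, and a spectral regularization method, here the Tikhonov filter 1/(s + lam), i.e. ridge regression, is applied to the eigenvalues of the empirical feature covariance.

```python
# A sketch of one spectral regularization method on top of random
# features: random Fourier features approximate the RBF kernel, and the
# Tikhonov filter q(s) = 1 / (s + lam) is applied to the eigenvalues of
# the empirical feature covariance (this special case is ridge regression).
import numpy as np

def random_fourier_features(X, n_features=200, sigma=0.5, seed=0):
    """phi(x) with phi(x) . phi(y) ~ exp(-||x - y||^2 / (2 sigma^2))."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_features)) / sigma
    b = rng.uniform(0, 2 * np.pi, n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

def tikhonov_spectral_fit(Phi, y, lam=1e-2):
    """Apply the filter q_lam to the spectrum of the feature covariance."""
    n = len(y)
    s, U = np.linalg.eigh(Phi.T @ Phi / n)
    return U @ ((U.T @ (Phi.T @ y / n)) / (s + lam))   # w = q_lam(cov) @ Phi^T y / n

rng = np.random.default_rng(2)
X = rng.uniform(-3, 3, size=(1000, 1))
y = np.sin(2 * X[:, 0]) + 0.1 * rng.standard_normal(1000)
Phi = random_fourier_features(X)
w = tikhonov_spectral_fit(Phi, y)
print("train RMSE:", np.sqrt(np.mean((Phi @ w - y) ** 2)))
```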
arXiv Detail & Related papers (2023-08-29T16:56:03Z)
- Multivariate Systemic Risk Measures and Computation by Deep Learning Algorithms [63.03966552670014]
We discuss the key related theoretical aspects, with a particular focus on the fairness properties of primal optima and associated risk allocations.
The algorithms we provide allow for learning primals, optima for the dual representation and corresponding fair risk allocations.
arXiv Detail & Related papers (2023-02-02T22:16:49Z)
- Joint Embedding Self-Supervised Learning in the Kernel Regime [21.80241600638596]
Self-supervised learning (SSL) produces useful representations of data without access to any labels for classifying the data.
We extend this framework to incorporate algorithms based on kernel methods where embeddings are constructed by linear maps acting on the feature space of a kernel.
We analyze our kernel model on small datasets to identify common features of self-supervised learning algorithms and gain theoretical insights into their performance on downstream tasks.
arXiv Detail & Related papers (2022-09-29T15:53:19Z)
- NeuralEF: Deconstructing Kernels by Deep Neural Networks [47.54733625351363]
Traditional nonparametric solutions based on the Nyström formula suffer from scalability issues.
Recent work has resorted to a parametric approach, i.e., training neural networks to approximate the eigenfunctions.
We show that these problems can be fixed by using a new series of objective functions that generalizes to both supervised and unsupervised learning problems.
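To illustrate the parametric approach in miniature (a stand-in sketch, not NeuralEF's actual objective: a linear model on fixed random features plays the role of the neural network, and a least-squares power iteration with deflation plays the role of the training loss):

```python
# A stand-in sketch of parametric eigenfunction learning: a linear model
# on fixed random features ("the network") is repeatedly refit by least
# squares to track the power iteration f <- K f, with deflation to
# recover several eigenpairs of the kernel matrix.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(800, 1))
K = np.exp(-((X - X.T) ** 2) / 0.5)                 # Gram matrix to deconstruct
W, b = 3.0 * rng.standard_normal((1, 64)), rng.uniform(0, 2 * np.pi, 64)
Phi = np.cos(X @ W + b)                             # fixed random-feature "network"

def fit_eigenfunction(K, Phi, deflate=(), iters=100):
    f = rng.standard_normal(len(K))
    for _ in range(iters):
        target = K @ f
        for g in deflate:                           # project out earlier modes
            target -= (target @ g) * g              # each g is unit-normalized
        w, *_ = np.linalg.lstsq(Phi, target, rcond=None)
        f = Phi @ w
        f /= np.linalg.norm(f)
    return f, f @ K @ f                             # values and Rayleigh quotient

modes, eigs = [], []
for _ in range(3):
    f, lam = fit_eigenfunction(K, Phi, deflate=modes)
    modes.append(f)
    eigs.append(lam)
print("parametric estimates:", np.round(eigs, 2))
print("exact top-3         :", np.round(np.linalg.eigvalsh(K)[-3:][::-1], 2))
```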
arXiv Detail & Related papers (2022-04-30T05:31:07Z)
- Reinforcement Learning from Partial Observation: Linear Function Approximation with Provable Sample Efficiency [111.83670279016599]
We study reinforcement learning for partially observable Markov decision processes (POMDPs) with infinite observation and state spaces.
We make the first attempt to combine partial observability and function approximation for a class of POMDPs with a linear structure.
arXiv Detail & Related papers (2022-04-20T21:15:38Z)
- On the Benefits of Large Learning Rates for Kernel Methods [110.03020563291788]
We show that the benefit of large learning rates can be precisely characterized in the context of kernel methods.
We consider the minimization of a quadratic objective in a separable Hilbert space, and show that with early stopping, the choice of learning rate influences the spectral decomposition of the obtained solution.
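A tiny numerical illustration of that claim (assuming a finite-dimensional diagonal quadratic as a stand-in for the separable Hilbert space):

```python
# Gradient descent on a diagonal quadratic 0.5 <f, H f> - <b, f>:
# after t steps from zero, each spectral component satisfies
#   f_i(t) = (1 - (1 - lr * h_i)^t) * b_i / h_i,
# so with early stopping the learning rate decides which spectral
# directions of the solution are resolved.
import numpy as np

h = np.array([1.0, 0.1, 0.01])       # operator eigenvalues
b = np.ones(3)

def gd(lr, steps):
    f = np.zeros(3)
    for _ in range(steps):
        f -= lr * (h * f - b)        # gradient step in the eigenbasis
    return f

for lr in (0.1, 1.0, 1.9):           # stable whenever lr < 2 / max(h)
    print(f"lr={lr}:", np.round(gd(lr, steps=20), 2))
# larger (but still stable) rates resolve small-eigenvalue directions far
# earlier, so the early-stopped solutions differ in spectral content
```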
arXiv Detail & Related papers (2022-02-28T13:01:04Z)
- Fractal Structure and Generalization Properties of Stochastic Optimization Algorithms [71.62575565990502]
We prove that the generalization error of an optimization algorithm can be bounded in terms of the "complexity" of the fractal structure that underlies its generalization measure.
We further specialize our results to specific problems (e.g., linear/logistic regression, one-hidden-layer neural networks) and algorithms.
arXiv Detail & Related papers (2021-06-09T08:05:36Z)
- Galerkin Neural Networks: A Framework for Approximating Variational Equations with Error Control [0.0]
We present a new approach to using neural networks to approximate the solutions of variational equations.
We use a sequence of finite-dimensional subspaces whose basis functions are realizations of a sequence of neural networks.
arXiv Detail & Related papers (2021-05-28T20:25:40Z)
- Intrinsic Gaussian Processes on Manifolds and Their Accelerations by Symmetry [9.773237080061815]
Existing methods primarily focus on low-dimensional constrained domains for heat kernel estimation.
Our research proposes an intrinsic approach for constructing GPs on general manifolds.
Our methodology estimates the heat kernel by simulating Brownian motion sample paths using the exponential map.
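A minimal sketch of that simulation (assuming the manifold is the unit sphere S^2, where the exponential map has a closed form; an illustration of the idea, not the paper's code):

```python
# Monte Carlo heat-kernel estimation on S^2: simulate Brownian motion
# by taking Gaussian steps in the tangent space and pushing them back
# to the sphere with the exponential map, then count path endpoints.
import numpy as np

def exp_map(X, V):
    """Row-wise exponential map on the unit sphere S^2."""
    nv = np.maximum(np.linalg.norm(V, axis=1, keepdims=True), 1e-12)
    Y = np.cos(nv) * X + np.sin(nv) * (V / nv)
    return Y / np.linalg.norm(Y, axis=1, keepdims=True)   # fight float drift

def brownian_endpoints(x0, t=0.5, n_steps=100, n_paths=20000, seed=0):
    """Euler scheme for Brownian motion on S^2 started at x0."""
    rng = np.random.default_rng(seed)
    X = np.tile(x0, (n_paths, 1))
    dt = t / n_steps
    for _ in range(n_steps):
        g = rng.standard_normal((n_paths, 3)) * np.sqrt(dt)
        v = g - (g * X).sum(1, keepdims=True) * X   # tangent-space projection
        X = exp_map(X, v)
    return X

def heat_kernel_estimate(x0, y, t=0.5, eps=0.2, **kw):
    """p_t(x0, y) ~ (fraction of endpoints in a geodesic eps-ball at y)
    divided by the area of the corresponding spherical cap."""
    X = brownian_endpoints(x0, t=t, **kw)
    dist = np.arccos(np.clip(X @ y, -1.0, 1.0))
    cap_area = 2 * np.pi * (1 - np.cos(eps))
    return (dist < eps).mean() / cap_area

x0 = np.array([0.0, 0.0, 1.0])
print("estimated p_t(x0, x0) on S^2:", heat_kernel_estimate(x0, x0))
```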
arXiv Detail & Related papers (2020-06-25T09:17:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.