Related papers: Capacity dependent analysis for functional online learning algorithms

URL: http://arxiv.org/abs/2209.12198v1
Date: Sun, 25 Sep 2022 11:21:18 GMT
Title: Capacity dependent analysis for functional online learning algorithms
Authors: Xin Guo, Zheng-Chu Guo, Lei Shi
Abstract summary: This article provides convergence analysis of online gradient descent algorithms for functional linear models. We show that capacity assumptions can alleviate the saturation of the convergence rate as the regularity of the target function increases.
Score: 8.748563565641279
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This article provides convergence analysis of online stochastic gradient descent algorithms for functional linear models. By adopting characterizations of the slope function regularity, the kernel space capacity, and the capacity of the sampling process covariance operator, we achieve significant improvements in the convergence rates. Both prediction problems and estimation problems are studied, and we show that capacity assumptions can alleviate the saturation of the convergence rate as the regularity of the target function increases. We show that, with a properly selected kernel, capacity assumptions can fully compensate for the regularity assumptions for prediction problems (but not for estimation problems). This demonstrates the significant difference between prediction problems and estimation problems in functional data analysis.

Related papers

Partial Transportability for Domain Generalization [56.37032680901525]
Building on the theory of partial identification and transportability, this paper introduces new results for bounding the value of a functional of the target distribution. Our contribution is to provide the first general estimation technique for transportability problems. We propose a gradient-based optimization scheme for making scalable inferences in practice.
arXiv Detail & Related papers (2025-03-30T22:06:37Z)
Eliminating Ratio Bias for Gradient-based Simulated Parameter Estimation [0.7673339435080445]
This article addresses the challenge of parameter calibration in models where the likelihood function is not analytically available. We propose a gradient-based simulated parameter estimation framework, leveraging a multi-time scale algorithm that tackles the issue of ratio bias in both maximum likelihood estimation and posterior density estimation problems.
arXiv Detail & Related papers (2024-11-20T02:46:15Z)
Error Feedback under $(L_0,L_1)$-Smoothness: Normalization and Momentum [56.37522020675243]
We provide the first proof of convergence for normalized error feedback algorithms across a wide range of machine learning problems. We show that due to their larger allowable stepsizes, our new normalized error feedback algorithms outperform their non-normalized counterparts on various tasks.
arXiv Detail & Related papers (2024-10-22T10:19:27Z)
Distributed Learning with Discretely Observed Functional Data [1.4583059436979549]
This paper combines distributed spectral algorithms with Sobolev kernels to tackle the functional linear regression problem. The hypothesis function spaces of the algorithms are the Sobolev spaces generated by the Sobolev kernels. We derive matching upper and lower bounds for the convergence of the distributed spectral algorithms in the Sobolev norm.
arXiv Detail & Related papers (2024-10-03T10:49:34Z)
Promises and Pitfalls of the Linearized Laplace in Bayesian Optimization [73.80101701431103]
The linearized-Laplace approximation (LLA) has been shown to be effective and efficient in constructing Bayesian neural networks. We study the usefulness of the LLA in Bayesian optimization and highlight its strong performance and flexibility.
arXiv Detail & Related papers (2023-04-17T14:23:43Z)
Distributed Stochastic Optimization under a General Variance Condition [13.911633636387059]
Distributed optimization has drawn great attention recently due to its effectiveness in solving large-scale machine learning problems. We revisit the classical Federated Averaging (FedAvg) algorithm and establish convergence results under only a mild variance condition on the stochastic gradients for smooth nonconvex objective functions. Almost sure convergence to a stationary point is also established under the same condition.
arXiv Detail & Related papers (2023-01-30T05:48:09Z)
Online Regularized Learning Algorithm for Functional Data [2.5382095320488673]
This paper considers an online regularized learning algorithm in reproducing kernel Hilbert spaces. It shows that the convergence rates of both the prediction error and the estimation error with constant step-size are competitive with those in the literature.
arXiv Detail & Related papers (2022-11-24T11:56:10Z)
Statistical Optimality of Divide and Conquer Kernel-based Functional Linear Regression [1.7227952883644062]
This paper studies the convergence performance of divide-and-conquer estimators in the scenario that the target function does not reside in the underlying kernel space. As a decomposition-based scalable approach, the divide-and-conquer estimators of functional linear regression can substantially reduce the algorithmic complexities in time and memory.
arXiv Detail & Related papers (2022-11-20T12:29:06Z)
Data-Driven Influence Functions for Optimization-Based Causal Inference [105.5385525290466]
We study a constructive algorithm that approximates Gateaux derivatives for statistical functionals by finite differencing. We study the case where probability distributions are not known a priori but need to be estimated from data.
arXiv Detail & Related papers (2022-08-29T16:16:22Z)
Experimental Design for Linear Functionals in Reproducing Kernel Hilbert Spaces [102.08678737900541]
We provide algorithms for constructing bias-aware designs for linear functionals. We derive non-asymptotic confidence sets for fixed and adaptive designs under sub-Gaussian noise.
arXiv Detail & Related papers (2022-05-26T20:56:25Z)
Heavy-tailed Streaming Statistical Estimation [58.70341336199497]
We consider the task of heavy-tailed statistical estimation given streaming $p$-dimensional samples. We design a clipped gradient descent algorithm and provide an improved analysis under a more nuanced condition on the noise of the gradients.
arXiv Detail & Related papers (2021-08-25T21:30:27Z)
Differentiable Annealed Importance Sampling and the Perils of Gradient Noise [68.44523807580438]
Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation. Differentiability is a desirable property as it would admit the possibility of optimizing marginal likelihood as an objective. We propose a differentiable algorithm by abandoning Metropolis-Hastings steps, which further unlocks mini-batch computation.
arXiv Detail & Related papers (2021-07-21T17:10:14Z)

This list is automatically generated from the titles and abstracts of the papers on this site.