On the Generalization Properties of Learning the Random Feature Models with Learnable Activation Functions
- URL: http://arxiv.org/abs/2510.15327v1
- Date: Fri, 17 Oct 2025 05:29:10 GMT
- Title: On the Generalization Properties of Learning the Random Feature Models with Learnable Activation Functions
- Authors: Zailin Ma, Jiansheng Yang, Yaodong Yang,
- Abstract summary: We study the generalization properties of a recently proposed kernel method, the Random Feature models with Learnable Activation Functions (RFLAF)<n>By applying a data-dependent sampling scheme for generating features, we provide by far the sharpest bounds on the required number of features for learning RFLAF in both the regression and classification tasks.<n>We show that the weighted RFLAF achieves the same performances with a significantly fewer number of features compared to the plainly sampled RFLAF.
- Score: 13.5257960837278
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper studies the generalization properties of a recently proposed kernel method, the Random Feature models with Learnable Activation Functions (RFLAF). By applying a data-dependent sampling scheme for generating features, we provide by far the sharpest bounds on the required number of features for learning RFLAF in both the regression and classification tasks. We provide a unified theorem that describes the complexity of the feature number $s$, and discuss the results for the plain sampling scheme and the data-dependent leverage weighted scheme. Through weighted sampling, the bound on $s$ in the MSE loss case is improved from $\Omega(1/\epsilon^2)$ to $\tilde{\Omega}((1/\epsilon)^{1/t})$ in general $(t\geq 1)$, and even to $\Omega(1)$ when the Gram matrix has a finite rank. For the Lipschitz loss case, the bound is improved from $\Omega(1/\epsilon^2)$ to $\tilde{\Omega}((1/\epsilon^2)^{1/t})$. To learn the weighted RFLAF, we also propose an algorithm to find an approximate kernel and then apply the leverage weighted sampling. Empirical results show that the weighted RFLAF achieves the same performances with a significantly fewer number of features compared to the plainly sampled RFLAF, validating our theories and the effectiveness of this method.
Related papers
- Theoretical limits of descending $\ell_0$ sparse-regression ML algorithms [0.0]
We develop a generic analytical program for studying performance of the emphmaximum-likelihood (ML) decoding.
Key ML performance parameter, the residual emphroot mean square error ($textbfRMSE$) is uncovered to exhibit the so-called emphphase-transition (PT) phenomenon.
Concrete implementation and practical relevance of the Fl RDT typically rely on the ability to conduct a sizeable set of the underlying numerical evaluations.
arXiv Detail & Related papers (2024-10-10T06:33:41Z) - A Sample Efficient Alternating Minimization-based Algorithm For Robust Phase Retrieval [56.67706781191521]
In this work, we present a robust phase retrieval problem where the task is to recover an unknown signal.
Our proposed oracle avoids the need for computationally spectral descent, using a simple gradient step and outliers.
arXiv Detail & Related papers (2024-09-07T06:37:23Z) - Model-adapted Fourier sampling for generative compressed sensing [7.130302992490975]
We study generative compressed sensing when the measurement matrix is randomly subsampled from a unitary matrix.
We construct a model-adapted sampling strategy with an improved sample complexity of $textitO(kd| boldsymbolalpha|_22)$ measurements.
arXiv Detail & Related papers (2023-10-08T03:13:16Z) - Improved Sample Complexity for Reward-free Reinforcement Learning under
Low-rank MDPs [43.53286390357673]
This paper focuses on reward-free reinforcement learning under low-rank MDP models.
We first provide the first known sample complexity lower bound for any algorithm under low-rank MDPs.
We then propose a novel model-based algorithm, coined RAFFLE, and show it can both find an $epsilon$-optimal policy and achieve an $epsilon$-accurate system identification.
arXiv Detail & Related papers (2023-03-20T04:39:39Z) - Approximate Function Evaluation via Multi-Armed Bandits [51.146684847667125]
We study the problem of estimating the value of a known smooth function $f$ at an unknown point $boldsymbolmu in mathbbRn$, where each component $mu_i$ can be sampled via a noisy oracle.
We design an instance-adaptive algorithm that learns to sample according to the importance of each coordinate, and with probability at least $1-delta$ returns an $epsilon$ accurate estimate of $f(boldsymbolmu)$.
arXiv Detail & Related papers (2022-03-18T18:50:52Z) - Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free
Reinforcement Learning [52.76230802067506]
A novel model-free algorithm is proposed to minimize regret in episodic reinforcement learning.
The proposed algorithm employs an em early-settled reference update rule, with the aid of two Q-learning sequences.
The design principle of our early-settled variance reduction method might be of independent interest to other RL settings.
arXiv Detail & Related papers (2021-10-09T21:13:48Z) - Sharp Analysis of Random Fourier Features in Classification [9.383533125404755]
We show for the first time that random Fourier features classification can achieve $O(sqrtn)$ learning rate with only $Omega(sqrtn log n)$ features.
arXiv Detail & Related papers (2021-09-22T09:49:27Z) - Towards Sample-Optimal Compressive Phase Retrieval with Sparse and
Generative Priors [59.33977545294148]
We show that $O(k log L)$ samples suffice to guarantee that the signal is close to any vector that minimizes an amplitude-based empirical loss function.
We adapt this result to sparse phase retrieval, and show that $O(s log n)$ samples are sufficient for a similar guarantee when the underlying signal is $s$-sparse and $n$-dimensional.
arXiv Detail & Related papers (2021-06-29T12:49:54Z) - Randomized Exploration for Reinforcement Learning with General Value
Function Approximation [122.70803181751135]
We propose a model-free reinforcement learning algorithm inspired by the popular randomized least squares value iteration (RLSVI) algorithm.
Our algorithm drives exploration by simply perturbing the training data with judiciously chosen i.i.d. scalar noises.
We complement the theory with an empirical evaluation across known difficult exploration tasks.
arXiv Detail & Related papers (2021-06-15T02:23:07Z) - Tightening the Dependence on Horizon in the Sample Complexity of
Q-Learning [59.71676469100807]
This work sharpens the sample complexity of synchronous Q-learning to an order of $frac|mathcalS|| (1-gamma)4varepsilon2$ for any $0varepsilon 1$.
Our finding unveils the effectiveness of vanilla Q-learning, which matches that of speedy Q-learning without requiring extra computation and storage.
arXiv Detail & Related papers (2021-02-12T14:22:05Z) - Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal
Sample Complexity [67.02490430380415]
We show that model-based MARL achieves a sample complexity of $tilde O(|S||B|(gamma)-3epsilon-2)$ for finding the Nash equilibrium (NE) value up to some $epsilon$ error.
We also show that such a sample bound is minimax-optimal (up to logarithmic factors) if the algorithm is reward-agnostic, where the algorithm queries state transition samples without reward knowledge.
arXiv Detail & Related papers (2020-07-15T03:25:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.