Optimal Activation Functions for the Random Features Regression Model
- URL: http://arxiv.org/abs/2206.01332v3
- Date: Fri, 24 Mar 2023 18:42:09 GMT
- Title: Optimal Activation Functions for the Random Features Regression Model
- Authors: Jianxin Wang and José Bento
- Abstract summary: We identify in closed form the family of Activation Functions (AFs) that minimize a combination of the test error and sensitivity of the Random Features Regression model.
We show how using optimal AFs impacts well-established properties of the RFR model.
- Score: 7.381113319198103
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The asymptotic mean squared test error and sensitivity of the Random Features
Regression model (RFR) have been recently studied. We build on this work and
identify in closed form the family of Activation Functions (AFs) that minimize
a combination of the test error and sensitivity of the RFR under different
notions of functional parsimony. We find scenarios under which the optimal AFs
are linear, saturated linear functions, or expressible in terms of Hermite
polynomials. Finally, we show how using optimal AFs impacts well-established
properties of the RFR model, such as its double descent curve, and the
dependency of its optimal regularization parameter on the observation noise
level.
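For readers who want to ground the setup, here is a minimal random-features regression sketch in Python. All dimensions, the tanh activation, and the ridge level are illustrative assumptions rather than values from the paper; the paper's question is which activation below is optimal.

```python
import numpy as np

rng = np.random.default_rng(0)
d, N, n = 50, 200, 400                      # input dim, random features, samples (assumed)
beta = rng.standard_normal(d) / np.sqrt(d)  # linear teacher (assumed)
W = rng.standard_normal((N, d))             # fixed random first-layer weights

def features(X, act=np.tanh):               # the paper asks which 'act' is optimal
    return act(X @ W.T / np.sqrt(d))

X_tr = rng.standard_normal((n, d))
y_tr = X_tr @ beta + 0.1 * rng.standard_normal(n)

lam = 1e-2                                  # ridge regularization (assumed)
Z = features(X_tr)
a = np.linalg.solve(Z.T @ Z + lam * np.eye(N), Z.T @ y_tr)

X_te = rng.standard_normal((2000, d))
print("test MSE:", np.mean((features(X_te) @ a - X_te @ beta) ** 2))
```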
Related papers
- FuncGenFoil: Airfoil Generation and Editing Model in Function Space [63.274584650021744]
We introduce FuncGenFoil, a novel function-space generative model that directly reconstructs airfoil geometries as function curves.
Empirical evaluations demonstrate that FuncGenFoil improves upon state-of-the-art methods in airfoil generation.
arXiv Detail & Related papers (2025-02-15T07:56:58Z) - Learning Expressive Random Feature Models via Parametrized Activations [10.908603300691064]
We introduce the Random Feature Model with Learnable Activation Functions (RFLAF).
RFLAF enhances model expressivity by parameterizing activation functions as weighted sums of basis functions.
Our work provides a deeper understanding of the role of learnable activation functions within modern neural network architectures.
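A rough sketch of the parameterized-activation idea (the Gaussian-bump basis and all names here are assumptions, not RFLAF's actual construction):

```python
import numpy as np

rng = np.random.default_rng(1)
centers = np.linspace(-3, 3, 9)              # basis-function centers (assumed)

def act(z, c):
    # Activation as a weighted sum of basis functions: sum_k c_k * exp(-(z - mu_k)^2)
    return np.exp(-(z[..., None] - centers) ** 2) @ c

c = 0.1 * rng.standard_normal(len(centers))  # learnable activation coefficients
print(act(np.linspace(-4, 4, 5), c))         # the activation evaluated pointwise
```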
arXiv Detail & Related papers (2024-11-29T04:38:12Z) - Functional Partial Least-Squares: Adaptive Estimation and Inference [0.0]
We show that the functional partial least squares (PLS) estimator attains nearly minimax-optimal convergence rates over a class of ellipsoids.
We apply our methodology to evaluate the nonlinear effects of temperature on corn and soybean yields.
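For orientation only, ordinary (non-functional) PLS is available in scikit-learn; the paper's estimator is a functional analogue with an adaptive stopping rule, which this sketch does not implement:

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(2)
X = rng.standard_normal((200, 30))           # stand-in for discretized functional data
y = X[:, :3].sum(axis=1) + 0.1 * rng.standard_normal(200)

pls = PLSRegression(n_components=3)          # the number of components acts as the
pls.fit(X, y)                                # regularization / stopping parameter
print("in-sample R^2:", pls.score(X, y))
```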
arXiv Detail & Related papers (2024-02-16T23:47:47Z) - Optimal Nonlinearities Improve Generalization Performance of Random
Features [0.9790236766474201]
A random feature model with a nonlinear activation function has been shown to perform equivalently to a Gaussian model in terms of training and generalization errors.
We show that the parameters acquired from the Gaussian model enable us to define a set of optimal nonlinearities.
Our numerical results validate that the optimized nonlinearities achieve better generalization performance than widely-used nonlinear functions such as ReLU.
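In this line of work, a nonlinearity is often summarized by its Gaussian moments. A sketch (my notation, not necessarily the paper's) computing mu_0 = E[sigma(Z)], mu_1 = E[Z sigma(Z)], and the purely nonlinear variance for Z ~ N(0,1) via Gauss-Hermite quadrature:

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

x, w = hermegauss(80)                        # nodes/weights for weight exp(-x^2/2)
w = w / np.sqrt(2 * np.pi)                   # normalize to N(0,1) expectations

def gauss_moments(sigma):
    mu0 = np.sum(w * sigma(x))               # E[sigma(Z)]
    mu1 = np.sum(w * x * sigma(x))           # E[Z sigma(Z)], the linear component
    var = np.sum(w * sigma(x) ** 2) - mu0 ** 2
    return mu0, mu1, var - mu1 ** 2          # last term: purely nonlinear variance

print(gauss_moments(lambda z: np.maximum(z, 0.0)))  # ReLU: (~0.399, 0.5, ~0.091)
```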
arXiv Detail & Related papers (2023-09-28T20:55:21Z) - Nonparametric estimation of a covariate-adjusted counterfactual
treatment regimen response curve [2.7446241148152253]
Flexible estimation of the mean outcome under a treatment regimen is a key step toward personalized medicine.
We propose an inverse probability weighted nonparametrically efficient estimator of the smoothed regimen-response curve function.
Some finite-sample properties are explored with simulations.
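A toy illustration of plain inverse probability weighting (not the paper's efficient smoothed estimator; the data and propensity model are simulated assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5000
X = rng.standard_normal(n)                   # covariate
p = 1 / (1 + np.exp(-X))                     # true propensity P(A=1 | X)
A = rng.binomial(1, p)                       # treatment indicator
Y = A + X + rng.standard_normal(n)           # outcome; E[Y(1)] = 1

ey1 = np.mean(A * Y / p)                     # weight treated outcomes by 1/propensity
print("IPW estimate of E[Y(1)]:", round(ey1, 3))
```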
arXiv Detail & Related papers (2023-09-28T01:46:24Z) - Promises and Pitfalls of the Linearized Laplace in Bayesian Optimization [73.80101701431103]
The linearized-Laplace approximation (LLA) has been shown to be effective and efficient in constructing Bayesian neural networks.
We study the usefulness of the LLA in Bayesian optimization and highlight its strong performance and flexibility.
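In spirit, a heavily simplified last-layer variant of the idea: treat the final layer as Bayesian linear regression on fixed features and read off a Gaussian predictive variance for the acquisition function. Sizes, noise level, and prior are assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)
Phi = rng.standard_normal((100, 10))          # penultimate-layer features (stand-in)
y = Phi @ rng.standard_normal(10) + 0.1 * rng.standard_normal(100)

sigma2, tau2 = 0.1 ** 2, 1.0                  # noise and prior variances (assumed)
H = Phi.T @ Phi / sigma2 + np.eye(10) / tau2  # posterior precision (Laplace Hessian)
Sigma = np.linalg.inv(H)                      # Gaussian posterior covariance
phi_new = rng.standard_normal(10)
print("predictive variance:", sigma2 + phi_new @ Sigma @ phi_new)
```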
arXiv Detail & Related papers (2023-04-17T14:23:43Z) - Kernel-based off-policy estimation without overlap: Instance optimality
beyond semiparametric efficiency [53.90687548731265]
We study optimal procedures for estimating a linear functional based on observational data.
For any convex and symmetric function class $\mathcal{F}$, we derive a non-asymptotic local minimax bound on the mean-squared error.
arXiv Detail & Related papers (2023-01-16T02:57:37Z) - On High dimensional Poisson models with measurement error: hypothesis
testing for nonlinear nonconvex optimization [13.369004892264146]
We study estimation and hypothesis testing for high-dimensional Poisson regression models with measurement error, which have wide applications in data analysis.
We propose to estimate the regression parameters by minimizing a penalized loss.
The proposed method is applied to data from the Alzheimer's Disease Neuroimaging Initiative.
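A generic sketch of L1-penalized Poisson regression via proximal gradient descent with soft-thresholding (the step size and penalty level are assumptions, and this is not the paper's estimator):

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 500, 20
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = 0.5                           # sparse truth (assumed)
y = rng.poisson(np.exp(X @ beta_true))

beta, step, lam = np.zeros(p), 0.01, 0.05     # step size and penalty (assumed)
for _ in range(5000):
    grad = X.T @ (np.exp(X @ beta) - y) / n   # gradient of the Poisson NLL
    z = beta - step * grad
    beta = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0)  # soft-threshold

print(np.round(beta, 2))                      # first three entries near 0.5
```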
arXiv Detail & Related papers (2022-12-31T06:58:42Z) - Adaptive LASSO estimation for functional hidden dynamic geostatistical
model [69.10717733870575]
We propose a novel model selection algorithm based on a penalized maximum likelihood estimator (PMLE) for functional hidden dynamic geostatistical models (f-HDGM).
The algorithm is based on iterative optimisation and uses an adaptive least absolute shrinkage and selection operator (LASSO) penalty function, wherein the weights are obtained from the unpenalised f-HDGM maximum-likelihood estimators.
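The two-step adaptive-LASSO weighting idea, in a generic (non-functional, non-spatial) sketch rather than the paper's f-HDGM algorithm:

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

rng = np.random.default_rng(6)
X = rng.standard_normal((300, 10))
beta = np.array([2.0, -1.5, 0, 0, 0, 0, 0, 0, 0, 0])
y = X @ beta + 0.5 * rng.standard_normal(300)

ols = LinearRegression().fit(X, y)            # step 1: unpenalised estimate
wts = 1.0 / np.abs(ols.coef_)                 # adaptive weights from step 1
las = Lasso(alpha=0.1).fit(X / wts, y)        # step 2: weighted L1 via column rescaling
print(np.round(las.coef_ / wts, 2))           # undo the rescaling; zeros stay zero
```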
arXiv Detail & Related papers (2022-08-10T19:17:45Z) - On the Double Descent of Random Features Models Trained with SGD [78.0918823643911]
We study properties of random features (RF) regression in high dimensions optimized by stochastic gradient descent (SGD).
We derive precise non-asymptotic error bounds for RF regression under both constant and adaptive step-size SGD settings.
We observe the double descent phenomenon both theoretically and empirically.
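A bare-bones constant-step-size SGD loop for RF regression; every size and the step size are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(7)
d, N, n = 20, 100, 300
W = rng.standard_normal((N, d)) / np.sqrt(d)
X = rng.standard_normal((n, d))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(n)
Z = np.maximum(X @ W.T, 0)                    # fixed ReLU random features

a, eta = np.zeros(N), 0.01                    # constant step size (assumed)
for _ in range(20000):
    i = rng.integers(n)                       # one uniformly sampled data point
    a -= eta * (Z[i] @ a - y[i]) * Z[i]       # SGD step on the squared loss

print("train MSE:", np.mean((Z @ a - y) ** 2))
```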
arXiv Detail & Related papers (2021-10-13T17:47:39Z) - Stochastic Optimization of Areas Under Precision-Recall Curves with
Provable Convergence [66.83161885378192]
Areas under the ROC curve (AUROC) and the precision-recall curve (AUPRC) are common metrics for evaluating classification performance on imbalanced problems.
We propose a stochastic method with provable convergence for optimizing AUPRC in deep learning.
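For reference, the target metric can be computed with scikit-learn; the paper's contribution is a stochastic surrogate objective for optimizing AUPRC, which this snippet does not implement:

```python
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score

rng = np.random.default_rng(8)
y_true = rng.binomial(1, 0.05, size=2000)     # imbalanced labels: ~5% positives
scores = y_true + 0.8 * rng.standard_normal(2000)

print("AUROC:", roc_auc_score(y_true, scores))
print("AUPRC:", average_precision_score(y_true, scores))  # much lower under imbalance
```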
arXiv Detail & Related papers (2021-04-18T06:22:21Z) - Support estimation in high-dimensional heteroscedastic mean regression [2.28438857884398]
We consider a linear mean regression model with random design and potentially heteroscedastic, heavy-tailed errors.
We use a strictly convex, smooth variant of the Huber loss function with tuning parameter depending on the parameters of the problem.
For the resulting estimator we show sign-consistency and optimal rates of convergence in the $\ell_\infty$ norm.
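One strictly convex, smooth Huber-type loss is the pseudo-Huber loss, shown here as a generic example (the paper uses its own variant with a problem-dependent tuning parameter):

```python
import numpy as np

def pseudo_huber(r, delta):
    # Smooth and strictly convex: ~ r^2/2 near zero, ~ delta*|r| in the tails
    return delta ** 2 * (np.sqrt(1 + (r / delta) ** 2) - 1)

print(np.round(pseudo_huber(np.linspace(-5, 5, 11), delta=1.0), 3))
```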
arXiv Detail & Related papers (2020-11-03T09:46:31Z) - SLEIPNIR: Deterministic and Provably Accurate Feature Expansion for
Gaussian Process Regression with Derivatives [86.01677297601624]
We propose a novel approach for scaling GP regression with derivatives based on quadrature Fourier features.
We prove deterministic, non-asymptotic and exponentially fast decaying error bounds which apply for both the approximated kernel as well as the approximated posterior.
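For contrast, the classical Monte Carlo random Fourier feature approximation of the RBF kernel; the paper replaces the random frequencies below with deterministic quadrature nodes:

```python
import numpy as np

rng = np.random.default_rng(9)
d, m, ell = 3, 500, 1.0                       # input dim, features, lengthscale (assumed)
Omega = rng.standard_normal((m, d)) / ell     # frequencies from the RBF spectral density
b = rng.uniform(0, 2 * np.pi, m)

def phi(X):
    return np.sqrt(2.0 / m) * np.cos(X @ Omega.T + b)

x1, x2 = rng.standard_normal(d), rng.standard_normal(d)
exact = np.exp(-np.sum((x1 - x2) ** 2) / (2 * ell ** 2))
approx = (phi(x1[None]) @ phi(x2[None]).T).item()
print("exact kernel:", round(exact, 3), "RFF approx:", round(approx, 3))
```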
arXiv Detail & Related papers (2020-03-05T14:33:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.