Fractional ridge regression: a fast, interpretable reparameterization of
ridge regression
- URL: http://arxiv.org/abs/2005.03220v1
- Date: Thu, 7 May 2020 03:12:23 GMT
- Title: Fractional ridge regression: a fast, interpretable reparameterization of
ridge regression
- Authors: Ariel Rokem, Kendrick Kay
- Abstract summary: Ridge regression (RR) is a regularization technique that penalizes the L2-norm of the coefficients in linear regression.
We provide an algorithm to solve FRR, as well as open-source software implementations in Python and MATLAB.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ridge regression (RR) is a regularization technique that penalizes the
L2-norm of the coefficients in linear regression. One of the challenges of
using RR is the need to set a hyperparameter ($\alpha$) that controls the
amount of regularization. Cross-validation is typically used to select the best
$\alpha$ from a set of candidates. However, efficient and appropriate selection
of $\alpha$ can be challenging, particularly where large amounts of data are
analyzed. Because the selected $\alpha$ depends on the scale of the data and
predictors, it is not straightforwardly interpretable. Here, we propose to
reparameterize RR in terms of the ratio $\gamma$ between the L2-norms of the
regularized and unregularized coefficients. This approach, called fractional RR
(FRR), has several benefits: the solutions obtained for different $\gamma$ are
guaranteed to vary, guarding against wasted calculations, and automatically
span the relevant range of regularization, avoiding the need for arduous manual
exploration. We provide an algorithm to solve FRR, as well as open-source
software implementations in Python and MATLAB
(https://github.com/nrdg/fracridge). We show that the proposed method is fast
and scalable for large-scale data problems, and delivers results that are
straightforward to interpret and compare across models and datasets.
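As a concrete illustration of the quantity being reparameterized, the sketch below uses synthetic data and the plain closed-form ridge solution (not the paper's SVD-based algorithm) to compute $\gamma$, the ratio of the L2-norms of the regularized and unregularized coefficients, over a grid of $\alpha$ values:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 10
X = rng.standard_normal((n, p))
y = X @ rng.standard_normal(p) + 0.5 * rng.standard_normal(n)

def ridge(X, y, alpha):
    """Closed-form ridge solution: (X'X + alpha*I)^{-1} X'y."""
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)

# Unregularized (OLS) solution: alpha = 0
beta_ols = ridge(X, y, 0.0)

# gamma = ||beta_ridge|| / ||beta_ols|| shrinks from 1 toward 0 as alpha grows
for alpha in [0.0, 1.0, 10.0, 100.0, 1000.0]:
    gamma = np.linalg.norm(ridge(X, y, alpha)) / np.linalg.norm(beta_ols)
    print(f"alpha={alpha:8.1f}  gamma={gamma:.3f}")
```

As $\alpha$ grows, $\gamma$ falls monotonically from 1 toward 0; FRR inverts this relationship, asking for the $\alpha$ that achieves a requested $\gamma$. The `fracridge` package linked above implements that inversion efficiently.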
Related papers
- Highly Adaptive Ridge [84.38107748875144]
We propose a regression method that achieves an $n^{-2/3}$ dimension-free L2 convergence rate in the class of right-continuous functions with square-integrable sectional derivatives.
HAR is exactly kernel ridge regression with a specific data-adaptive kernel based on a saturated zero-order tensor-product spline basis expansion.
We demonstrate empirical performance better than state-of-the-art algorithms for small datasets in particular.
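For readers unfamiliar with the reduction mentioned here, a minimal kernel ridge regression sketch follows. The RBF kernel below is a stand-in for illustration only, not the saturated spline-based kernel that HAR actually uses:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.uniform(-1, 1, size=(40, 1))
y = np.sin(3 * X[:, 0]) + 0.1 * rng.standard_normal(40)

def rbf_kernel(A, B, length_scale=0.5):
    """Gaussian (RBF) kernel matrix between row sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * length_scale ** 2))

def kernel_ridge_fit(X, y, kernel, alpha=0.1):
    """Solve (K + alpha*I) c = y; predictions are kernel(X_new, X) @ c."""
    K = kernel(X, X)
    return np.linalg.solve(K + alpha * np.eye(len(y)), y)

c = kernel_ridge_fit(X, y, rbf_kernel)
y_hat = rbf_kernel(X, X) @ c  # in-sample predictions
```

Swapping in a different kernel function changes the hypothesis class but not the solver, which is what makes statements like "HAR is exactly kernel ridge regression with kernel K" meaningful.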
arXiv Detail & Related papers (2024-10-03T17:06:06Z)
- LFFR: Logistic Function For (single-output) Regression [0.0]
We implement privacy-preserving regression training using data encrypted under a fully homomorphic encryption scheme.
We develop a novel and efficient algorithm called LFFR for homomorphic regression using the logistic function.
arXiv Detail & Related papers (2024-07-13T17:33:49Z)
- Robust Reinforcement Learning from Corrupted Human Feedback [86.17030012828003]
Reinforcement learning from human feedback (RLHF) provides a principled framework for aligning AI systems with human preference data.
We propose a robust RLHF approach -- $R^3M$, which models the potentially corrupted preference label as sparse outliers.
Our experiments on robotic control and natural language generation with large language models (LLMs) show that $R^3M$ improves robustness of the reward against several types of perturbations to the preference data.
arXiv Detail & Related papers (2024-06-21T18:06:30Z)
- Scaling Laws in Linear Regression: Compute, Parameters, and Data [86.48154162485712]
We study the theory of scaling laws in an infinite dimensional linear regression setup.
We show that the reducible part of the test error is $\Theta(M^{-(a-1)} + N^{-(a-1)/a})$, where $M$ and $N$ denote the model size and data size.
Our theory is consistent with the empirical neural scaling laws and verified by numerical simulation.
arXiv Detail & Related papers (2024-06-12T17:53:29Z)
- Bayes beats Cross Validation: Efficient and Accurate Ridge Regression via Expectation Maximization [3.061662434597098]
We present a method for setting the regularization hyperparameter, $\lambda$, that is faster to compute than leave-one-out cross-validation (LOOCV).
We show that the proposed method is guaranteed to find a unique optimal solution for large enough $n$, under relatively mild conditions.
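For context on the LOOCV baseline: for ridge regression, the leave-one-out residuals have an exact closed form via the hat matrix, so LOOCV itself does not require $n$ refits. A minimal sketch on synthetic data (illustration only, not the paper's EM method):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 60, 5
X = rng.standard_normal((n, p))
y = X @ rng.standard_normal(p) + 0.3 * rng.standard_normal(n)

def loocv_ridge(X, y, alpha):
    """Closed-form LOOCV error for ridge: e_i / (1 - H_ii), no refitting."""
    H = X @ np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T)
    resid = y - H @ y
    return np.mean((resid / (1.0 - np.diag(H))) ** 2)

def loocv_brute(X, y, alpha):
    """Brute-force LOOCV for comparison: refit n times, one row held out."""
    errs = []
    for i in range(len(y)):
        mask = np.arange(len(y)) != i
        b = np.linalg.solve(X[mask].T @ X[mask] + alpha * np.eye(X.shape[1]),
                            X[mask].T @ y[mask])
        errs.append((y[i] - X[i] @ b) ** 2)
    return np.mean(errs)
```

Even with this shortcut, LOOCV still requires evaluating a grid of candidate $\lambda$ values, which is the cost the EM approach above avoids.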
arXiv Detail & Related papers (2023-10-29T01:13:55Z)
- Hardness and Algorithms for Robust and Sparse Optimization [17.842787715567436]
We explore algorithms and limitations for sparse optimization problems such as sparse linear regression and robust linear regression.
Specifically, the sparse linear regression problem seeks a $k$-sparse vector $x \in \mathbb{R}^d$ to minimize $\|Ax-b\|$.
The robust linear regression problem seeks a set $S$ that ignores at most $k$ rows and a vector $x$ to minimize $\|(Ax-b)_S\|$.
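To make the problem statement concrete, here is a brute-force sketch that solves sparse linear regression exactly by enumerating all size-$k$ supports; its exponential cost in $k$ is precisely why hardness results for this problem matter. The data below are synthetic:

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)
n, d, k = 30, 8, 2
A = rng.standard_normal((n, d))
# Planted 2-sparse signal on columns 1 and 4, plus a little noise
b = A[:, [1, 4]] @ np.array([2.0, -3.0]) + 0.1 * rng.standard_normal(n)

def sparse_regression_bruteforce(A, b, k):
    """Exact k-sparse least squares by trying every size-k support set."""
    best_err, best_x = np.inf, None
    d = A.shape[1]
    for support in itertools.combinations(range(d), k):
        cols = list(support)
        coef, *_ = np.linalg.lstsq(A[:, cols], b, rcond=None)
        err = np.linalg.norm(A[:, cols] @ coef - b)
        if err < best_err:
            x = np.zeros(d)
            x[cols] = coef
            best_err, best_x = err, x
    return best_x, best_err

x_hat, err = sparse_regression_bruteforce(A, b, k)
```

The enumeration visits all $\binom{d}{k}$ supports, so it is only feasible for tiny $d$ and $k$; practical algorithms trade this exactness for tractability.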
arXiv Detail & Related papers (2022-06-29T01:40:38Z)
- Efficient and robust high-dimensional sparse logistic regression via nonlinear primal-dual hybrid gradient algorithms [0.0]
We propose an iterative algorithm that provably computes a solution to a logistic regression problem regularized by an elastic net penalty.
This result improves on the known complexity bound of $O(\min(m^2 n, mn^2)\log(1/\epsilon))$ for first-order optimization methods.
arXiv Detail & Related papers (2021-11-30T14:16:48Z)
- A Hypergradient Approach to Robust Regression without Correspondence [85.49775273716503]
We consider a variant of regression problem, where the correspondence between input and output data is not available.
Most existing methods are only applicable when the sample size is small.
We propose a new computational framework -- ROBOT -- for the shuffled regression problem.
arXiv Detail & Related papers (2020-11-30T21:47:38Z)
- Conditional Uncorrelation and Efficient Non-approximate Subset Selection in Sparse Regression [72.84177488527398]
We consider sparse regression from the view of correlation, and propose the formula of conditional uncorrelation.
By the proposed method, the computational complexity is reduced from $O(\frac{1}{6}k^3 + mk^2 + mkd)$ to $O(\frac{1}{6}k^3 + \frac{1}{2}mk^2)$ for each candidate subset in sparse regression.
arXiv Detail & Related papers (2020-09-08T20:32:26Z)
- Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is L4-L2 hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number and $\epsilon$ has bounded variance, and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is
arXiv Detail & Related papers (2020-07-16T06:44:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.