Fractional ridge regression: a fast, interpretable reparameterization of
ridge regression
- URL: http://arxiv.org/abs/2005.03220v1
- Date: Thu, 7 May 2020 03:12:23 GMT
- Title: Fractional ridge regression: a fast, interpretable reparameterization of
ridge regression
- Authors: Ariel Rokem, Kendrick Kay
- Abstract summary: Ridge regression (RR) is a regularization technique that penalizes the L2-norm of the coefficients in linear regression.
We provide an algorithm to solve FRR, as well as open-source software implementations in Python and MATLAB.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ridge regression (RR) is a regularization technique that penalizes the
L2-norm of the coefficients in linear regression. One of the challenges of
using RR is the need to set a hyperparameter ($\alpha$) that controls the
amount of regularization. Cross-validation is typically used to select the best
$\alpha$ from a set of candidates. However, efficient and appropriate selection
of $\alpha$ can be challenging, particularly where large amounts of data are
analyzed. Because the selected $\alpha$ depends on the scale of the data and
predictors, it is not straightforwardly interpretable. Here, we propose to
reparameterize RR in terms of the ratio $\gamma$ between the L2-norms of the
regularized and unregularized coefficients. This approach, called fractional RR
(FRR), has several benefits: the solutions obtained for different $\gamma$ are
guaranteed to vary, guarding against wasted calculations, and automatically
span the relevant range of regularization, avoiding the need for arduous manual
exploration. We provide an algorithm to solve FRR, as well as open-source
software implementations in Python and MATLAB
(https://github.com/nrdg/fracridge). We show that the proposed method is fast
and scalable for large-scale data problems, and delivers results that are
straightforward to interpret and compare across models and datasets.
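As a concrete illustration of the quantity being reparameterized, the sketch below uses synthetic data and the plain closed-form ridge solution (not the paper's SVD-based algorithm) to compute $\gamma$, the ratio of the L2-norms of the regularized and unregularized coefficients, over a grid of $\alpha$ values:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 10
X = rng.standard_normal((n, p))
y = X @ rng.standard_normal(p) + 0.5 * rng.standard_normal(n)

def ridge(X, y, alpha):
    """Closed-form ridge solution: (X'X + alpha*I)^{-1} X'y."""
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)

# Unregularized (OLS) solution: alpha = 0
beta_ols = ridge(X, y, 0.0)

# gamma = ||beta_ridge|| / ||beta_ols|| shrinks from 1 toward 0 as alpha grows
for alpha in [0.0, 1.0, 10.0, 100.0, 1000.0]:
    gamma = np.linalg.norm(ridge(X, y, alpha)) / np.linalg.norm(beta_ols)
    print(f"alpha={alpha:8.1f}  gamma={gamma:.3f}")
```

As $\alpha$ grows, $\gamma$ falls monotonically from 1 toward 0; FRR inverts this relationship, asking for the $\alpha$ that achieves a requested $\gamma$. The `fracridge` package linked above implements that inversion efficiently.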
Related papers
- Highly Adaptive Ridge [84.38107748875144]
We propose a regression method that achieves an $n^{-2/3}$ dimension-free L2 convergence rate in the class of right-continuous functions with square-integrable sectional derivatives.
HAR is exactly kernel ridge regression with a specific data-adaptive kernel based on a saturated zero-order tensor-product spline basis expansion.
We demonstrate empirical performance better than state-of-the-art algorithms for small datasets in particular.
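For readers unfamiliar with the reduction mentioned here, a minimal kernel ridge regression sketch follows. The RBF kernel below is a stand-in for illustration only, not the saturated spline-based kernel that HAR actually uses:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.uniform(-1, 1, size=(40, 1))
y = np.sin(3 * X[:, 0]) + 0.1 * rng.standard_normal(40)

def rbf_kernel(A, B, length_scale=0.5):
    """Gaussian (RBF) kernel matrix between row sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * length_scale ** 2))

def kernel_ridge_fit(X, y, kernel, alpha=0.1):
    """Solve (K + alpha*I) c = y; predictions are kernel(X_new, X) @ c."""
    K = kernel(X, X)
    return np.linalg.solve(K + alpha * np.eye(len(y)), y)

c = kernel_ridge_fit(X, y, rbf_kernel)
y_hat = rbf_kernel(X, X) @ c  # in-sample predictions
```

Swapping in a different kernel function changes the hypothesis class but not the solver, which is what makes statements like "HAR is exactly kernel ridge regression with kernel K" meaningful.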
arXiv Detail & Related papers (2024-10-03T17:06:06Z)
- LFFR: Logistic Function For (single-output) Regression [0.0]
We implement privacy-preserving regression training using data encrypted under a fully homomorphic encryption scheme.
We develop a novel and efficient algorithm called LFFR for homomorphic regression using the logistic function.
arXiv Detail & Related papers (2024-07-13T17:33:49Z)
- Robust Reinforcement Learning from Corrupted Human Feedback [86.17030012828003]
Reinforcement learning from human feedback (RLHF) provides a principled framework for aligning AI systems with human preference data.
We propose a robust RLHF approach -- $R^3M$, which models the potentially corrupted preference label as sparse outliers.
Our experiments on robotic control and natural language generation with large language models (LLMs) show that $R^3M$ improves robustness of the reward against several types of perturbations to the preference data.
arXiv Detail & Related papers (2024-06-21T18:06:30Z)
- Scaling Laws in Linear Regression: Compute, Parameters, and Data [86.48154162485712]
We study the theory of scaling laws in an infinite dimensional linear regression setup.
We show that the reducible part of the test error is $\Theta(M^{-(a-1)} + N^{-(a-1)/a})$, where $M$ and $N$ denote the model size and data size.
Our theory is consistent with the empirical neural scaling laws and verified by numerical simulation.
arXiv Detail & Related papers (2024-06-12T17:53:29Z)
- Bayes beats Cross Validation: Efficient and Accurate Ridge Regression via Expectation Maximization [3.061662434597098]
We present a method for setting the regularization hyperparameter, $\lambda$, that is faster to compute than leave-one-out cross-validation (LOOCV).
We show that the proposed method is guaranteed to find a unique optimal solution for large enough $n$, under relatively mild conditions.
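For context on the LOOCV baseline: for ridge regression, the leave-one-out residuals have an exact closed form via the hat matrix, so LOOCV itself does not require $n$ refits. A minimal sketch on synthetic data (illustration only, not the paper's EM method):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 60, 5
X = rng.standard_normal((n, p))
y = X @ rng.standard_normal(p) + 0.3 * rng.standard_normal(n)

def loocv_ridge(X, y, alpha):
    """Closed-form LOOCV error for ridge: e_i / (1 - H_ii), no refitting."""
    H = X @ np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T)
    resid = y - H @ y
    return np.mean((resid / (1.0 - np.diag(H))) ** 2)

def loocv_brute(X, y, alpha):
    """Brute-force LOOCV for comparison: refit n times, one row held out."""
    errs = []
    for i in range(len(y)):
        mask = np.arange(len(y)) != i
        b = np.linalg.solve(X[mask].T @ X[mask] + alpha * np.eye(X.shape[1]),
                            X[mask].T @ y[mask])
        errs.append((y[i] - X[i] @ b) ** 2)
    return np.mean(errs)
```

Even with this shortcut, LOOCV still requires evaluating a grid of candidate $\lambda$ values, which is the cost the EM approach above avoids.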
arXiv Detail & Related papers (2023-10-29T01:13:55Z)
- Hardness and Algorithms for Robust and Sparse Optimization [17.842787715567436]
We explore algorithms and limitations for sparse optimization problems such as sparse linear regression and robust linear regression.
Specifically, the sparse linear regression problem seeks a $k$-sparse vector $x \in \mathbb{R}^d$ to minimize $\|Ax-b\|$.
The robust linear regression problem seeks a set $S$ that ignores at most $k$ rows and a vector $x$ to minimize $\|(Ax-b)_S\|$.
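To make the problem statement concrete, here is a brute-force sketch that solves sparse linear regression exactly by enumerating all size-$k$ supports; its exponential cost in $k$ is precisely why hardness results for this problem matter. The data below are synthetic:

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)
n, d, k = 30, 8, 2
A = rng.standard_normal((n, d))
# Planted 2-sparse signal on columns 1 and 4, plus a little noise
b = A[:, [1, 4]] @ np.array([2.0, -3.0]) + 0.1 * rng.standard_normal(n)

def sparse_regression_bruteforce(A, b, k):
    """Exact k-sparse least squares by trying every size-k support set."""
    best_err, best_x = np.inf, None
    d = A.shape[1]
    for support in itertools.combinations(range(d), k):
        cols = list(support)
        coef, *_ = np.linalg.lstsq(A[:, cols], b, rcond=None)
        err = np.linalg.norm(A[:, cols] @ coef - b)
        if err < best_err:
            x = np.zeros(d)
            x[cols] = coef
            best_err, best_x = err, x
    return best_x, best_err

x_hat, err = sparse_regression_bruteforce(A, b, k)
```

The enumeration visits all $\binom{d}{k}$ supports, so it is only feasible for tiny $d$ and $k$; practical algorithms trade this exactness for tractability.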
arXiv Detail & Related papers (2022-06-29T01:40:38Z)
- Efficient and robust high-dimensional sparse logistic regression via nonlinear primal-dual hybrid gradient algorithms [0.0]
We propose an iterative algorithm that provably computes a solution to a logistic regression problem regularized by an elastic net penalty.
This result improves on the known complexity bound of $O(\min(m^2 n, mn^2)\log(1/\epsilon))$ for first-order optimization methods.
arXiv Detail & Related papers (2021-11-30T14:16:48Z)
- A Hypergradient Approach to Robust Regression without Correspondence [85.49775273716503]
We consider a variant of regression problem, where the correspondence between input and output data is not available.
Most existing methods are only applicable when the sample size is small.
We propose a new computational framework -- ROBOT -- for the shuffled regression problem.
arXiv Detail & Related papers (2020-11-30T21:47:38Z)
- Conditional Uncorrelation and Efficient Non-approximate Subset Selection in Sparse Regression [72.84177488527398]
We consider sparse regression from the view of correlation, and propose the formula of conditional uncorrelation.
By the proposed method, the computational complexity is reduced from $O(\frac{1}{6}k^3 + mk^2 + mkd)$ to $O(\frac{1}{6}k^3 + \frac{1}{2}mk^2)$ for each candidate subset in sparse regression.
arXiv Detail & Related papers (2020-09-08T20:32:26Z)
- Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is L4-L2 hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number and $\epsilon$ has bounded variance, and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is
arXiv Detail & Related papers (2020-07-16T06:44:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.