Interpretation of High-Dimensional Regression Coefficients by Comparison with Linearized Compressing Features
- URL: http://arxiv.org/abs/2411.12060v1
- Date: Mon, 18 Nov 2024 20:59:38 GMT
- Title: Interpretation of High-Dimensional Regression Coefficients by Comparison with Linearized Compressing Features
- Authors: Joachim Schaeffer, Jinwook Rhyu, Robin Droop, Rolf Findeisen, Richard Braatz
- Abstract summary: We focus on understanding how linear regression approximates nonlinear responses from high-dimensional functional data, motivated by predicting cycle life for lithium-ion batteries.
We develop a linearization method to derive feature coefficients, which we compare with the closest regression coefficients of the path of regression solutions.
- Abstract: Linear regression is often deemed inherently interpretable; however, challenges arise for high-dimensional data. We focus on further understanding how linear regression approximates nonlinear responses from high-dimensional functional data, motivated by predicting cycle life for lithium-ion batteries. We develop a linearization method to derive feature coefficients, which we compare with the closest regression coefficients of the path of regression solutions. We showcase the methods on battery data case studies where a single nonlinear compressing feature, $g\colon \mathbb{R}^p \to \mathbb{R}$, is used to construct a synthetic response, $\mathbf{y} \in \mathbb{R}$. This unifying view of linear regression and compressing features for high-dimensional functional data helps to understand (1) how regression coefficients are shaped in the highly regularized domain and how they relate to linearized feature coefficients and (2) how the shape of regression coefficients changes as a function of regularization to approximate nonlinear responses by exploiting local structures.
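The workflow described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the compressing feature g, the Gaussian toy data, and the use of Euclidean distance as the "closest coefficients" criterion are all illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n, p = 100, 200
X = rng.normal(size=(n, p))                      # stand-in for high-dimensional functional data

def g(x):
    """Toy nonlinear compressing feature g: R^p -> R (illustrative assumption)."""
    return np.log(np.sum(x**2) + 1.0)

y = np.apply_along_axis(g, 1, X)                 # synthetic scalar response per sample

# Linearized feature coefficients: central-difference gradient of g at the
# data mean x0, so that g(x) ~ g(x0) + beta_lin @ (x - x0).
x0 = X.mean(axis=0)
eps = 1e-5
beta_lin = np.array([(g(x0 + eps * e) - g(x0 - eps * e)) / (2 * eps)
                     for e in np.eye(p)])

# Ridge regularization path: compare each coefficient vector with beta_lin
# and report the regularization level whose coefficients come closest
# (Euclidean distance, an illustrative choice of criterion).
alphas = np.logspace(-4, 4, 50)
betas = np.array([Ridge(alpha=a).fit(X, y).coef_ for a in alphas])
closest = np.argmin(np.linalg.norm(betas - beta_lin, axis=1))
print(f"closest alpha on the path: {alphas[closest]:.3g}")
```

Sweeping the path and inspecting where the regression coefficients approach the linearized feature coefficients mirrors the comparison the abstract describes, here on synthetic data rather than the battery case studies.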
Related papers
- Highly Adaptive Ridge [84.38107748875144]
We propose a regression method that achieves an $n^{-2/3}$ dimension-free $L^2$ convergence rate in the class of right-continuous functions with square-integrable sectional derivatives.
HAR is exactly kernel ridge regression with a specific data-adaptive kernel based on a saturated zero-order tensor-product spline basis expansion (see the sketch after this entry).
We demonstrate empirical performance better than state-of-the-art algorithms, particularly for small datasets.
arXiv Detail & Related papers (2024-10-03T17:06:06Z)
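A rough sketch of such a data-adaptive kernel in Python follows. The exact indicator-product form of the kernel below is an assumption reconstructed from the zero-order tensor-product spline description, not code from the paper.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

def har_kernel(A, B, knots):
    """Assumed kernel: k(x, x') = sum over knot rows t of
    [prod_j (1 + 1{x_j >= t_j} 1{x'_j >= t_j}) - 1], i.e. the inner product
    of all tensor products of zero-order spline (indicator) features."""
    K = np.zeros((A.shape[0], B.shape[0]))
    for t in knots:
        overlap = (A >= t).astype(float)[:, None, :] * (B >= t).astype(float)[None, :, :]
        K += np.prod(1.0 + overlap, axis=2) - 1.0
    return K

rng = np.random.default_rng(0)
X = rng.uniform(size=(60, 3))
y = np.sin(3 * X[:, 0]) + X[:, 1] * X[:, 2] + 0.1 * rng.normal(size=60)

K = har_kernel(X, X, knots=X)                    # knots placed at the training points
model = KernelRidge(alpha=1.0, kernel="precomputed").fit(K, y)
X_new = rng.uniform(size=(5, 3))
preds = model.predict(har_kernel(X_new, X, knots=X))
```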
- Induced Covariance for Causal Discovery in Linear Sparse Structures [55.2480439325792]
Causal models seek to unravel the cause-effect relationships among variables from observed data.
This paper introduces a novel causal discovery algorithm designed for settings in which variables exhibit linearly sparse relationships.
arXiv Detail & Related papers (2024-10-02T04:01:38Z)
- A Structure-Preserving Kernel Method for Learning Hamiltonian Systems [3.594638299627404]
A structure-preserving kernel ridge regression method is presented that allows the recovery of potentially high-dimensional and nonlinear Hamiltonian functions.
The paper extends kernel regression methods to problems whose loss functions involve linear functions of gradients (see the sketch after this entry).
A full error analysis is conducted that provides convergence rates using fixed and adaptive regularization parameters.
arXiv Detail & Related papers (2024-03-15T07:20:21Z)
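To make the idea of regressing on linear functions of gradients concrete, the sketch below fits a kernel expansion of a Hamiltonian from observed vector fields. The Gaussian kernel, the toy Hamiltonian H(q, p) = (q² + p²)/2, and the plain regularized least-squares solve are illustrative assumptions, not the paper's estimator.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy Hamiltonian H(q, p) = (q^2 + p^2) / 2, so grad H(x) = x and the
# dynamics are x_dot = J grad H(x) with the canonical symplectic matrix J.
X_s = rng.uniform(-1.0, 1.0, size=(50, 2))       # sampled states x = (q, p)
J = np.array([[0.0, 1.0], [-1.0, 0.0]])
V = X_s @ J.T                                    # observed velocities J grad H(x)

# Ansatz H(x) = sum_i a_i k(x, x_i) with a Gaussian kernel, so that
# grad H(x) = sum_i a_i (x_i - x) / sigma^2 * k(x, x_i) is linear in a.
sigma = 0.5
def grad_features(x, centers):
    k = np.exp(-np.sum((x - centers) ** 2, axis=1) / (2.0 * sigma**2))
    return (centers - x) * (k / sigma**2)[:, None]   # shape (n_centers, 2)

# Stack the equations J grad H(x_j) = v_j and solve a ridge-regularized
# least-squares problem for the coefficients a.
A = np.vstack([J @ grad_features(x, X_s).T for x in X_s])   # (2n, n)
v = V.reshape(-1)
lam = 1e-6
a = np.linalg.solve(A.T @ A + lam * np.eye(len(X_s)), A.T @ v)

def H_hat(x):
    """Recovered Hamiltonian, defined up to an additive constant."""
    return np.exp(-np.sum((x - X_s) ** 2, axis=1) / (2.0 * sigma**2)) @ a
```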
- Deep Generative Symbolic Regression [83.04219479605801]
Symbolic regression aims to discover concise closed-form mathematical equations from data.
Existing methods, ranging from search to reinforcement learning, fail to scale with the number of input variables.
We propose an instantiation of our framework, Deep Generative Symbolic Regression.
arXiv Detail & Related papers (2023-12-30T17:05:31Z)
- Interpretation of High-Dimensional Linear Regression: Effects of Nullspace and Regularization Demonstrated on Battery Data [0.019064981263344844]
This article considers discrete measured data of underlying smooth latent processes, as is often obtained from chemical or biological systems.
The nullspace and its interplay with regularization shape the regression coefficients (see the sketch after this entry).
We show that regularization and z-scoring are design choices that, if chosen in accordance with prior physical knowledge, lead to interpretable regression results.
arXiv Detail & Related papers (2023-09-01T16:20:04Z)
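A small numerical illustration of the nullspace effect (toy Gaussian data, not the article's battery case study): the minimum-norm solution, which ridge regression approaches as the penalty vanishes, carries no nullspace component, while the data-generating coefficients generally do.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 40, 200                                   # n < p: X has a nontrivial nullspace
X = rng.normal(size=(n, p))
beta_true = rng.normal(size=p)
y = X @ beta_true

# Split R^p into the row space of X and its nullspace via the SVD; adding
# any nullspace vector v (X v = 0) to beta leaves predictions unchanged.
U, s, Vt = np.linalg.svd(X, full_matrices=True)
V_null = Vt[n:].T                                # basis of the nullspace of X

beta_min_norm = np.linalg.pinv(X) @ y            # ridge solution in the limit alpha -> 0
print(np.linalg.norm(V_null.T @ beta_min_norm))  # ~0: no nullspace component
print(np.linalg.norm(V_null.T @ beta_true))      # generally far from 0
```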
- Linear Regression on Manifold Structured Data: the Impact of Extrinsic Geometry on Solutions [4.8234611688915665]
We study linear regression applied to data structured on a manifold.
We analyze the impact of the manifold's curvatures on the uniqueness of the regression solution.
arXiv Detail & Related papers (2023-07-05T17:51:26Z)
- Kernel-based off-policy estimation without overlap: Instance optimality beyond semiparametric efficiency [53.90687548731265]
We study optimal procedures for estimating a linear functional based on observational data.
For any convex and symmetric function class $\mathcal{F}$, we derive a non-asymptotic local minimax bound on the mean-squared error.
arXiv Detail & Related papers (2023-01-16T02:57:37Z)
- An adaptive shortest-solution guided decimation approach to sparse high-dimensional linear regression [2.3759847811293766]
The proposed algorithm, adapted from the shortest-solution guided decimation approach, is referred to as ASSD.
ASSD is especially suitable for linear regression problems with highly correlated measurement matrices encountered in real-world applications.
arXiv Detail & Related papers (2022-11-28T04:29:57Z)
- A Hypergradient Approach to Robust Regression without Correspondence [85.49775273716503]
We consider a variant of the regression problem in which the correspondence between input and output data is not available.
Most existing methods are only applicable when the sample size is small.
We propose a new computational framework -- ROBOT -- for the shuffled regression problem.
arXiv Detail & Related papers (2020-11-30T21:47:38Z)
- Understanding Implicit Regularization in Over-Parameterized Single Index Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model.
We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z)
- Multivariate Functional Regression via Nested Reduced-Rank Regularization [2.730097437607271]
We propose a nested reduced-rank regression (NRRR) approach for fitting regression models with multivariate functional responses and predictors (see the sketch after this entry).
We show through non-asymptotic analysis that NRRR can achieve at least a comparable error rate to that of the reduced-rank regression.
We apply NRRR in an electricity demand problem, to relate the trajectories of the daily electricity consumption with those of the daily temperatures.
arXiv Detail & Related papers (2020-03-10T14:58:54Z)
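For context, here is a sketch of plain reduced-rank regression, the baseline to which NRRR's error rate is compared above. This is the classical SVD-truncation solution on toy data, not the nested NRRR estimator itself.

```python
import numpy as np

def reduced_rank_regression(X, Y, rank):
    # Classical solution: OLS fit, then project the coefficient matrix onto
    # the top right-singular vectors of the fitted values X @ B_ols.
    B_ols = np.linalg.lstsq(X, Y, rcond=None)[0]
    U, s, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
    P = Vt[:rank].T @ Vt[:rank]                  # rank-r projector
    return B_ols @ P

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 10))
B_true = rng.normal(size=(10, 2)) @ rng.normal(size=(2, 6))   # rank-2 coefficients
Y = X @ B_true + 0.1 * rng.normal(size=(200, 6))
B_hat = reduced_rank_regression(X, Y, rank=2)
print(np.linalg.norm(B_hat - B_true))            # small: rank-2 structure recovered
```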