Inject Machine Learning into Significance Test for Misspecified Linear
Models
- URL: http://arxiv.org/abs/2006.03167v1
- Date: Thu, 4 Jun 2020 23:22:04 GMT
- Title: Inject Machine Learning into Significance Test for Misspecified Linear
Models
- Authors: Jiaye Teng and Yang Yuan
- Abstract summary: We present a simple and effective assumption-free method for linear approximation in both linear and non-linear scenarios.
Experimental results show that our estimator significantly outperforms linear regression for non-linear ground truth functions.
- Score: 14.672773981251574
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Due to its strong interpretability, linear regression is widely used in
the social sciences, where significance tests provide the significance level of
models or coefficients in traditional statistical inference. However, linear
regression relies on a linearity assumption on the ground truth function, which
does not necessarily hold in practice. As a result, even in simple non-linear
cases, linear regression may fail to report the correct significance level.
In this paper, we present a simple and effective assumption-free method for
linear approximation in both linear and non-linear scenarios. First, we apply a
machine learning method to fit the ground truth function on the training set
and compute its linear approximation. We then obtain the estimator by adding an
adjustment based on the validation set. We prove concentration inequalities and
asymptotic properties of our estimator, which lead to the corresponding
significance test. Experimental results show that our estimator significantly
outperforms linear regression for non-linear ground truth functions, indicating
that our estimator might be a better tool for the significance test.
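For concreteness, here is a minimal sketch of the two-stage recipe the abstract describes, run on simulated data with a non-linear ground truth. The random-forest learner, the exact form of the validation-set adjustment, and the sandwich-style variance used for the z-test are illustrative assumptions made for this sketch, not the authors' construction; see the paper for the actual estimator and its theory.

```python
# Minimal sketch of the two-stage estimator described in the abstract.
# Assumptions (not from the paper): random forest as the ML fitter,
# OLS projection as the "linear approximation", a residual-based
# adjustment, and a sandwich variance for the z-test.
import numpy as np
from scipy import stats
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Non-linear ground truth: y depends on x1 (non-linearly), not on x2.
n, d = 2000, 2
X = rng.normal(size=(n, d))
y = X[:, 0] + 0.5 * X[:, 0] ** 3 + rng.normal(scale=0.5, size=n)

# Split into a training set and a validation set.
m = n // 2
X_tr, y_tr = X[:m], y[:m]
X_va, y_va = X[m:], y[m:]

# Step 1: fit the ground truth function with a machine learning method.
f_hat = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)

# Step 2: linear approximation of the fitted function -- the OLS
# projection of f_hat's predictions onto the features (training set).
G_tr = np.linalg.inv(X_tr.T @ X_tr)
beta_approx = G_tr @ X_tr.T @ f_hat.predict(X_tr)

# Step 3: add an adjustment based on validation-set residuals, so that
# errors of the ML fit enter the estimator only through this correction.
G_va = np.linalg.inv(X_va.T @ X_va)
beta_hat = beta_approx + G_va @ X_va.T @ (y_va - f_hat.predict(X_va))

# Plug-in z-test from a sandwich-style variance estimate (illustrative).
eps = y_va - X_va @ beta_hat
cov = G_va @ (X_va.T * eps**2) @ X_va @ G_va
z = beta_hat / np.sqrt(np.diag(cov))
p_values = 2 * stats.norm.sf(np.abs(z))
print("beta_hat:", beta_hat)  # expect roughly (2.5, 0.0)
print("p-values:", p_values)  # x1 significant, x2 not
```

Since the simulated ground truth is $y = x_1 + 0.5 x_1^3 + \varepsilon$ with standard normal features, the best linear approximation gives $x_1$ a coefficient of about 2.5 and $x_2$ a coefficient near zero, which the reported p-values should reflect.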
Related papers
- Semi-supervised learning for linear extremile regression [0.8973184739267972]
This paper introduces a novel definition of linear extremile regression along with an accompanying estimation methodology. The regression coefficient estimators of this method achieve $\sqrt{n}$-consistency, which nonparametric extremile regression may not provide. We propose a semi-supervised learning approach to enhance estimation efficiency, even when the specified linear extremile regression model may be misspecified.
arXiv Detail & Related papers (2025-07-02T02:59:15Z) - Importance Sampling for Nonlinear Models [5.421981644827842]
We introduce the concept of the adjoint operator of a nonlinear map. We demonstrate that sampling based on these notions of norm and leverage scores provides approximation guarantees for the underlying nonlinear mapping.
arXiv Detail & Related papers (2025-05-18T10:34:39Z) - Interpretation of High-Dimensional Regression Coefficients by Comparison with Linearized Compressing Features [0.0]
We focus on understanding how linear regression approximates nonlinear responses from high-dimensional functional data, motivated by predicting cycle life for lithium-ion batteries.
We develop a linearization method to derive feature coefficients, which we compare with the closest regression coefficients along the path of regression solutions.
arXiv Detail & Related papers (2024-11-18T20:59:38Z) - Automatic doubly robust inference for linear functionals via calibrated debiased machine learning [0.9694940903078658]
We propose a calibrated debiased machine learning (C-DML) estimator for doubly robust inference.
A C-DML estimator maintains asymptotic linearity when either the outcome regression or the Riesz representer of the linear functional is estimated sufficiently well.
Our theoretical and empirical results support the use of C-DML to mitigate bias arising from the inconsistent or slow estimation of nuisance functions.
arXiv Detail & Related papers (2024-11-05T03:32:30Z) - Bayesian Inference for Consistent Predictions in Overparameterized Nonlinear Regression [0.0]
This study explores the predictive properties of overparameterized nonlinear regression within the Bayesian framework.
Posterior contraction is established for generalized linear and single-neuron models with Lipschitz continuous activation functions.
The proposed method was validated via numerical simulations and a real data application.
arXiv Detail & Related papers (2024-04-06T04:22:48Z) - Pessimistic Nonlinear Least-Squares Value Iteration for Offline Reinforcement Learning [53.97335841137496]
We propose an oracle-efficient algorithm, dubbed Pessimistic Nonlinear Least-Squares Value Iteration (PNLSVI), for offline RL with non-linear function approximation.
Our algorithm enjoys a regret bound that has a tight dependency on the function class complexity and achieves minimax optimal instance-dependent regret when specialized to linear function approximation.
arXiv Detail & Related papers (2023-10-02T17:42:01Z) - Selective Nonparametric Regression via Testing [54.20569354303575]
We develop an abstention procedure by testing a hypothesis on the value of the conditional variance at a given point.
Unlike existing methods, the proposed one accounts not only for the value of the variance itself but also for the uncertainty of the corresponding variance predictor.
arXiv Detail & Related papers (2023-09-28T13:04:11Z) - On the Detection and Quantification of Nonlinearity via Statistics of
the Gradients of a Black-Box Model [0.0]
Detection and identification of nonlinearity is a task of high importance for structural dynamics.
A method to detect nonlinearity is proposed, based on the distribution of the gradients of a data-driven model.
arXiv Detail & Related papers (2023-02-15T23:15:22Z) - Near-optimal Offline Reinforcement Learning with Linear Representation:
Leveraging Variance Information with Pessimism [65.46524775457928]
Offline reinforcement learning seeks to utilize offline/historical data to optimize sequential decision-making strategies.
We study the statistical limits of offline reinforcement learning with linear model representations.
arXiv Detail & Related papers (2022-03-11T09:00:12Z) - Near-optimal inference in adaptive linear regression [60.08422051718195]
Even simple methods like least squares can exhibit non-normal behavior when data is collected in an adaptive manner.
We propose a family of online debiasing estimators to correct these distributional anomalies in least squares estimation.
We demonstrate the usefulness of our theory via applications to multi-armed bandit, autoregressive time series estimation, and active learning with exploration.
arXiv Detail & Related papers (2021-07-05T21:05:11Z) - LQF: Linear Quadratic Fine-Tuning [114.3840147070712]
We present the first method for linearizing a pre-trained model that achieves comparable performance to non-linear fine-tuning.
LQF consists of simple modifications to the architecture, loss function and optimization typically used for classification.
arXiv Detail & Related papers (2020-12-21T06:40:20Z) - Causality-aware counterfactual confounding adjustment as an alternative
to linear residualization in anticausal prediction tasks based on linear
learners [14.554818659491644]
We compare the linear residualization approach against the causality-aware confounding adjustment in anticausal prediction tasks.
We show that the causality-aware approach tends to (asymptotically) outperform the residualization adjustment in terms of predictive performance in linear learners.
arXiv Detail & Related papers (2020-11-09T17:59:57Z) - Non-parametric Models for Non-negative Functions [48.7576911714538]
We provide the first model for non-negative functions that enjoys the same good properties as linear models.
We prove that it admits a representer theorem and provide an efficient dual formulation for convex problems.
arXiv Detail & Related papers (2020-07-08T07:17:28Z)