Related papers: Inference with non-differentiable surrogate loss in a general high-dimensional classification framework

URL: http://arxiv.org/abs/2405.11723v1
Date: Mon, 20 May 2024 01:50:35 GMT
Title: Inference with non-differentiable surrogate loss in a general high-dimensional classification framework
Authors: Muxuan Liang, Yang Ning, Maureen A Smith, Ying-Qi Zhao
Abstract summary: We propose a kernel-smoothed decorrelated score to construct hypothesis testing and interval estimations. Specifically, we adopt kernel approximations to smooth the discontinuous gradient near discontinuity points. We establish the limiting distribution of the kernel-smoothed decorrelated score and its cross-fitted version in a high-dimensional setup.
Score: 4.792322531593389
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Penalized empirical risk minimization with a surrogate loss function is often used to derive a high-dimensional linear decision rule in classification problems. Although much of the literature focuses on the generalization error, there is a lack of valid inference procedures to identify the driving factors of the estimated decision rule, especially when the surrogate loss is non-differentiable. In this work, we propose a kernel-smoothed decorrelated score to construct hypothesis testing and interval estimations for the linear decision rule estimated using a piece-wise linear surrogate loss, which has a discontinuous gradient and non-regular Hessian. Specifically, we adopt kernel approximations to smooth the discontinuous gradient near discontinuity points and approximate the non-regular Hessian of the surrogate loss. In applications where additional nuisance parameters are involved, we propose a novel cross-fitted version to accommodate flexible nuisance estimates and kernel approximations. We establish the limiting distribution of the kernel-smoothed decorrelated score and its cross-fitted version in a high-dimensional setup. Simulation and real data analysis are conducted to demonstrate the validity and superiority of the proposed method.
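The smoothing idea in the abstract can be illustrated with a minimal sketch. It assumes the hinge loss max(0, 1 - u) as the piecewise linear surrogate and a Gaussian kernel with bandwidth h; both choices, and all function names, are illustrative assumptions rather than the paper's construction. The indicator in the hinge gradient is replaced by a Gaussian CDF, which gives a smooth gradient and a well-defined second derivative near the kink at u = 1.

```python
import math


def hinge_grad(u):
    """Derivative of max(0, 1 - u) in u; it jumps from -1 to 0 at u = 1."""
    return -1.0 if u < 1.0 else 0.0


def smoothed_hinge_grad(u, h):
    """Smoothed gradient (illustrative): the indicator 1{u < 1} is replaced
    by the Gaussian CDF evaluated at (1 - u) / h, with bandwidth h > 0."""
    t = (1.0 - u) / h
    return -0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def smoothed_hinge_hess(u, h):
    """Derivative of the smoothed gradient in u: the Gaussian density
    at (1 - u) / h, scaled by 1 / h."""
    t = (1.0 - u) / h
    return math.exp(-0.5 * t * t) / (h * math.sqrt(2.0 * math.pi))


if __name__ == "__main__":
    # Compare the raw and smoothed gradients around the kink at u = 1.
    for u in (0.5, 0.9, 1.0, 1.1, 1.5):
        print(u, hinge_grad(u), smoothed_hinge_grad(u, h=0.1))
```

Away from u = 1 the smoothed gradient agrees with the raw one up to a negligible error, and a smaller h narrows the region where the two differ.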

Related papers

Wasserstein Distributionally Robust Nonparametric Regression [9.65010022854885]
This paper studies the generalization properties of Wasserstein distributionally robust nonparametric estimators. We establish non-asymptotic error bounds for the excess local worst-case risk. The robustness of the proposed estimator is evaluated through simulation studies and illustrated with an application to the MNIST dataset.
arXiv Detail & Related papers (2025-05-12T18:07:37Z)
An Accelerated Alternating Partial Bregman Algorithm for ReLU-based Matrix Decomposition [0.0]
In this paper, we aim to investigate the sparse low-rank characteristics rectified on non-negative matrices. We propose a novel regularization term incorporating useful structures in clustering and compression tasks. We derive corresponding closed-form solutions while ensuring that the $L$-smooth property holds for any $L \ge 1$.
arXiv Detail & Related papers (2025-03-04T08:20:34Z)
Adaptive Conformal Inference by Betting [51.272991377903274]
We consider the problem of adaptive conformal inference without any assumptions about the data generating process. Existing approaches for adaptive conformal inference are based on optimizing the pinball loss using variants of online gradient descent. We propose a different approach for adaptive conformal inference that leverages parameter-free online convex optimization techniques.
arXiv Detail & Related papers (2024-12-26T18:42:08Z)
Refined Risk Bounds for Unbounded Losses via Transductive Priors [58.967816314671296]
We revisit the sequential variants of linear regression with the squared loss, classification problems with hinge loss, and logistic regression. Our key tools are based on the exponential weights algorithm with carefully chosen transductive priors.
arXiv Detail & Related papers (2024-10-29T00:01:04Z)
Assumption-Lean Post-Integrated Inference with Negative Control Outcomes [0.0]
We introduce a robust post-integrated inference (PII) method that adjusts for latent heterogeneity using negative control outcomes. Our method extends to projected direct effect estimands, accounting for hidden mediators, confounders, and moderators. The proposed doubly robust estimators are consistent and efficient under minimal assumptions and potential misspecification.
arXiv Detail & Related papers (2024-10-07T12:52:38Z)
Decision-Focused Learning with Directional Gradients [1.2363103948638432]
We propose a novel family of decision-aware surrogate losses, called Perturbation Gradient (PG) losses, for the predict-then-optimize framework. Unlike the original decision loss, which is typically piecewise constant and discontinuous, our new PG losses are Lipschitz continuous differences of concave functions. We provide numerical evidence confirming our PG losses substantively outperform existing proposals when the underlying model is misspecified.
arXiv Detail & Related papers (2024-02-05T18:14:28Z)
Likelihood Ratio Confidence Sets for Sequential Decision Making [51.66638486226482]
We revisit the likelihood-based inference principle and propose to use likelihood ratios to construct valid confidence sequences. Our method is especially suitable for problems with well-specified likelihoods. We show how to provably choose the best sequence of estimators and shed light on connections to online convex optimization.
arXiv Detail & Related papers (2023-11-08T00:10:21Z)
Selective Nonparametric Regression via Testing [54.20569354303575]
We develop an abstention procedure via testing the hypothesis on the value of the conditional variance at a given point. Unlike existing methods, the proposed one accounts not only for the value of the variance itself but also for the uncertainty of the corresponding variance predictor.
arXiv Detail & Related papers (2023-09-28T13:04:11Z)
Nonparametric Quantile Regression: Non-Crossing Constraints and Conformal Prediction [2.654399717608053]
We propose a nonparametric quantile regression method using deep neural networks with a rectified linear unit penalty function to avoid quantile crossing. We establish non-asymptotic upper bounds for the excess risk of the proposed nonparametric quantile regression function estimators. Numerical experiments including simulation studies and a real data example are conducted to demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-10-18T20:59:48Z)
Data-Driven Influence Functions for Optimization-Based Causal Inference [105.5385525290466]
We study a constructive algorithm that approximates Gateaux derivatives for statistical functionals by finite differencing. We study the case where probability distributions are not known a priori but need to be estimated from data.
arXiv Detail & Related papers (2022-08-29T16:16:22Z)
Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
Inductive biases are central in preventing overfitting empirically. This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression. We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD in comparison to ordinary least squares.
arXiv Detail & Related papers (2021-03-23T17:15:53Z)
Distribution-Free Robust Linear Regression [5.532477732693]
We study random design linear regression with no assumptions on the distribution of the covariates. We construct a non-linear estimator achieving excess risk of order $d/n$ with the optimal sub-exponential tail. We prove an optimal version of the classical bound for the truncated least squares estimator due to Györfi, Kohler, Krzyzak, and Walk.
arXiv Detail & Related papers (2021-02-25T15:10:41Z)
Stopping Criterion Design for Recursive Bayesian Classification: Analysis and Decision Geometry [11.399206131178104]
We propose a geometric interpretation over the state posterior progression. We show that confidence thresholds defined over maximum of the state posteriors suffer from stiffness. We then propose a new stopping/termination criterion with a geometrical insight to overcome the limitations.
arXiv Detail & Related papers (2020-07-30T16:21:10Z)
Understanding Implicit Regularization in Over-Parameterized Single Index Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model. We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z)

This list is automatically generated from the titles and abstracts of the papers on this site.