Related papers: Doubly Wild Refitting: Model-Free Evaluation of High Dimensional Black-Box Predictions under Convex Losses

Doubly Wild Refitting: Model-Free Evaluation of High Dimensional Black-Box Predictions under Convex Losses

URL: http://arxiv.org/abs/2511.18789v1
Date: Mon, 24 Nov 2025 05:38:47 GMT
Title: Doubly Wild Refitting: Model-Free Evaluation of High Dimensional Black-Box Predictions under Convex Losses
Authors: Haichen Hu, David Simchi-Levi,
Abstract summary: We study the problem of excess risk evaluation for empirical risk minimization under general convex loss functions.<n>Our contribution is an efficient refitting procedure that computes the excess risk and provides high-probability upper bounds under the fixed-design setting.
Score: 15.386375612838371
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We study the problem of excess risk evaluation for empirical risk minimization (ERM) under general convex loss functions. Our contribution is an efficient refitting procedure that computes the excess risk and provides high-probability upper bounds under the fixed-design setting. Assuming only black-box access to the training algorithm and a single dataset, we begin by generating two sets of artificially modified pseudo-outcomes termed wild response, created by stochastically perturbing the gradient vectors with carefully chosen scaling. Using these two pseudo-labeled datasets, we then refit the black-box procedure twice to obtain two corresponding wild predictors. Finally, leveraging the original predictor, the two wild predictors, and the constructed wild responses, we derive an efficient excess risk upper bound. A key feature of our analysis is that it requires no prior knowledge of the complexity of the underlying function class. As a result, the method is essentially model-free and holds significant promise for theoretically evaluating modern opaque machine learning system--such as deep nerral networks and generative model--where traditional capacity-based learning theory becomes infeasible due to the extreme complexity of the hypothesis class.

Related papers

Incorporating priors in learning: a random matrix study under a teacher-student framework [6.744353807473373]
Regularized linear regression is central to machine learning, yet its high-dimensional behavior with informative priors remains poorly understood.<n>We provide the first exact characterization of training and test risks for maximum a posteriori (MAP) regression.<n>Our framework unifies ridge regression, least squares, and prior-informed estimators, and -- using random matrix theory -- yields closed-form risk formulas.
arXiv Detail & Related papers (2025-09-26T09:47:15Z)
Perturbing the Derivative: Wild Refitting for Model-Free Evaluation of Machine Learning Models under Bregman Losses [15.386375612838371]
We show that one can efficiently upper bound the excess risk through the so-called "wild optimism"<n>Unlike conventional analysis, our framework operates with just one dataset and black-box access to the training procedure.<n>Our work is promising for theoretically evaluating modern opaque ML models, such as deep neural networks and generative models.
arXiv Detail & Related papers (2025-09-02T16:26:03Z)
Wild refitting for black box prediction [29.715181593057803]
We describe and analyze a computionally efficient refitting procedure for computing high-probability upper bounds on the instance-wise mean-squared prediction error.<n>Requiring only a single dataset and black box access to the prediction method, it consists of three steps: computing suitable residuals, symmetrizing and scaling them with a pre-factor $rho$, and using them to define and solve a modified prediction problem.
arXiv Detail & Related papers (2025-06-26T16:41:55Z)
Beyond Expectations: Learning with Stochastic Dominance Made Practical [88.06211893690964]
dominance models risk-averse preferences for decision making with uncertain outcomes. Despite theoretically appealing, the application of dominance in machine learning has been scarce. We first generalize the dominance concept to enable feasible comparisons between any arbitrary pair of random variables. We then develop a simple and efficient approach for finding the optimal solution in terms of dominance.
arXiv Detail & Related papers (2024-02-05T03:21:23Z)
TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization [89.54947228958494]
This paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks. We propose a novel statistics-based approach, Two-WIng NormliSation (TWINS) fine-tuning framework. TWINS is shown to be effective on a wide range of image classification datasets in terms of both generalization and robustness.
arXiv Detail & Related papers (2023-03-20T14:12:55Z)
Monotonicity and Double Descent in Uncertainty Estimation with Gaussian Processes [52.92110730286403]
It is commonly believed that the marginal likelihood should be reminiscent of cross-validation metrics and that both should deteriorate with larger input dimensions. We prove that by tuning hyper parameters, the performance, as measured by the marginal likelihood, improves monotonically with the input dimension. We also prove that cross-validation metrics exhibit qualitatively different behavior that is characteristic of double descent.
arXiv Detail & Related papers (2022-10-14T08:09:33Z)
Mitigating multiple descents: A model-agnostic framework for risk monotonization [84.6382406922369]
We develop a general framework for risk monotonization based on cross-validation. We propose two data-driven methodologies, namely zero- and one-step, that are akin to bagging and boosting.
arXiv Detail & Related papers (2022-05-25T17:41:40Z)
The Interplay Between Implicit Bias and Benign Overfitting in Two-Layer Linear Networks [51.1848572349154]
neural network models that perfectly fit noisy data can generalize well to unseen test data. We consider interpolating two-layer linear neural networks trained with gradient flow on the squared loss and derive bounds on the excess risk.
arXiv Detail & Related papers (2021-08-25T22:01:01Z)
Towards an Understanding of Benign Overfitting in Neural Networks [104.2956323934544]
Modern machine learning models often employ a huge number of parameters and are typically optimized to have zero training loss. We examine how these benign overfitting phenomena occur in a two-layer neural network setting. We show that it is possible for the two-layer ReLU network interpolator to achieve a near minimax-optimal learning rate.
arXiv Detail & Related papers (2021-06-06T19:08:53Z)
Provable Model-based Nonlinear Bandit and Reinforcement Learning: Shelve Optimism, Embrace Virtual Curvature [61.22680308681648]
We show that global convergence is statistically intractable even for one-layer neural net bandit with a deterministic reward. For both nonlinear bandit and RL, the paper presents a model-based algorithm, Virtual Ascent with Online Model Learner (ViOL)
arXiv Detail & Related papers (2021-02-08T12:41:56Z)
Learning with Gradient Descent and Weakly Convex Losses [14.145079120746614]
We study the learning performance of gradient descent when the empirical risk is weakly convex. In the case of a two layer neural network, we demonstrate that the empirical risk can satisfy a notion of local weak convexity.
arXiv Detail & Related papers (2021-01-13T09:58:06Z)
Learning Rates as a Function of Batch Size: A Random Matrix Theory Approach to Neural Network Training [2.9649783577150837]
We study the effect of mini-batching on the loss landscape of deep neural networks using spiked, field-dependent random matrix theory. We derive analytical expressions for the maximal descent and adaptive training regimens for smooth, non-Newton deep neural networks. We validate our claims on the VGG/ResNet and ImageNet datasets.
arXiv Detail & Related papers (2020-06-16T11:55:45Z)

This list is automatically generated from the titles and abstracts of the papers in this site.