Implicit Regularization Paths of Weighted Neural Representations
- URL: http://arxiv.org/abs/2408.15784v1
- Date: Wed, 28 Aug 2024 13:26:36 GMT
- Title: Implicit Regularization Paths of Weighted Neural Representations
- Authors: Jin-Hong Du, Pratik Patil
- Abstract summary: We study the implicit regularization effects induced by (observation) weighting of pretrained features.
We show that ridge estimators trained on weighted features along the same path are asymptotically equivalent when evaluated against test vectors of bounded norms.
- Score: 2.6534407766508177
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the implicit regularization effects induced by (observation) weighting of pretrained features. For weight and feature matrices of bounded operator norms that are infinitesimally free with respect to (normalized) trace functionals, we derive equivalence paths connecting different weighting matrices and ridge regularization levels. Specifically, we show that ridge estimators trained on weighted features along the same path are asymptotically equivalent when evaluated against test vectors of bounded norms. These paths can be interpreted as matching the effective degrees of freedom of ridge estimators fitted with weighted features. For the special case of subsampling without replacement, our results apply to independently sampled random features and kernel features and confirm recent conjectures (Conjectures 7 and 8) of the authors on the existence of such paths in Patil et al. We also present an additive risk decomposition for ensembles of weighted estimators and show that the risks are equivalent along the paths when the ensemble size goes to infinity. As a practical consequence of the path equivalences, we develop an efficient cross-validation method for tuning and apply it to subsampled pretrained representations across several models (e.g., ResNet-50) and datasets (e.g., CIFAR-100).
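Below is a minimal numerical sketch, not the authors' implementation, of the degrees-of-freedom matching that the abstract describes for the special case of subsampling without replacement: a ridge fit on a row-subsample at one regularization level is matched to a full-data ridge fit at a larger level by equating effective degrees of freedom. The synthetic feature matrix, the per-sample ridge normalization, and the root-finding step are illustrative assumptions.

```python
# Sketch: match effective degrees of freedom between a subsampled ridge fit
# and a full-data ridge fit (the "implicit regularization path" idea).
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(0)
n, p = 1000, 200
X = rng.standard_normal((n, p))  # stand-in for pretrained features

def ridge_df(features, lam):
    """Effective degrees of freedom sum_i s_i^2 / (s_i^2 + m * lam),
    where m is the number of rows (per-sample ridge normalization, an
    assumed convention for this sketch)."""
    s = np.linalg.svd(features, compute_uv=False)
    return np.sum(s**2 / (s**2 + features.shape[0] * lam))

# Subsample k rows without replacement and compute df at ridge level lam_k.
k, lam_k = 400, 0.5
idx = rng.choice(n, size=k, replace=False)
df_target = ridge_df(X[idx], lam_k)

# Solve for the full-data ridge level with the same degrees of freedom.
# Subsampling adds implicit regularization, so the matched level is larger.
lam_full = brentq(lambda lam: ridge_df(X, lam) - df_target, 1e-8, 1e3)
print(f"subsample (k={k}, lam={lam_k}): df = {df_target:.2f}")
print(f"matched full-data ridge level:  lam = {lam_full:.3f}")
```

In the same spirit, the cross-validation procedure mentioned in the abstract can tune over subsample size and ridge level jointly by moving along such equivalence paths rather than over a full two-dimensional grid.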
Related papers
- Semi-parametric Functional Classification via Path Signatures Logistic Regression [1.210026603224224]
We propose Path Signatures Logistic Regression, a semi-parametric framework for classifying vector-valued functional data.
Our results highlight the practical and theoretical benefits of integrating rough path theory into modern functional data analysis.
arXiv Detail & Related papers (2025-07-09T08:06:50Z) - Asymptotics of Linear Regression with Linearly Dependent Data [28.005935031887038]
We study the asymptotics of linear regression in settings with linearly dependent, non-Gaussian covariates.
We show how dependencies influence estimation error and the choice of regularization parameters.
arXiv Detail & Related papers (2024-12-04T20:31:47Z) - Refined Risk Bounds for Unbounded Losses via Transductive Priors [58.967816314671296]
We revisit the sequential variants of linear regression with the squared loss, classification problems with hinge loss, and logistic regression.
Our key tools are based on the exponential weights algorithm with carefully chosen transductive priors.
arXiv Detail & Related papers (2024-10-29T00:01:04Z) - Assumption-Lean Post-Integrated Inference with Negative Control Outcomes [0.0]
We introduce a robust post-integrated inference (PII) method that adjusts for latent heterogeneity using negative control outcomes.
Our method extends to projected direct effect estimands, accounting for hidden mediators, confounders, and moderators.
The proposed doubly robust estimators are consistent and efficient under minimal assumptions and potential misspecification.
arXiv Detail & Related papers (2024-10-07T12:52:38Z) - Multivariate root-n-consistent smoothing parameter free matching estimators and estimators of inverse density weighted expectations [51.000851088730684]
We develop novel modifications of nearest-neighbor and matching estimators which converge at the parametric $\sqrt{n}$-rate.
We stress that our estimators do not involve nonparametric function estimators and in particular do not rely on sample-size-dependent smoothing parameters.
arXiv Detail & Related papers (2024-07-11T13:28:34Z) - Likelihood Ratio Confidence Sets for Sequential Decision Making [51.66638486226482]
We revisit the likelihood-based inference principle and propose to use likelihood ratios to construct valid confidence sequences.
Our method is especially suitable for problems with well-specified likelihoods.
We show how to provably choose the best sequence of estimators and shed light on connections to online convex optimization.
arXiv Detail & Related papers (2023-11-08T00:10:21Z) - Generalized equivalences between subsampling and ridge regularization [3.1346887720803505]
We prove structural and risk equivalences between subsampling and ridge regularization for ensemble ridge estimators.
An indirect implication of our equivalences is that optimally tuned ridge regression exhibits a monotonic prediction risk in the data aspect ratio.
arXiv Detail & Related papers (2023-05-29T14:05:51Z) - Covariate balancing using the integral probability metric for causal inference [1.8899300124593648]
In this paper, we consider using the integral probability metric (IPM), a metric between two probability measures, for covariate balancing.
We prove that the corresponding estimator can be consistent without correctly specifying any model.
Our proposed method outperforms existing weighting methods by large margins in finite samples.
arXiv Detail & Related papers (2023-05-23T06:06:45Z) - Semi-parametric inference based on adaptively collected data [34.56133468275712]
We construct suitably weighted estimating equations that account for adaptivity in data collection.
Our results characterize the degree of "explorability" required for normality to hold.
We illustrate our general theory with concrete consequences for various problems, including standard linear bandits and sparse generalized bandits.
arXiv Detail & Related papers (2023-03-05T00:45:32Z) - Deconfounding Scores: Feature Representations for Causal Effect Estimation with Weak Overlap [140.98628848491146]
We introduce deconfounding scores, which induce better overlap without biasing the target of estimation.
We show that deconfounding scores satisfy a zero-covariance condition that is identifiable in observed data.
In particular, we show that this technique could be an attractive alternative to standard regularizations.
arXiv Detail & Related papers (2021-04-12T18:50:11Z) - Understanding Implicit Regularization in Over-Parameterized Single Index
Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model.
We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z) - Nonparametric Score Estimators [49.42469547970041]
Estimating the score from a set of samples generated by an unknown distribution is a fundamental task in inference and learning of probabilistic models.
We provide a unifying view of these estimators under the framework of regularized nonparametric regression.
We propose score estimators based on iterative regularization that enjoy computational benefits from curl-free kernels and fast convergence.
arXiv Detail & Related papers (2020-05-20T15:01:03Z)