Uniform Risk Bounds for Learning with Dependent Data Sequences
- URL: http://arxiv.org/abs/2303.11650v1
- Date: Tue, 21 Mar 2023 07:51:52 GMT
- Title: Uniform Risk Bounds for Learning with Dependent Data Sequences
- Authors: Fabien Lauer (ABC)
- Abstract summary: This paper extends standard results from learning theory with independent data to sequences of dependent data.
We do not rely on mixing arguments or sequential measures of complexity and derive uniform risk bounds with classical proof patterns and capacity measures.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper extends standard results from learning theory with independent
data to sequences of dependent data. Contrary to most of the literature, we do
not rely on mixing arguments or sequential measures of complexity and derive
uniform risk bounds with classical proof patterns and capacity measures. In
particular, we show that the standard classification risk bounds based on the
VC-dimension hold in the exact same form for dependent data, and further
provide Rademacher complexity-based bounds, that remain unchanged compared to
the standard results for the identically and independently distributed case.
Finally, we show how to apply these results in the context of scenario-based
optimization in order to compute the sample complexity of random programs with
dependent constraints.
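For orientation, the textbook i.i.d. form of a Rademacher complexity-based uniform bound is sketched below. This is the standard statement from the general learning theory literature, not a formula copied from the paper: the symbols $\mathcal{G}$, $Z_i$, $n$, $\delta$ and $\mathfrak{R}_n$ are notation assumed here for illustration, and the exact assumptions under which the paper carries the bound over to dependent sequences are not reproduced.

```latex
% Assumed notation (not taken from the paper):
%   \mathcal{G}      : class of functions g with values in [0,1]
%   Z_1,\dots,Z_n    : sample of size n (i.i.d. in the textbook statement)
%   \mathfrak{R}_n   : Rademacher complexity of \mathcal{G} for sample size n
% With probability at least 1-\delta, simultaneously for all g \in \mathcal{G}:
\[
  \mathbb{E}\,[g(Z)]
  \;\le\;
  \frac{1}{n}\sum_{i=1}^{n} g(Z_i)
  \;+\; 2\,\mathfrak{R}_n(\mathcal{G})
  \;+\; \sqrt{\frac{\ln(1/\delta)}{2n}} .
\]
```

According to the abstract, bounds of this kind remain unchanged when the independence assumption on the sample is dropped.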
Related papers
- On the Geometry of Regularization in Adversarial Training: High-Dimensional Asymptotics and Generalization Bounds [11.30047438005394]
This work investigates the question of how to choose the regularization norm $\lVert \cdot \rVert$ in the context of high-dimensional adversarial training for binary classification.
We quantitatively characterize the relationship between perturbation size and the optimal choice of $\lVert \cdot \rVert$, confirming the intuition that, in the data scarce regime, the type of regularization becomes increasingly important for adversarial training as perturbations grow in size.
arXiv Detail & Related papers (2024-10-21T14:53:12Z) - Risk and cross validation in ridge regression with correlated samples [72.59731158970894]
We provide sharp asymptotics for the in- and out-of-sample risks of ridge regression when the data points have arbitrary correlations.
We further extend our analysis to the case where the test point has non-trivial correlations with the training set, a setting often encountered in time series forecasting.
We validate our theory across a variety of high dimensional data.
arXiv Detail & Related papers (2024-08-08T17:27:29Z) - Generalization Bounds for Dependent Data using Online-to-Batch Conversion [0.6144680854063935]
We show that the generalization error of statistical learners in the dependent data setting is equivalent to the generalization error of statistical learners in the i.i.d. setting.
Our proof techniques involve defining a new notion of stability of online learning algorithms based on Wasserstein distances.
arXiv Detail & Related papers (2024-05-22T14:07:25Z) - Inference With Combining Rules From Multiple Differentially Private Synthetic Datasets [0.0]
We study the applicability of procedures based on combining rules to the analysis of differentially private synthetic (DIPS) datasets.
Our empirical experiments show that the proposed combining rules may offer accurate inference in certain contexts, but not in all cases.
arXiv Detail & Related papers (2024-05-08T02:33:35Z) - Likelihood Ratio Confidence Sets for Sequential Decision Making [51.66638486226482]
We revisit the likelihood-based inference principle and propose to use likelihood ratios to construct valid confidence sequences.
Our method is especially suitable for problems with well-specified likelihoods.
We show how to provably choose the best sequence of estimators and shed light on connections to online convex optimization.
arXiv Detail & Related papers (2023-11-08T00:10:21Z) - Sequential Predictive Two-Sample and Independence Testing [114.4130718687858]
We study the problems of sequential nonparametric two-sample and independence testing.
We build upon the principle of (nonparametric) testing by betting.
arXiv Detail & Related papers (2023-04-29T01:30:33Z) - Representation Disentaglement via Regularization by Causal Identification [3.9160947065896803]
We propose the use of a causal collider structured model to describe the underlying data generative process assumptions in disentangled representation learning.
For this, we propose regularization by identification (ReI), a modular regularization engine designed to align the behavior of large scale generative models with the disentanglement constraints imposed by causal identification.
arXiv Detail & Related papers (2023-02-28T23:18:54Z) - Continuous-Time Modeling of Counterfactual Outcomes Using Neural Controlled Differential Equations [84.42837346400151]
Estimating counterfactual outcomes over time has the potential to unlock personalized healthcare.
Existing causal inference approaches consider regular, discrete-time intervals between observations and treatment decisions.
We propose a controllable simulation environment based on a model of tumor growth for a range of scenarios.
arXiv Detail & Related papers (2022-06-16T17:15:15Z) - Benign Overfitting in Time Series Linear Model with Over-Parameterization [5.68558935178946]
We develop a theory for excess risk of the estimator under multiple dependence types.
We show that the convergence rate of risks with short-memory processes is identical to that of cases with independent data.
arXiv Detail & Related papers (2022-04-18T15:26:58Z) - Self-Certifying Classification by Linearized Deep Assignment [65.0100925582087]
We propose a novel class of deep predictors for classifying metric data on graphs within the PAC-Bayes risk certification paradigm.
Building on the recent PAC-Bayes literature and data-dependent priors, this approach enables learning posterior distributions on the hypothesis space.
arXiv Detail & Related papers (2022-01-26T19:59:14Z) - Accounting for Unobserved Confounding in Domain Generalization [107.0464488046289]
This paper investigates the problem of learning robust, generalizable prediction models from a combination of datasets.
Part of the challenge of learning robust models lies in the influence of unobserved confounders.
We demonstrate the empirical performance of our approach on healthcare data from different modalities.
arXiv Detail & Related papers (2020-07-21T08:18:06Z)