Generalization Bounds for Dependent Data using Online-to-Batch Conversion
- URL: http://arxiv.org/abs/2405.13666v1
- Date: Wed, 22 May 2024 14:07:25 GMT
- Title: Generalization Bounds for Dependent Data using Online-to-Batch Conversion
- Authors: Sagnik Chatterjee, Manuj Mukherjee, Alhad Sethi
- Abstract summary: We show that the generalization error of statistical learners in the dependent data setting matches that of the i.i.d. setting up to a term that depends on the decay rate of the underlying mixing process.
Our proof techniques involve defining a new notion of stability of online learning algorithms based on Wasserstein distances.
- Score: 0.6144680854063935
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we give generalization bounds of statistical learning algorithms trained on samples drawn from a dependent data source, both in expectation and with high probability, using the Online-to-Batch conversion paradigm. We show that the generalization error of statistical learners in the dependent data setting is equivalent to the generalization error of statistical learners in the i.i.d. setting up to a term that depends on the decay rate of the underlying mixing stochastic process and is independent of the complexity of the statistical learner. Our proof techniques involve defining a new notion of stability of online learning algorithms based on Wasserstein distances and employing "near-martingale" concentration bounds for dependent random variables to arrive at appropriate upper bounds for the generalization error of statistical learners trained on dependent data.
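To fix ideas, the bound described above has the following rough shape (a schematic only; the regret term R_n, the constant C, and the mixing-decay term phi(tau) are illustrative placeholders, not the paper's exact statement):

```latex
% Schematic online-to-batch bound for mixing data: an i.i.d.-style term driven
% by the online learner's regret R_n, plus a penalty that depends only on the
% decay rate of the mixing process, not on the complexity of the learner.
\mathbb{E}\big[\operatorname{gen}(\widehat{h}_n)\big]
  \;\lesssim\; \frac{R_n}{n} \;+\; C\,\varphi(\tau)
```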
Related papers
- Geometry-Aware Instrumental Variable Regression [56.16884466478886]
We propose a transport-based IV estimator that takes into account the geometry of the data manifold through data-derivative information.
We provide a simple plug-and-play implementation of our method that performs on par with related estimators in standard settings.
arXiv Detail & Related papers (2024-05-19T17:49:33Z)
- Collaborative Heterogeneous Causal Inference Beyond Meta-analysis [68.4474531911361]
We propose a collaborative inverse propensity score estimator for causal inference with heterogeneous data.
Our method shows significant improvements over the methods based on meta-analysis when heterogeneity increases.
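For reference, the single-source inverse propensity score estimator that such collaborative variants build on takes the textbook form below (this is the generic estimator, not the paper's collaborative version); here T_i is the treatment indicator, Y_i the outcome, and e(X_i) the propensity score.

```latex
% Textbook inverse-propensity-weighted estimate of the average treatment effect.
\hat{\tau}_{\mathrm{IPW}}
  = \frac{1}{n}\sum_{i=1}^{n}
    \left[ \frac{T_i\,Y_i}{e(X_i)} - \frac{(1-T_i)\,Y_i}{1 - e(X_i)} \right]
```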
arXiv Detail & Related papers (2024-04-24T09:04:36Z)
- Beyond Normal: On the Evaluation of Mutual Information Estimators [52.85079110699378]
We show how to construct a diverse family of distributions with known ground-truth mutual information.
We provide guidelines for practitioners on how to select an estimator appropriate to the difficulty of the problem considered.
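A minimal sketch of the underlying idea, assuming the standard facts that a bivariate Gaussian with correlation rho has mutual information -0.5*ln(1 - rho^2) and that MI is invariant under invertible coordinate-wise maps (the specific maps below are arbitrary illustrations, not the paper's benchmark distributions):

```python
import numpy as np

def gaussian_mi(rho):
    # Closed-form mutual information (in nats) of a bivariate Gaussian
    # with correlation coefficient rho.
    return -0.5 * np.log(1.0 - rho ** 2)

rho = 0.8
samples = np.random.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=10_000)
x, y = samples[:, 0], samples[:, 1]

# Applying strictly monotone (hence invertible) maps to each coordinate leaves
# the mutual information unchanged, so the transformed pair still has a known
# ground-truth MI against which estimators can be benchmarked.
x_t, y_t = np.exp(x), y ** 3
print("ground-truth MI (nats):", gaussian_mi(rho))
```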
arXiv Detail & Related papers (2023-06-19T17:26:34Z)
- Online-to-PAC Conversions: Generalization Bounds via Regret Analysis [13.620177497267791]
We construct an online learning game called the "generalization game".
We show that the existence of an online learning algorithm with bounded regret in this game implies a bound on the generalization error of the statistical learning algorithm.
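Schematically, the conversion has the following shape (a paraphrase of the general pattern of such results, not the paper's theorem verbatim): if some strategy achieves regret at most R_n in the generalization game over n rounds, then

```latex
% Bounded regret in the generalization game, plus a martingale concentration
% term M_n, controls the statistical learner's generalization gap.
\operatorname{gen}(\widehat{h}_n) \;\le\; \frac{R_n + M_n}{n}
```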
arXiv Detail & Related papers (2023-05-31T09:15:39Z)
- Training Normalizing Flows from Dependent Data [31.42053454078623]
We propose a likelihood objective of normalizing flows incorporating dependencies between the data points.
We show that respecting dependencies between observations can improve empirical results on both synthetic and real-world data.
arXiv Detail & Related papers (2022-09-29T16:50:34Z)
- Federated Learning with Heterogeneous Data: A Superquantile Optimization Approach [0.0]
We present a federated learning framework that is designed to robustly deliver good performance across individual clients with heterogeneous data.
The proposed approach hinges upon a superquantile-based learning objective that captures the tail statistics of the error.
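A minimal sketch of the superquantile (conditional value-at-risk) statistic that such an objective is built around, computed on hypothetical per-client losses (the function and numbers below are illustrative, not the paper's training procedure):

```python
import numpy as np

def superquantile(losses, alpha=0.9):
    # Empirical superquantile (CVaR_alpha): the mean of the losses lying at or
    # above the alpha-quantile, i.e. a statistic of the loss tail.
    losses = np.asarray(losses, dtype=float)
    threshold = np.quantile(losses, alpha)
    return losses[losses >= threshold].mean()

# Hypothetical per-client average losses from one federated round.
client_losses = [0.21, 0.35, 0.18, 0.90, 0.27, 1.40, 0.33, 0.25]
print("mean loss:           ", np.mean(client_losses))
print("superquantile (0.75):", superquantile(client_losses, alpha=0.75))
```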
arXiv Detail & Related papers (2021-12-17T11:00:23Z)
- Task-agnostic Continual Learning with Hybrid Probabilistic Models [75.01205414507243]
We propose HCL, a Hybrid generative-discriminative approach to Continual Learning for classification.
A normalizing flow is used to learn the data distribution, perform classification, identify task changes, and avoid forgetting.
We demonstrate the strong performance of HCL on a range of continual learning benchmarks such as split-MNIST, split-CIFAR, and SVHN-MNIST.
arXiv Detail & Related papers (2021-06-24T05:19:26Z)
- Statistical Inference for High-Dimensional Linear Regression with Blockwise Missing Data [13.48481978963297]
Blockwise missing data occurs when we integrate multisource or multimodality data where different sources or modalities contain complementary information.
We propose a computationally efficient estimator for the regression coefficient vector based on carefully constructed unbiased estimating equations.
Numerical studies and an application to the Alzheimer's Disease Neuroimaging Initiative data show that the proposed method performs better, and benefits more from unsupervised samples, than existing methods.
arXiv Detail & Related papers (2021-06-07T05:12:42Z)
- Causal learning with sufficient statistics: an information bottleneck approach [3.720546514089338]
Methods extracting causal information from conditional independencies between variables of a system are common.
We capitalize on the fact that the laws governing the generative mechanisms of a system often result in substructures embodied in the generative functional equation of a variable.
We propose to use the Information Bottleneck method, a technique commonly applied for dimensionality reduction, to find underlying sufficient sets of statistics.
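For reference, the generic Information Bottleneck objective referred to above compresses X into a representation T while retaining information about a target Y, with trade-off parameter beta (this is the standard formulation, not the causal adaptation developed in the paper):

```latex
% Information Bottleneck: choose the stochastic encoding p(t|x) that minimizes
% I(X;T) (compression) while keeping I(T;Y) large (relevance), traded off by beta.
\min_{p(t \mid x)} \; I(X;T) \;-\; \beta\, I(T;Y)
```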
arXiv Detail & Related papers (2020-10-12T00:20:01Z)
- Learning while Respecting Privacy and Robustness to Distributional Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model.
The objective is to endow the trained model with robustness against adversarially manipulated input data.
The proposed algorithms offer robustness with little overhead.
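The generic distributionally robust objective referenced here minimizes the worst-case expected loss over an ambiguity set around the empirical distribution (a generic formulation with radius rho and divergence D; the paper's privacy- and robustness-aware variant adds further structure):

```latex
% Distributionally robust training: minimize the worst-case expected loss over
% all distributions Q within distance rho of the empirical distribution P_n.
\min_{\theta} \; \sup_{Q \,:\, D(Q,\widehat{P}_n) \le \rho} \;
  \mathbb{E}_{(x,y) \sim Q}\!\left[ \ell(\theta; x, y) \right]
```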
arXiv Detail & Related papers (2020-07-07T18:25:25Z)
- Stable Prediction via Leveraging Seed Variable [73.9770220107874]
Previous machine learning methods may exploit subtle spurious correlations in training data induced by non-causal variables for prediction.
We propose a conditional independence test based algorithm that separates out causal variables, using a seed variable as a prior, and adopts them for stable prediction.
Our algorithm outperforms state-of-the-art methods for stable prediction.
arXiv Detail & Related papers (2020-06-09T06:56:31Z)