Risk Minimization from Adaptively Collected Data: Guarantees for
Supervised and Policy Learning
- URL: http://arxiv.org/abs/2106.01723v1
- Date: Thu, 3 Jun 2021 09:50:13 GMT
- Title: Risk Minimization from Adaptively Collected Data: Guarantees for
Supervised and Policy Learning
- Authors: Aurélien Bibaut and Antoine Chambaz and Maria Dimakopoulou and
Nathan Kallus and Mark van der Laan
- Abstract summary: Empirical risk minimization (ERM) is the workhorse of machine learning, but its model-agnostic guarantees can fail when we use adaptively collected data.
We study a generic importance sampling weighted ERM algorithm for using adaptively collected data to minimize the average of a loss function over a hypothesis class.
For policy learning, we provide rate-optimal regret guarantees that close an open gap in the existing literature whenever exploration decays to zero.
- Score: 57.88785630755165
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Empirical risk minimization (ERM) is the workhorse of machine learning,
whether for classification and regression or for off-policy policy learning,
but its model-agnostic guarantees can fail when we use adaptively collected
data, such as the result of running a contextual bandit algorithm. We study a
generic importance sampling weighted ERM algorithm for using adaptively
collected data to minimize the average of a loss function over a hypothesis
class and provide first-of-their-kind generalization guarantees and fast
convergence rates. Our results are based on a new maximal inequality that
carefully leverages the importance sampling structure to obtain rates with the
right dependence on the exploration rate in the data. For regression, we
provide fast rates that leverage the strong convexity of squared-error loss.
For policy learning, we provide rate-optimal regret guarantees that close an
open gap in the existing literature whenever exploration decays to zero, as is
the case for bandit-collected data. An empirical investigation validates our
theory.
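To make the weighted-ERM recipe concrete, here is a minimal NumPy sketch under assumed toy conditions: an epsilon-greedy logger with a decaying exploration rate, two arms, a linear outcome model, and a uniform reference policy. The data-generating setup, the epsilon schedule, and names such as `ipw_value` are illustrative assumptions, not the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy adaptively collected data (illustrative, not the paper's code):
# an epsilon-greedy logger with decaying exploration picks arm A_t given
# context X_t and logs its propensity g_t(A_t | X_t).
T, d = 5000, 2
X = rng.normal(size=(T, d))
beta = np.array([[1.0, -1.0], [-0.5, 2.0]])           # true per-arm coefficients

eps = np.maximum(0.05, (1.0 + np.arange(T)) ** -0.5)  # decaying exploration rate
greedy = np.argmax(X @ beta.T, axis=1)                # logger's preferred arm
explore = rng.random(T) < eps
A = np.where(explore, rng.integers(0, 2, size=T), greedy)
g = np.where(A == greedy, 1.0 - eps / 2, eps / 2)     # logged propensities
Y = np.einsum("td,td->t", X, beta[A]) + rng.normal(scale=0.5, size=T)

# Importance-sampling weighted ERM for regression: weights
# w_t = pi_ref(A_t | X_t) / g_t(A_t | X_t), with a uniform reference policy,
# recenter the empirical squared-error risk on a fixed design, so the
# minimizer is just weighted least squares per arm.
w = 0.5 / g
for a in (0, 1):
    m = A == a
    Xa, wa = X[m], w[m]
    beta_hat = np.linalg.solve(Xa.T @ (wa[:, None] * Xa), Xa.T @ (wa * Y[m]))
    print(f"arm {a}: true {beta[a]}, weighted-ERM estimate {beta_hat.round(3)}")

# The same weighting covers policy learning: for a deterministic candidate
# policy pi, the loss -Y_t * 1{pi(X_t) = A_t} weighted by 1 / g_t gives an
# unbiased value estimate, and ERM selects the highest-value policy.
def ipw_value(pi_arms: np.ndarray) -> float:
    return float(np.mean((pi_arms == A) * Y / g))

print("IPW value of 'always arm 0':", round(ipw_value(np.zeros(T, dtype=int)), 3))
print("IPW value of the oracle policy:", round(ipw_value(greedy), 3))
```

The decaying exploration schedule mirrors the regime the paper targets: as exploration decays to zero, the propensities of non-greedy arms shrink and the importance weights grow, and obtaining rates with the right dependence on this exploration rate is precisely what the paper's new maximal inequality addresses.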
Related papers
- DRoP: Distributionally Robust Pruning [11.930434318557156]
We conduct the first systematic study of the impact of data pruning on the classification bias of trained models.
We propose DRoP, a distributionally robust approach to pruning and empirically demonstrate its performance on standard computer vision benchmarks.
arXiv Detail & Related papers (2024-04-08T14:55:35Z)
- Bi-Level Offline Policy Optimization with Limited Exploration [1.8130068086063336]
We study offline reinforcement learning (RL) which seeks to learn a good policy based on a fixed, pre-collected dataset.
We propose a bi-level structured policy optimization algorithm that models a hierarchical interaction between the policy (upper level) and the value function (lower level).
We evaluate our model using a blend of synthetic, benchmark, and real-world datasets for offline RL, showing that it performs competitively with state-of-the-art methods.
arXiv Detail & Related papers (2023-10-10T02:45:50Z)
- A Generalized Unbiased Risk Estimator for Learning with Augmented Classes [70.20752731393938]
Given unlabeled data, an unbiased risk estimator (URE) can be derived, which can be minimized for learning with augmented classes (LAC) with theoretical guarantees.
We propose a generalized URE that can be equipped with arbitrary loss functions while maintaining the theoretical guarantees.
arXiv Detail & Related papers (2023-06-12T06:52:04Z)
- Self-Certifying Classification by Linearized Deep Assignment [65.0100925582087]
We propose a novel class of deep predictors for classifying metric data on graphs within the PAC-Bayes risk certification paradigm.
Building on the recent PAC-Bayes literature and data-dependent priors, this approach enables learning posterior distributions on the hypothesis space.
arXiv Detail & Related papers (2022-01-26T19:59:14Z)
- RATT: Leveraging Unlabeled Data to Guarantee Generalization [96.08979093738024]
We introduce a method that leverages unlabeled data to produce generalization bounds.
We prove that our bound is valid for 0-1 empirical risk minimization.
This work provides practitioners with an option for certifying the generalization of deep nets even when unseen labeled data is unavailable.
arXiv Detail & Related papers (2021-05-01T17:05:29Z)
- Continual Learning for Fake Audio Detection [62.54860236190694]
This paper proposes Detecting Fake Without Forgetting, a continual-learning-based method that lets the model learn new spoofing attacks incrementally.
Experiments are conducted on the ASVspoof 2019 dataset.
arXiv Detail & Related papers (2021-04-15T07:57:05Z)
- Learning from Similarity-Confidence Data [94.94650350944377]
We investigate a novel weakly supervised learning problem of learning from similarity-confidence (Sconf) data.
We propose an unbiased estimator of the classification risk that can be calculated from only Sconf data and show that the estimation error bound achieves the optimal convergence rate.
arXiv Detail & Related papers (2021-02-13T07:31:16Z)
- Active Deep Learning on Entity Resolution by Risk Sampling [5.219701379581547]
Active Learning (AL) presents itself as a feasible solution that focuses on data deemed useful for model training.
We propose a novel AL approach based on risk sampling for entity resolution (ER).
Based on the core-set characterization for AL, we theoretically derive an optimization model which aims to minimize core-set loss with non-uniform continuity.
We empirically verify the efficacy of the proposed approach on real data by a comparative study.
arXiv Detail & Related papers (2020-12-23T20:38:25Z)