Distribution-free Deviation Bounds and The Role of Domain Knowledge in Learning via Model Selection with Cross-validation Risk Estimation
- URL: http://arxiv.org/abs/2303.08777v2
- Date: Fri, 24 May 2024 14:19:27 GMT
- Title: Distribution-free Deviation Bounds and The Role of Domain Knowledge in Learning via Model Selection with Cross-validation Risk Estimation
- Authors: Diego Marcondes, Cláudia Peixoto
- Abstract summary: Cross-validation techniques for risk estimation and model selection are widely used in statistics and machine learning.
This paper presents learning via model selection with cross-validation risk estimation as a general systematic learning framework.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Cross-validation techniques for risk estimation and model selection are widely used in statistics and machine learning. However, the theoretical properties of learning via model selection with cross-validation risk estimation are poorly understood relative to its widespread use. In this context, this paper presents learning via model selection with cross-validation risk estimation as a general systematic learning framework within classical statistical learning theory and establishes distribution-free deviation bounds in terms of VC dimension, giving detailed proofs of the results and considering both bounded and unbounded loss functions. In particular, we investigate how the generalization of learning via model selection may be increased by modeling the collection of candidate models. We define the Learning Spaces as a class of candidate models in which the partial order by inclusion reflects the models' complexities, and we formalize a manner of defining them based on domain knowledge. We illustrate this modeling in a worst-case scenario of learning a classifier with finite domain and a typical scenario of linear regression. Through theoretical insights and concrete examples, we aim to provide guidance on selecting the family of candidate models based on domain knowledge to increase generalization.
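The procedure named in the abstract, model selection with cross-validation risk estimation over a family of nested candidate models, can be illustrated with a minimal sketch for a classifier on a finite domain. Everything concrete here is an assumption made for illustration and is not taken from the paper: the 8-point domain, the three candidate models (classifiers constant on the blocks of a partition, ordered by refinement), the 10% label noise, and the helper names `fit_block_majority`, `cv_risk`, and `select_model`.

```python
import random
from collections import Counter, defaultdict


def fit_block_majority(sample, partition):
    """ERM under 0-1 loss among classifiers constant on each block of `partition`."""
    block_of = {x: i for i, block in enumerate(partition) for x in block}
    votes = defaultdict(Counter)
    for x, y in sample:
        votes[block_of[x]][y] += 1
    # Blocks with no training points fall back to the overall majority label.
    overall = Counter(y for _, y in sample).most_common(1)[0][0]
    labels = {
        i: (votes[i].most_common(1)[0][0] if votes[i] else overall)
        for i in range(len(partition))
    }
    return lambda x: labels[block_of[x]]


def empirical_risk(h, sample):
    """Fraction of points in `sample` that hypothesis `h` misclassifies."""
    return sum(h(x) != y for x, y in sample) / len(sample)


def cv_risk(sample, partition, k=5):
    """k-fold cross-validation estimate of the risk of ERM in one candidate model."""
    folds = [sample[i::k] for i in range(k)]
    risks = []
    for i in range(k):
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        h = fit_block_majority(train, partition)
        risks.append(empirical_risk(h, folds[i]))
    return sum(risks) / k


def select_model(sample, candidates, k=5):
    """Return the candidate with the smallest estimated risk, plus all estimates."""
    scores = {name: cv_risk(sample, part, k) for name, part in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical data: domain {0, ..., 7}, true label 1 iff x >= 4, 10% label noise.
rng = random.Random(0)
sample = []
for _ in range(200):
    x = rng.randrange(8)
    y = int(x >= 4)
    if rng.random() < 0.1:
        y = 1 - y
    sample.append((x, y))

# Hypothetical candidate models: a chain of partitions, from coarsest to finest.
candidates = {
    "constant": [list(range(8))],
    "halves": [[0, 1, 2, 3], [4, 5, 6, 7]],
    "singletons": [[x] for x in range(8)],
}

best, scores = select_model(sample, candidates)
final_h = fit_block_majority(sample, candidates[best])  # refit on the whole sample
print(best, scores)
```

In this sketch the selected model is refit on the full sample after selection. The choice of candidate partitions stands in for domain knowledge: a chain that contains the "halves" partition lets a simple model capture the target, whereas a family containing only the finest partition would force one parameter per domain point.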
Related papers
- Improving Heterogeneous Model Reuse by Density Estimation [105.97036205113258]
This paper studies multiparty learning, aiming to learn a model using the private data of different participants.
Model reuse is a promising solution for multiparty learning, assuming that a local model has been trained for each party.
arXiv Detail & Related papers (2023-05-23T09:46:54Z) - Transfer learning with affine model transformation [18.13383101189326]
This paper presents a general class of transfer learning regression called affine model transfer.
It is shown that the affine model transfer broadly encompasses various existing methods, including the most common procedure based on neural feature extractors.
arXiv Detail & Related papers (2022-10-18T10:50:24Z) - Planning with Diffusion for Flexible Behavior Synthesis [125.24438991142573]
We consider what it would look like to fold as much of the trajectory optimization pipeline as possible into the modeling problem.
The core of our technical approach lies in a diffusion probabilistic model that plans by iteratively denoising trajectories.
arXiv Detail & Related papers (2022-05-20T07:02:03Z) - Prototypical Model with Novel Information-theoretic Loss Function for
Generalized Zero Shot Learning [3.870962269034544]
Generalized zero shot learning (GZSL) is still a technical challenge of deep learning.
We address the quantification of the knowledge transfer and semantic relation from an information-theoretic viewpoint.
We propose three information-theoretic loss functions for deterministic GZSL model.
arXiv Detail & Related papers (2021-12-06T16:01:46Z) - Causality and Generalizability: Identifiability and Learning Methods [0.0]
This thesis contributes to the research areas concerning the estimation of causal effects, causal structure learning, and distributionally robust prediction methods.
We present novel and consistent linear and non-linear causal effects estimators in instrumental variable settings that employ data-dependent mean squared prediction error regularization.
We propose a general framework for distributional robustness with respect to intervention-induced distributions.
arXiv Detail & Related papers (2021-10-04T13:12:11Z) - Learning Bias-Invariant Representation by Cross-Sample Mutual Information Minimization [77.8735802150511]
We propose a cross-sample adversarial debiasing (CSAD) method to remove the bias information misused by the target task.
The correlation measurement plays a critical role in adversarial debiasing and is conducted by a cross-sample neural mutual information estimator.
We conduct thorough experiments on publicly available datasets to validate the advantages of the proposed method over state-of-the-art approaches.
arXiv Detail & Related papers (2021-08-11T21:17:02Z) - Characterizing Fairness Over the Set of Good Models Under Selective Labels [69.64662540443162]
We develop a framework for characterizing predictive fairness properties over the set of models that deliver similar overall performance.
We provide tractable algorithms to compute the range of attainable group-level predictive disparities.
We extend our framework to address the empirically relevant challenge of selectively labelled data.
arXiv Detail & Related papers (2021-01-02T02:11:37Z) - On Statistical Efficiency in Learning [37.08000833961712]
We address the challenge of model selection to strike a balance between model fitting and model complexity.
We propose an online algorithm that sequentially expands the model complexity to enhance selection stability and reduce cost.
Experimental studies show that the proposed method has desirable predictive power and significantly less computational cost than some popular methods.
arXiv Detail & Related papers (2020-12-24T16:08:29Z) - Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z) - Estimating Generalization under Distribution Shifts via Domain-Invariant Representations [75.74928159249225]
We use a set of domain-invariant predictors as a proxy for the unknown, true target labels.
The error of the resulting risk estimate depends on the target risk of the proxy model.
arXiv Detail & Related papers (2020-07-06T17:21:24Z) - PAC Bounds for Imitation and Model-based Batch Learning of Contextual Markov Decision Processes [31.83144400718369]
We consider the problem of batch multi-task reinforcement learning with observed context descriptors, motivated by its application to personalized medical treatment.
We study two general classes of learning algorithms: direct policy learning (DPL), an imitation-learning based approach which learns from expert trajectories, and model-based learning.
arXiv Detail & Related papers (2020-06-11T11:57:08Z)