Credal Learning Theory
- URL: http://arxiv.org/abs/2402.00957v4
- Date: Wed, 23 Oct 2024 15:40:23 GMT
- Title: Credal Learning Theory
- Authors: Michele Caprio, Maryam Sultana, Eleni Elia, Fabio Cuzzolin
- Abstract summary: We lay the foundations for a 'credal' theory of learning, using convex sets of probabilities to model the variability in the data-generating distribution.
Bounds are derived for the case of finite hypotheses spaces, as well as infinite model spaces, which directly generalize classical results.
- Score: 4.64390130376307
- Abstract: Statistical learning theory is the foundation of machine learning, providing theoretical bounds for the risk of models learned from a (single) training set, assumed to issue from an unknown probability distribution. In actual deployment, however, the data distribution may (and often does) vary, causing domain adaptation/generalization issues. In this paper we lay the foundations for a `credal' theory of learning, using convex sets of probabilities (credal sets) to model the variability in the data-generating distribution. Such credal sets, we argue, may be inferred from a finite sample of training sets. Bounds are derived for the case of finite hypotheses spaces (both assuming realizability or not), as well as infinite model spaces, which directly generalize classical results.
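For orientation, the kind of classical finite-hypothesis result that the abstract says is generalized can be stated as follows. This is the textbook realizable-case PAC bound, not the paper's credal bound; the symbols $\mathcal{H}$, $n$, $\delta$, $P$, $L_P$ and $\hat{h}$ are notation assumed here and are not taken from the paper.

```latex
% Classical realizable-case bound for a finite hypothesis class (textbook statement).
% Assumed notation: n i.i.d. samples drawn from a single distribution P,
% \hat{h} any hypothesis in \mathcal{H} with zero empirical risk,
% L_P the expected 0-1 risk under P, and \delta in (0,1) the failure probability.
\[
  \Pr\!\left[\, L_P(\hat{h}) \le \frac{\ln|\mathcal{H}| + \ln(1/\delta)}{n} \,\right] \ge 1 - \delta .
\]
```

Only one distribution $P$ appears in this statement; the setting described in the abstract replaces it with a credal set of candidate distributions.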
Related papers
- Which distribution were you sampled from? Towards a more tangible conception of data [7.09435109588801]
We argue that the standard framework for machine learning is not always a good model.
We suggest an alternative framework that focuses on finite populations rather than abstract distributions.
arXiv Detail & Related papers (2024-07-24T16:17:14Z)
- A Mathematical Framework for Learning Probability Distributions [0.0]
Generative modeling and density estimation have become immensely popular subjects in recent years.
This paper provides a mathematical framework such that all the well-known models can be derived based on simple principles.
In particular, we prove that these models enjoy implicit regularization during training, so that the generalization error at early-stopping avoids the curse of dimensionality.
arXiv Detail & Related papers (2022-12-22T04:41:45Z)
- First Steps Toward Understanding the Extrapolation of Nonlinear Models to Unseen Domains [35.76184529520015]
This paper makes some initial steps towards analyzing the extrapolation of nonlinear models for structured domain shift.
We prove that the family of nonlinear models of the form $f(x)=\sum_i f_i(x_i)$ can extrapolate to unseen distributions.
arXiv Detail & Related papers (2022-11-21T18:41:19Z)
- Causal Discovery in Heterogeneous Environments Under the Sparse Mechanism Shift Hypothesis [7.895866278697778]
Machine learning approaches commonly rely on the assumption of independent and identically distributed (i.i.d.) data.
In reality, this assumption is almost always violated due to distribution shifts between environments.
We propose the Mechanism Shift Score (MSS), a score-based approach amenable to various empirical estimators.
arXiv Detail & Related papers (2022-06-04T15:39:30Z)
- Fairness Transferability Subject to Bounded Distribution Shift [5.62716254065607]
Given an algorithmic predictor that is "fair" on some source distribution, will it still be fair on an unknown target distribution that differs from the source within some bound?
We study the transferability of statistical group fairness for machine learning predictors subject to bounded distribution shifts.
arXiv Detail & Related papers (2022-05-31T22:16:44Z)
- On some theoretical limitations of Generative Adversarial Networks [77.34726150561087]
It is a general assumption that GANs can generate any probability distribution.
We provide a new result based on Extreme Value Theory showing that GANs cannot generate heavy-tailed distributions.
arXiv Detail & Related papers (2021-10-21T06:10:38Z)
- Instance-Based Neural Dependency Parsing [56.63500180843504]
We develop neural models that possess an interpretable inference process for dependency parsing.
Our models adopt instance-based inference, where dependency edges are extracted and labeled by comparing them to edges in a training set.
arXiv Detail & Related papers (2021-09-28T05:30:52Z)
- Theoretical Analysis of Self-Training with Deep Networks on Unlabeled Data [48.4779912667317]
Self-training algorithms have been very successful for learning with unlabeled data using neural networks.
This work provides a unified theoretical analysis of self-training with deep networks for semi-supervised learning, unsupervised domain adaptation, and unsupervised learning.
arXiv Detail & Related papers (2020-10-07T19:43:55Z)
- Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)
- Contextuality scenarios arising from networks of stochastic processes [68.8204255655161]
An empirical model is said to be contextual if its distributions cannot be obtained by marginalizing a joint distribution over X.
We present a different and classical source of contextual empirical models: the interaction among many processes.
The statistical behavior of the network in the long run makes the empirical model generically contextual and even strongly contextual.
arXiv Detail & Related papers (2020-06-22T16:57:52Z)
- GANs with Conditional Independence Graphs: On Subadditivity of Probability Divergences [70.30467057209405]
Generative Adversarial Networks (GANs) are modern methods to learn the underlying distribution of a data set.
GANs are designed in a model-free fashion where no additional information about the underlying distribution is available.
We propose a principled design of a model-based GAN that uses a set of simple discriminators on the neighborhoods of the Bayes-net/MRF.
arXiv Detail & Related papers (2020-03-02T04:31:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.