Deviation bound for non-causal machine learning
- URL: http://arxiv.org/abs/2009.08905v2
- Date: Fri, 19 Mar 2021 16:55:42 GMT
- Title: Deviation bound for non-causal machine learning
- Authors: Rémy Garnier and Raphaël Langhendries
- Abstract summary: Concentration inequalities are widely used for analyzing machine learning algorithms.
Current concentration inequalities cannot be applied to some of the most popular deep neural networks.
In this paper, a framework for modeling non-causal random fields is provided and a Hoeffding-type concentration inequality is obtained for this framework.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Concentration inequalities are widely used for analyzing machine learning
algorithms. However, current concentration inequalities cannot be applied to
some of the most popular deep neural networks, notably in natural language
processing. This is mostly due to the non-causal nature of the data involved,
in the sense that each data point depends on its neighboring data points. In
this paper, a framework for modeling non-causal random fields is provided and a
Hoeffding-type concentration inequality is obtained for this framework. The
proof of this result relies on a local approximation of the non-causal random
field by a function of a finite number of i.i.d. random variables.
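For reference, the classical Hoeffding inequality for independent bounded random variables X_1, ..., X_n with a_i <= X_i <= b_i is given below in LaTeX; the "Hoeffding-type" bound in the abstract is an inequality of this exponential form, established instead for dependent, non-causal random fields (the precise statement in the paper differs).

% Classical Hoeffding inequality, shown only as a reference point;
% the paper's result concerns non-causal random fields, where the
% independence assumption used here does not hold.
\[
\mathbb{P}\!\left( \Bigl| \sum_{i=1}^{n} \bigl( X_i - \mathbb{E}[X_i] \bigr) \Bigr| \ge t \right)
\;\le\; 2 \exp\!\left( - \frac{2 t^2}{\sum_{i=1}^{n} (b_i - a_i)^2} \right),
\qquad t > 0.
\]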
Related papers
- Universal approximation property of Banach space-valued random feature models including random neural networks [3.3379026542599934]
We introduce a Banach space-valued extension of random feature learning.
Because the feature maps are randomly initialized, only the linear readout needs to be trained (see the sketch after this list).
We derive approximation rates and an explicit algorithm to learn an element of the given Banach space.
arXiv Detail & Related papers (2023-12-13T11:27:15Z) - Learning Linear Causal Representations from Interventions under General
Nonlinear Mixing [52.66151568785088]
We prove strong identifiability results given unknown single-node interventions without access to the intervention targets.
This is the first instance of causal identifiability from non-paired interventions for deep neural network embeddings.
arXiv Detail & Related papers (2023-06-04T02:32:12Z) - Rough Randomness and its Application [0.0]
This research aims to capture a variety of rough processes, construct related models, and explore the validity of other machine learning algorithms.
A class of rough random functions, termed large-minded reasoners, has a central role in these.
arXiv Detail & Related papers (2023-03-21T12:22:33Z) - Learning to Bound Counterfactual Inference in Structural Causal Models
from Observational and Randomised Data [64.96984404868411]
We derive a likelihood characterisation for the overall data that leads us to extend a previous EM-based algorithm.
The new algorithm learns to approximate the unidentifiable region of model parameters from such mixed data sources.
It delivers interval approximations to counterfactual results, which collapse to points in the identifiable case.
arXiv Detail & Related papers (2022-12-06T12:42:11Z) - Mean-field neural networks: learning mappings on Wasserstein space [0.0]
We study the machine learning task for models with operators mapping between the Wasserstein space of probability measures and a space of functions.
Two classes of neural networks are proposed to learn so-called mean-field functions.
We present different algorithms relying on mean-field neural networks for solving time-dependent mean-field problems.
arXiv Detail & Related papers (2022-10-27T05:11:42Z) - Posterior and Computational Uncertainty in Gaussian Processes [52.26904059556759]
Gaussian processes scale prohibitively with the size of the dataset.
Many approximation methods have been developed, which inevitably introduce approximation error.
This additional source of uncertainty, due to limited computation, is entirely ignored when using the approximate posterior.
We develop a new class of methods that provides consistent estimation of the combined uncertainty arising from both the finite number of data observed and the finite amount of computation expended.
arXiv Detail & Related papers (2022-05-30T22:16:25Z) - The Optimal Noise in Noise-Contrastive Learning Is Not What You Think [80.07065346699005]
We show that deviating from this assumption (that the noise distribution should match the data distribution) can actually lead to better statistical estimators.
In particular, the optimal noise distribution is different from the data's and even from a different family.
arXiv Detail & Related papers (2022-03-02T13:59:20Z) - The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU network with standard Gaussian weights and uniformly distributed biases can solve this separation problem with high probability.
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
arXiv Detail & Related papers (2021-07-31T10:25:26Z) - Decentralized Local Stochastic Extra-Gradient for Variational
Inequalities [125.62877849447729]
We consider distributed variational inequalities (VIs) on domains where the problem data is heterogeneous (non-IID) and distributed across many devices.
We make a very general assumption on the computational network that covers the settings of fully decentralized calculations.
We theoretically analyze its convergence rate in the strongly-monotone, monotone, and non-monotone settings.
arXiv Detail & Related papers (2021-06-15T17:45:51Z) - Hyperdimensional Computing for Efficient Distributed Classification with
Randomized Neural Networks [5.942847925681103]
We study distributed classification, which can be employed in situations where data cannot be stored at a central location or shared.
We propose a more efficient solution for distributed classification by making use of a lossy compression approach applied when sharing the local classifiers with other agents.
arXiv Detail & Related papers (2021-06-02T01:33:56Z)
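A minimal, scalar-valued sketch of the random feature idea referenced in the Banach space-valued random feature entry above: the feature map is drawn at random and frozen, and only the linear readout is fitted. This is an illustration under simplifying assumptions (NumPy, random Fourier-type cosine features, a ridge-regression readout, toy one-dimensional data); it is not the paper's Banach space-valued construction, and all names and hyperparameters here are hypothetical.

import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: learn f(x) = sin(3x) on [-1, 1].
x_train = rng.uniform(-1.0, 1.0, size=(200, 1))
y_train = np.sin(3.0 * x_train[:, 0])

# Randomly initialized feature map: weights and biases are drawn once and never trained.
n_features = 300
W = rng.normal(0.0, 3.0, size=(1, n_features))
b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)

def features(x):
    # Random Fourier-type features; only the linear readout on top of them is learned.
    return np.cos(x @ W + b)

# Fit the linear readout by ridge regression (closed form).
Phi = features(x_train)
lam = 1e-3
readout = np.linalg.solve(Phi.T @ Phi + lam * np.eye(n_features), Phi.T @ y_train)

# Predictions on a small test grid should roughly track sin(3x).
x_test = np.linspace(-1.0, 1.0, 5).reshape(-1, 1)
print(features(x_test) @ readout)

Because only the readout is trained, fitting reduces to a linear least-squares problem; this is the property that approximation results for random feature models typically build on.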
This list is automatically generated from the titles and abstracts of the papers on this site.