Stochastic Perturbations of Tabular Features for Non-Deterministic
Inference with Automunge
- URL: http://arxiv.org/abs/2202.09248v1
- Date: Fri, 18 Feb 2022 15:24:03 GMT
- Title: Stochastic Perturbations of Tabular Features for Non-Deterministic
Inference with Automunge
- Authors: Nicholas J. Teague
- Abstract summary: Injecting Gaussian noise into training features is well known to have regularization properties.
This paper considers noise injections to numeric or categoric tabular features as passed to inference.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Injecting Gaussian noise into training features is well known to have
regularization properties. This paper considers noise injections to numeric or
categoric tabular features as passed to inference, which translates inference
to a non-deterministic outcome and may have relevance to fairness
considerations, adversarial example protection, or other use cases benefiting
from non-determinism. We offer the Automunge library for tabular preprocessing
as a resource for the practice, which includes options to integrate random
sampling or entropy seeding with the support of quantum circuits for an
improved randomness profile in comparison to pseudo random number generators.
Benchmarking shows that neural networks may demonstrate an improved performance
when a known noise profile is mitigated with corresponding injections to both
training and inference, and that gradient boosting appears to be robust to a
mild noise profile in inference, suggesting that stochastic perturbations could
be integrated into existing data pipelines for prior trained gradient boosting
models.
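The injection scheme the abstract describes can be sketched as follows. This is an illustrative sketch only, not the Automunge API: the helper functions, parameter names (`scale`, `flip_ratio`), and defaults below are assumptions chosen to mirror the idea of perturbing a random subset of numeric or categoric feature entries at inference time.

```python
import numpy as np

def perturb_numeric(X, scale=0.03, flip_ratio=0.03, rng=None):
    """Inject Gaussian noise into a (z-score normalized) numeric feature.

    A random subset of entries (fraction flip_ratio) receives additive
    noise drawn from N(0, scale), making inference non-deterministic.
    """
    rng = np.random.default_rng() if rng is None else rng
    mask = rng.random(X.shape) < flip_ratio       # entries selected for injection
    noise = rng.normal(0.0, scale, size=X.shape)  # Gaussian perturbation
    return X + mask * noise

def perturb_categoric(X, flip_ratio=0.01, categories=None, rng=None):
    """Randomly reassign a subset of categoric entries to a sampled category."""
    rng = np.random.default_rng() if rng is None else rng
    categories = np.unique(X) if categories is None else np.asarray(categories)
    out = X.copy()
    mask = rng.random(X.shape) < flip_ratio
    out[mask] = rng.choice(categories, size=int(mask.sum()))
    return out

# Apply the same noise profile at inference that the model saw in
# training, mirroring the paper's finding for neural networks.
rng = np.random.default_rng()  # stand-in entropy source
X_test = np.random.default_rng(0).standard_normal(10_000)
X_test_noisy = perturb_numeric(X_test, scale=0.03, flip_ratio=0.03, rng=rng)
```

The paper's entropy-seeding option would replace the pseudo-random generator here with randomness seeded from quantum circuits; a NumPy `Generator` stands in for that source in this sketch.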
Related papers
- Robust Gaussian Processes via Relevance Pursuit [17.39376866275623]
We propose and study a GP model that achieves robustness against sparse outliers by inferring data-point-specific noise levels.
We show, surprisingly, that the model can be parameterized such that the associated log marginal likelihood is strongly concave in the data-point-specific noise variances.
arXiv Detail & Related papers (2024-10-31T17:59:56Z)
- Noise-Aware Differentially Private Variational Inference [5.4619385369457225]
Differential privacy (DP) provides robust privacy guarantees for statistical inference, but the noise it requires can lead to unreliable results and biases in downstream applications.
We propose a novel method for noise-aware approximate Bayesian inference based on gradient variational inference.
We also propose a more accurate evaluation method for noise-aware posteriors.
arXiv Detail & Related papers (2024-10-25T08:18:49Z)
- Noise-mitigated randomized measurements and self-calibrating shadow estimation [0.0]
We introduce an error-mitigated method of randomized measurements, giving rise to a robust shadow estimation procedure.
On the practical side, we show that error mitigation and shadow estimation can be carried out using the same session of quantum experiments.
arXiv Detail & Related papers (2024-03-07T18:53:56Z)
- Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z)
- May the Noise be with you: Adversarial Training without Adversarial Examples [3.4673556247932225]
We investigate the question: Can we obtain adversarially-trained models without training on adversarial examples?
Our proposed approach embeds Gaussian noise within the layers of the NN model at training time.
Our work contributes adversarially trained networks using a completely different approach, with empirically similar robustness to adversarial training.
arXiv Detail & Related papers (2023-12-12T08:22:28Z)
- User-defined Event Sampling and Uncertainty Quantification in Diffusion Models for Physical Dynamical Systems [49.75149094527068]
We show that diffusion models can be adapted to make predictions and provide uncertainty quantification for chaotic dynamical systems.
We develop a probabilistic approximation scheme for the conditional score function which converges to the true distribution as the noise level decreases.
We are able to sample conditionally on nonlinear user-defined events at inference time, and the samples match data statistics even when sampling from the tails of the distribution.
arXiv Detail & Related papers (2023-06-13T03:42:03Z)
- The Optimal Noise in Noise-Contrastive Learning Is Not What You Think [80.07065346699005]
We show that deviating from the common assumption that the noise distribution should match the data distribution can actually lead to better statistical estimators.
In particular, the optimal noise distribution is different from the data's and even from a different family.
arXiv Detail & Related papers (2022-03-02T13:59:20Z)
- Deconfounding Scores: Feature Representations for Causal Effect Estimation with Weak Overlap [140.98628848491146]
We introduce deconfounding scores, which induce better overlap without biasing the target of estimation.
We show that deconfounding scores satisfy a zero-covariance condition that is identifiable in observed data.
In particular, we show that this technique could be an attractive alternative to standard regularizations.
arXiv Detail & Related papers (2021-04-12T18:50:11Z)
- Sampling-free Variational Inference for Neural Networks with Multiplicative Activation Noise [51.080620762639434]
We propose a more efficient parameterization of the posterior approximation for sampling-free variational inference.
Our approach yields competitive results for standard regression problems and scales well to large-scale image classification tasks.
arXiv Detail & Related papers (2021-03-15T16:16:18Z)
- RDP-GAN: A Rényi-Differential Privacy based Generative Adversarial Network [75.81653258081435]
Generative adversarial network (GAN) has attracted increasing attention recently owing to its impressive ability to generate realistic samples with high privacy protection.
However, when GANs are applied on sensitive or private training examples, such as medical or financial records, it is still probable to divulge individuals' sensitive and private information.
We propose a Rényi-differentially private GAN (RDP-GAN), which achieves differential privacy (DP) in a GAN by carefully adding random noises on the value of the loss function during training.
arXiv Detail & Related papers (2020-07-04T09:51:02Z)
- Generalized Gumbel-Softmax Gradient Estimator for Various Discrete Random Variables [16.643346012854156]
Estimating the gradients of stochastic nodes is one of the crucial research questions in the deep generative modeling community.
This paper proposes a general version of the Gumbel-Softmax estimator with continuous relaxation.
arXiv Detail & Related papers (2020-03-04T01:13:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.