Stochastic Perturbations of Tabular Features for Non-Deterministic
Inference with Automunge
- URL: http://arxiv.org/abs/2202.09248v1
- Date: Fri, 18 Feb 2022 15:24:03 GMT
- Authors: Nicholas J. Teague
- Abstract summary: Injecting Gaussian noise into training features is well known to have regularization properties.
This paper considers noise injections to numeric or categoric tabular features as passed to inference.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Injecting Gaussian noise into training features is well known to have
regularization properties. This paper considers noise injections to numeric or
categoric tabular features as passed to inference, which translates inference
to a non-deterministic outcome and may have relevance to fairness
considerations, adversarial example protection, or other use cases benefiting
from non-determinism. We offer the Automunge library for tabular preprocessing
as a resource for the practice, which includes options to integrate random
sampling or entropy seeding with the support of quantum circuits for an
improved randomness profile in comparison to pseudo random number generators.
Benchmarking shows that neural networks may demonstrate an improved performance
when a known noise profile is mitigated with corresponding injections to both
training and inference, and that gradient boosting appears to be robust to a
mild noise profile in inference, suggesting that stochastic perturbations could
be integrated into existing data pipelines for prior trained gradient boosting
models.
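The core practice the abstract describes, perturbing numeric tabular features with Gaussian noise at both training and inference time, can be sketched in plain NumPy. This is an illustrative sketch only: the function name and the `sigma_ratio` parameter are assumptions for demonstration, not the Automunge API (which offers its own noise-injection options and entropy-seeding support).

```python
import numpy as np

def inject_gaussian_noise(X, sigma_ratio=0.03, rng=None):
    """Perturb numeric tabular features with Gaussian noise scaled to
    each column's standard deviation (illustrative sketch, not the
    Automunge API)."""
    rng = np.random.default_rng() if rng is None else rng
    X = np.asarray(X, dtype=float)
    col_std = X.std(axis=0, keepdims=True)            # per-feature scale
    noise = rng.normal(0.0, 1.0, size=X.shape) * sigma_ratio * col_std
    return X + noise

# Applying the same noise profile to features passed to inference makes
# each inference call non-deterministic: two calls see different inputs.
X_test = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]])
y1 = inject_gaussian_noise(X_test, rng=np.random.default_rng(0))
y2 = inject_gaussian_noise(X_test, rng=np.random.default_rng(1))
```

Scaling the noise to each column's standard deviation keeps the perturbation proportionate across features with very different magnitudes, which matters for heterogeneous tabular data.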
Related papers
- Noise-mitigated randomized measurements and self-calibrating shadow estimation
We introduce an error-mitigated method of randomized measurements, giving rise to a robust shadow estimation procedure.
On the practical side, we show that error mitigation and shadow estimation can be carried out using the same session of quantum experiments.
arXiv Detail & Related papers (2024-03-07T18:53:56Z)
- Risk-Sensitive Diffusion for Perturbation-Robust Optimization
We show that noisy samples induce a different objective function than the score-based one, which would wrongly optimize the model.
We introduce the risk-sensitive SDE, a type of stochastic differential equation (SDE) parameterized by the risk vector.
We prove that zero instability measure is only achievable in the case where noisy samples are caused by Gaussian perturbation.
arXiv Detail & Related papers (2024-02-03T08:41:51Z)
- Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z)
- May the Noise be with you: Adversarial Training without Adversarial Examples
We investigate the question: Can we obtain adversarially-trained models without training on adversarial examples?
Our proposed approach incorporates inherent stochasticity by embedding Gaussian noise within the layers of the NN model at training time.
Our work contributes adversarially trained networks using a completely different approach, with empirically similar robustness to adversarial training.
arXiv Detail & Related papers (2023-12-12T08:22:28Z)
- User-defined Event Sampling and Uncertainty Quantification in Diffusion Models for Physical Dynamical Systems
We show that diffusion models can be adapted to make predictions and provide uncertainty quantification for chaotic dynamical systems.
We develop a probabilistic approximation scheme for the conditional score function which converges to the true distribution as the noise level decreases.
We are able to sample conditionally on nonlinear user-defined events at inference time, and the samples match data statistics even when drawn from the tails of the distribution.
arXiv Detail & Related papers (2023-06-13T03:42:03Z)
- Modeling Temporal Data as Continuous Functions with Stochastic Process Diffusion
Temporal data can be viewed as discretized measurements of the underlying function.
To build a generative model for such data we have to model the process that governs it.
We propose a solution by defining the denoising diffusion model in the function space.
arXiv Detail & Related papers (2022-11-04T17:02:01Z)
- The Optimal Noise in Noise-Contrastive Learning Is Not What You Think
We show that deviating from the common assumption that the noise distribution should match the data can actually lead to better statistical estimators.
In particular, the optimal noise distribution is different from the data's and even from a different family.
arXiv Detail & Related papers (2022-03-02T13:59:20Z)
- Sampling-free Variational Inference for Neural Networks with Multiplicative Activation Noise
We propose a more efficient parameterization of the posterior approximation for sampling-free variational inference.
Our approach yields competitive results for standard regression problems and scales well to large-scale image classification tasks.
arXiv Detail & Related papers (2021-03-15T16:16:18Z)
- RDP-GAN: A Rényi-Differential Privacy based Generative Adversarial Network
Generative adversarial network (GAN) has attracted increasing attention recently owing to its impressive ability to generate realistic samples with high privacy protection.
However, when GANs are applied on sensitive or private training examples, such as medical or financial records, it is still probable to divulge individuals' sensitive and private information.
We propose a Rényi-differentially private GAN (RDP-GAN), which achieves differential privacy (DP) in a GAN by carefully adding random noise to the value of the loss function during training.
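The generic mechanism behind that summary, bounding a loss value's sensitivity by clipping and then adding calibrated Gaussian noise, can be sketched as follows. The function name and the clipping and noise constants are illustrative placeholders, not values or an API from the RDP-GAN paper.

```python
import numpy as np

def dp_perturbed_loss(loss_value, clip_bound=1.0, noise_scale=0.5, rng=None):
    """Clip the per-step loss to bound its sensitivity, then add
    Gaussian noise (illustrative DP-style loss perturbation; the
    constants are placeholders, not the paper's calibration)."""
    rng = np.random.default_rng() if rng is None else rng
    clipped = float(np.clip(loss_value, -clip_bound, clip_bound))
    return clipped + rng.normal(0.0, noise_scale)
```

Clipping first is what makes the added noise meaningful: without a bound on how much any one example can move the loss, no finite noise scale yields a differential-privacy guarantee.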
arXiv Detail & Related papers (2020-07-04T09:51:02Z)
- Generalized Gumbel-Softmax Gradient Estimator for Various Discrete Random Variables
Estimating the gradients of stochastic nodes is one of the crucial research questions in the deep generative modeling community.
This paper proposes a general version of the Gumbel-Softmax estimator with continuous relaxation.
arXiv Detail & Related papers (2020-03-04T01:13:15Z)
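The continuous relaxation that entry builds on, the standard Gumbel-Softmax, can be sketched in a few lines: add Gumbel(0,1) noise to the logits and apply a temperature-controlled softmax. This shows the base technique only, not the paper's generalized estimator, and the function name is illustrative.

```python
import numpy as np

def gumbel_softmax(logits, tau=1.0, rng=None):
    """Continuously relaxed sample from a categorical distribution:
    perturb logits with Gumbel(0,1) noise, then apply a temperature
    softmax (standard Gumbel-Softmax sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.uniform(1e-12, 1.0, size=np.shape(logits))
    g = -np.log(-np.log(u))              # Gumbel(0, 1) samples
    z = (np.asarray(logits, dtype=float) + g) / tau
    z = z - z.max()                      # numerical stability
    e = np.exp(z)
    return e / e.sum()
```

As the temperature `tau` approaches zero the output approaches a one-hot sample, while larger `tau` yields smoother, more uniform vectors; the relaxation is what lets gradients flow through the sampling step.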
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.