An Improved Algorithm for Learning Drifting Discrete Distributions
- URL: http://arxiv.org/abs/2403.05446v1
- Date: Fri, 8 Mar 2024 16:54:27 GMT
- Title: An Improved Algorithm for Learning Drifting Discrete Distributions
- Authors: Alessio Mazzetto
- Abstract summary: We present a new adaptive algorithm for learning discrete distributions under distribution drift.
We observe a sequence of independent samples from a discrete distribution that is changing over time, and the goal is to estimate the current distribution.
To use more samples, we must resort to samples further in the past, and we incur a drift error due to the bias introduced by the change in distribution.
We present a novel adaptive algorithm that can solve this trade-off without any prior knowledge of the drift.
- Score: 2.2191203337341525
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a new adaptive algorithm for learning discrete distributions under
distribution drift. In this setting, we observe a sequence of independent
samples from a discrete distribution that is changing over time, and the goal
is to estimate the current distribution. Since we have access to only a single
sample for each time step, a good estimation requires a careful choice of the
number of past samples to use. To use more samples, we must resort to samples
further in the past, and we incur a drift error due to the bias introduced by
the change in distribution. On the other hand, if we use a small number of past
samples, we incur a large statistical error as the estimation has a high
variance. We present a novel adaptive algorithm that can solve this trade-off
without any prior knowledge of the drift. Unlike previous adaptive results, our
algorithm characterizes the statistical error using data-dependent bounds. This
technicality enables us to overcome the limitations of previous work, which
requires a fixed finite support whose size is known in advance and cannot
change over time. Additionally, we can obtain tighter bounds depending on the
complexity of the drifting distribution, and also consider distributions with
infinite support.
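The trade-off described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's adaptive algorithm: it only evaluates fixed window sizes on a toy problem. The three-symbol distribution, the linear drift rate, and all function names (`drifting_distribution`, `window_estimate`, `drift_bias`, `simulate`) are assumptions made for illustration, and error is measured in total variation distance.

```python
import random


def drifting_distribution(t, drift_per_step=0.0002):
    """Toy distribution over 3 symbols at step t (hypothetical example).

    Probability mass moves linearly from symbol 0 to symbol 1 over time.
    Valid for t up to 0.6 / drift_per_step, after which mass would go negative.
    """
    shift = drift_per_step * t
    return [0.6 - shift, 0.2 + shift, 0.2]


def tv_distance(p, q):
    """Total variation distance between two distributions on the same support."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def window_estimate(samples, r, k=3):
    """Empirical frequencies of the r most recent samples (r >= 1)."""
    recent = samples[-r:]
    counts = [0] * k
    for x in recent:
        counts[x] += 1
    return [c / len(recent) for c in counts]


def drift_bias(T, r, drift_per_step=0.0002):
    """Drift error: TV distance between the average of the last r
    distributions and the current distribution at step T."""
    dists = [drifting_distribution(t, drift_per_step)
             for t in range(T - r + 1, T + 1)]
    mean = [sum(d[i] for d in dists) / r for i in range(3)]
    return tv_distance(mean, drifting_distribution(T, drift_per_step))


def simulate(T=2000, seed=0):
    """Draw one independent sample per time step from the drifting distribution."""
    rng = random.Random(seed)
    return [rng.choices(range(3), weights=drifting_distribution(t))[0]
            for t in range(1, T + 1)]


if __name__ == "__main__":
    T = 2000
    samples = simulate(T)
    current = drifting_distribution(T)
    for r in (10, 50, 200, 1000, 2000):
        err = tv_distance(window_estimate(samples, r), current)
        print(f"r={r:5d}  total error={err:.3f}  drift bias={drift_bias(T, r):.3f}")
```

In this toy setup, small windows tend to show a noisy total error with almost no drift bias, while the largest windows are dominated by the bias. An adaptive method has to pick the window without knowing the drift rate, which is the problem the paper addresses.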
Related papers
- Tackling the Problem of Distributional Shifts: Correcting Misspecified, High-Dimensional Data-Driven Priors for Inverse Problems [39.58317527488534]
Data-driven population-level distributions are emerging as an appealing alternative to simple parametric priors in inverse problems.
It is difficult to acquire independent and identically distributed samples from the underlying data-generating process of interest to train these models.
We show that starting from a misspecified prior distribution, the updated distribution becomes progressively closer to the underlying population-level distribution.
arXiv Detail & Related papers (2024-07-24T22:39:27Z)
- DistPred: A Distribution-Free Probabilistic Inference Method for Regression and Forecasting [14.390842560217743]
We propose a novel approach called DistPred for regression and forecasting tasks.
We transform proper scoring rules that measure the discrepancy between the predicted distribution and the target distribution into a differentiable discrete form.
This allows the model to sample numerous samples in a single forward pass to estimate the potential distribution of the response variable.
arXiv Detail & Related papers (2024-06-17T10:33:00Z)
- Probabilistic Contrastive Learning for Long-Tailed Visual Recognition [78.70453964041718]
Long-tailed distributions frequently emerge in real-world data, where a large number of minority categories contain a limited number of samples.
Recent investigations have revealed that supervised contrastive learning exhibits promising potential in alleviating the data imbalance.
We propose a novel probabilistic contrastive (ProCo) learning algorithm that estimates the data distribution of the samples from each class in the feature space.
arXiv Detail & Related papers (2024-03-11T13:44:49Z)
- Favour: FAst Variance Operator for Uncertainty Rating [0.034530027457862]
Bayesian Neural Networks (BNN) have emerged as a crucial approach for interpreting ML predictions.
By sampling from the posterior distribution, data scientists may estimate the uncertainty of an inference.
Previous work proposed propagating the first and second moments of the posterior directly through the network.
This method is even slower than sampling, so the propagated variance needs to be approximated.
Our contribution is a more principled variance propagation framework.
arXiv Detail & Related papers (2023-11-21T22:53:20Z)
- Adaptive Annealed Importance Sampling with Constant Rate Progress [68.8204255655161]
Annealed Importance Sampling (AIS) synthesizes weighted samples from an intractable distribution.
We propose the Constant Rate AIS algorithm and its efficient implementation for $\alpha$-divergences.
arXiv Detail & Related papers (2023-06-27T08:15:28Z)
- Sequential Predictive Two-Sample and Independence Testing [114.4130718687858]
We study the problems of sequential nonparametric two-sample and independence testing.
We build upon the principle of (nonparametric) testing by betting.
arXiv Detail & Related papers (2023-04-29T01:30:33Z)
- Flow Away your Differences: Conditional Normalizing Flows as an Improvement to Reweighting [0.0]
We present an alternative to reweighting techniques for modifying distributions to account for a desired change in an underlying conditional distribution.
We employ conditional normalizing flows to learn the full conditional probability distribution.
In our examples, this leads to a statistical precision up to three times greater than using reweighting techniques with identical sample sizes for the source and target distributions.
arXiv Detail & Related papers (2023-04-28T16:33:50Z)
- Unrolling Particles: Unsupervised Learning of Sampling Distributions [102.72972137287728]
Particle filtering is used to compute good nonlinear estimates of complex systems.
We show in simulations that the resulting particle filter yields good estimates in a wide range of scenarios.
arXiv Detail & Related papers (2021-10-06T16:58:34Z)
- Predicting with Confidence on Unseen Distributions [90.68414180153897]
We connect domain adaptation and predictive uncertainty literature to predict model accuracy on challenging unseen distributions.
We find that the difference of confidences (DoC) of a classifier's predictions successfully estimates the classifier's performance change over a variety of shifts.
We specifically investigate the distinction between synthetic and natural distribution shifts and observe that despite its simplicity DoC consistently outperforms other quantifications of distributional difference.
arXiv Detail & Related papers (2021-07-07T15:50:18Z)
- Distributional Reinforcement Learning via Moment Matching [54.16108052278444]
We formulate a method that learns a finite set of statistics from each return distribution via neural networks.
Our method can be interpreted as implicitly matching all orders of moments between a return distribution and its Bellman target.
Experiments on the suite of Atari games show that our method outperforms the standard distributional RL baselines.
arXiv Detail & Related papers (2020-07-24T05:18:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.