Closed-Form Beta Distribution Estimation from Sparse Statistics with Random Forest Implicit Regularization
- URL: http://arxiv.org/abs/2507.23767v2
- Date: Fri, 07 Nov 2025 03:06:32 GMT
- Title: Closed-Form Beta Distribution Estimation from Sparse Statistics with Random Forest Implicit Regularization
- Authors: Jonathan R. Landers
- Abstract summary: This work advances distribution recovery from sparse data and ensemble classification through three main contributions. First, we introduce a closed-form estimator that reconstructs scaled beta distributions from limited statistics. Second, we establish a link between classification accuracy and distributional closeness by deriving error bounds. Third, we show that zero-variance features act as an implicit regularizer, increasing selection probability for mid-ranked predictors.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work advances distribution recovery from sparse data and ensemble classification through three main contributions. First, we introduce a closed-form estimator that reconstructs scaled beta distributions from limited statistics (minimum, maximum, mean, and median) via composite quantile and moment matching. The recovered parameters $(\alpha,\beta)$, when used as features in Random Forest classifiers, improve pairwise classification on time-series snapshots, validating the fidelity of the recovered distributions. Second, we establish a link between classification accuracy and distributional closeness by deriving error bounds that constrain total variation distance and Jensen-Shannon divergence, the latter exhibiting quadratic convergence. Third, we show that zero-variance features act as an implicit regularizer, increasing selection probability for mid-ranked predictors and producing deeper, more varied trees. A SeatGeek pricing dataset serves as the primary application, illustrating distributional recovery and event-level classification while situating these methods within the structure and dynamics of the secondary ticket marketplace. The UCI handwritten digits dataset confirms the broader regularization effect. Overall, the study outlines a practical route from sparse distributional snapshots to closed-form estimation and improved ensemble accuracy, with reliability enhanced through implicit regularization.
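The recovery step lends itself to a compact worked sketch. One classical route to a closed form (offered here as an illustration, not as the paper's exact composite quantile and moment matching) rescales the statistics to $[0,1]$ using the observed minimum and maximum, then combines the exact mean identity $\mu = \alpha/(\alpha+\beta)$ with the standard median approximation $d \approx (\alpha - 1/3)/(\alpha + \beta - 2/3)$ for $\alpha, \beta > 1$. Solving the pair gives $\alpha = \mu(2d - 1)/(3(d - \mu))$ and then $\beta = \alpha(1-\mu)/\mu$. A minimal Python sketch with an illustrative function name:

```python
def fit_scaled_beta(lo, hi, mean, median):
    """Recover (alpha, beta) of a beta distribution scaled to [lo, hi]
    from its mean and median.

    Combines the exact identity  mu = alpha / (alpha + beta)  with the
    median approximation  d ~= (alpha - 1/3) / (alpha + beta - 2/3),
    which holds well for alpha, beta > 1.
    """
    # Rescale the two location statistics to the unit interval.
    mu = (mean - lo) / (hi - lo)
    d = (median - lo) / (hi - lo)
    if abs(d - mu) < 1e-9:
        # Mean ~= median: the shape is symmetric and these two statistics
        # cannot pin down the concentration; return a symmetric default.
        return 2.0, 2.0
    alpha = mu * (2.0 * d - 1.0) / (3.0 * (d - mu))
    beta = alpha * (1.0 - mu) / mu
    return alpha, beta

# Beta(1.6, 2.4) scaled to [0, 100] has mean 40 and median near 38.
print(fit_scaled_beta(0.0, 100.0, 40.0, 38.0))  # ~ (1.6, 2.4)
```

The implicit-regularization claim is equally easy to probe. Each split in a random forest inspects only a random subset of the features, so padding the design matrix with zero-variance columns dilutes the candidate pool and lets mid-ranked predictors into splits more often. A hedged scikit-learn sketch on the UCI digits data the paper also uses (the padding width and forest settings are arbitrary choices, and the strength of the depth increase depends on how the implementation handles constant features):

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier

def mean_tree_depth(X, y):
    # Default max_features='sqrt': each split samples a feature subset,
    # so constant columns can crowd out the strongest predictors.
    forest = RandomForestClassifier(n_estimators=200, random_state=0)
    forest.fit(X, y)
    return float(np.mean([tree.get_depth() for tree in forest.estimators_]))

X, y = load_digits(return_X_y=True)
X_padded = np.hstack([X, np.zeros((X.shape[0], 64))])  # 64 zero-variance columns

print("baseline mean depth:", mean_tree_depth(X, y))
print("padded mean depth:  ", mean_tree_depth(X_padded, y))
```

Deeper and more varied trees on the padded design would be consistent with the abstract's account of zero-variance features acting as an implicit regularizer.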
Related papers
- Nearly Optimal Bayesian Inference for Structural Missingness
In the Bayesian view, prediction via the posterior predictive distribution integrates over the full model posterior uncertainty.
This framework decouples (i) learning an in-model missing-value posterior from (ii) label prediction by optimizing the predictive posterior distribution.
It achieves state-of-the-art results on 43 classification and 15 imputation benchmarks, with finite-sample near Bayes-optimality guarantees.
arXiv Detail & Related papers (2026-01-26T14:03:11Z)
- Localized Uncertainty Quantification in Random Forests via Proximities
In machine learning, uncertainty quantification helps assess the reliability of model predictions.
Traditional approaches often emphasize predictive accuracy, but there is a growing focus on incorporating uncertainty measures.
We propose a new approach using naturally occurring test sets and similarity measures (proximities) typically viewed as byproducts of random forests.
arXiv Detail & Related papers (2025-09-26T20:53:28Z)
- Measuring training variability from stochastic optimization using robust nonparametric testing
We propose a robust hypothesis testing framework and a novel summary statistic, the $\alpha$-trimming level, to measure model similarity.
Applying hypothesis testing directly with the $\alpha$-trimming level is challenging because we cannot accurately describe the distribution under the null hypothesis.
We show how to use the $\alpha$-trimming level to measure model variability and demonstrate experimentally that it is more expressive than performance metrics.
arXiv Detail & Related papers (2024-06-12T15:08:15Z)
- Rejection via Learning Density Ratios
Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions.
We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance.
Our framework is tested empirically over clean and noisy datasets.
arXiv Detail & Related papers (2024-05-29T01:32:17Z)
- The Lipschitz-Variance-Margin Tradeoff for Enhanced Randomized Smoothing
Real-life applications of deep neural networks are hindered by their unsteady predictions when faced with noisy inputs and adversarial attacks.
We show how to design an efficient classifier with a certified radius by relying on noise injection into the inputs.
Our novel certification procedure allows us to use pre-trained models with randomized smoothing, effectively improving the current certification radius in a zero-shot manner.
arXiv Detail & Related papers (2023-09-28T22:41:47Z)
- Robust Counterfactual Explanations for Neural Networks With Probabilistic Guarantees
We propose a measure, which we call $\textit{Stability}$, to quantify the robustness of counterfactuals to potential model changes for differentiable models.
Our main contribution is to show that counterfactuals with a sufficiently high value of $\textit{Stability}$ will remain valid after potential model changes with high probability.
arXiv Detail & Related papers (2023-05-19T20:48:05Z)
- Variational Classification
We derive a variational objective to train the model, analogous to the evidence lower bound (ELBO) used to train variational auto-encoders.
Treating inputs to the softmax layer as samples of a latent variable, our abstracted perspective reveals a potential inconsistency.
We induce a chosen latent distribution, instead of the implicit assumption found in a standard softmax layer.
arXiv Detail & Related papers (2023-05-17T17:47:19Z)
- Beyond Invariance: Test-Time Label-Shift Adaptation for Distributions with "Spurious" Correlations
Changes in the data distribution at test time can have deleterious effects on the performance of predictive models.
We propose a test-time label shift correction that adapts to changes in the joint distribution $p(y, z)$ using EM applied to unlabeled samples.
arXiv Detail & Related papers (2022-11-28T18:52:33Z)
- Leveraging Instance Features for Label Aggregation in Programmatic Weak Supervision
Programmatic Weak Supervision (PWS) has emerged as a widespread paradigm to synthesize training labels efficiently.
The core component of PWS is the label model, which infers true labels by aggregating the outputs of multiple noisy supervision sources known as labeling functions (LFs).
Existing statistical label models typically rely only on the outputs of the LFs, ignoring instance features when modeling the underlying generative process.
arXiv Detail & Related papers (2022-10-06T07:28:53Z)
- Adaptive Dimension Reduction and Variational Inference for Transductive Few-Shot Classification
We propose a new clustering method based on Variational Bayesian inference, further improved by Adaptive Dimension Reduction.
Our proposed method significantly improves accuracy in the realistic unbalanced transductive setting on various Few-Shot benchmarks.
arXiv Detail & Related papers (2022-09-18T10:29:02Z)
- Robust Calibration with Multi-domain Temperature Scaling
We develop a systematic calibration model to handle distribution shifts by leveraging data from multiple domains.
Our proposed method, multi-domain temperature scaling, uses the heterogeneity across domains to improve calibration robustness under distribution shift.
arXiv Detail & Related papers (2022-06-06T17:32:12Z)
- Active Learning by Feature Mixing
We propose a novel method for batch active learning called ALFA-Mix.
We identify unlabelled instances with sufficiently-distinct features by seeking inconsistencies in predictions.
We show that inconsistencies in these predictions help discover features that the model is unable to recognise in the unlabelled instances.
arXiv Detail & Related papers (2022-03-14T12:20:54Z)
- Cross-Model Pseudo-Labeling for Semi-Supervised Action Recognition
We propose a more effective pseudo-labeling scheme, called Cross-Model Pseudo-Labeling (CMPL).
CMPL achieves 17.6% and 25.1% Top-1 accuracy on Kinetics-400 and UCF-101 using only the RGB modality and 1% labeled data, respectively.
arXiv Detail & Related papers (2021-12-17T18:59:41Z)
- Training on Test Data with Bayesian Adaptation for Covariate Shift
Deep neural networks often make inaccurate predictions with unreliable uncertainty estimates.
We derive a Bayesian model that provides for a well-defined relationship between unlabeled inputs under distributional shift and model parameters.
We show that our method improves both accuracy and uncertainty estimation.
arXiv Detail & Related papers (2021-09-27T01:09:08Z)
- A New Robust Multivariate Mode Estimator for Eye-tracking Calibration
We propose a new method for estimating the main mode of multivariate distributions, with application to eye-tracking calibrations.
In such multimodal distributions, most central tendency measures fail to estimate the principal fixation coordinates.
Here, we developed a new algorithm, named BRIL, to identify the first mode of multivariate distributions.
We obtained outstanding performance, even for distributions containing very high proportions of outliers, both grouped in clusters and randomly distributed.
arXiv Detail & Related papers (2021-07-16T17:45:19Z)
- Predicting with Confidence on Unseen Distributions
We connect domain adaptation and predictive uncertainty literature to predict model accuracy on challenging unseen distributions.
We find that the difference of confidences (DoC) of a classifier's predictions successfully estimates the classifier's performance change over a variety of shifts.
We specifically investigate the distinction between synthetic and natural distribution shifts and observe that, despite its simplicity, DoC consistently outperforms other quantifications of distributional difference.
arXiv Detail & Related papers (2021-07-07T15:50:18Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
- Distribution-free binary classification: prediction sets, confidence intervals and calibration
We study three notions of uncertainty quantification -- calibration, confidence intervals and prediction sets -- for binary classification in the distribution-free setting.
We derive confidence intervals for binned probabilities for both fixed-width and uniform-mass binning.
As a consequence of our 'tripod' theorems, these confidence intervals for binned probabilities lead to distribution-free calibration.
arXiv Detail & Related papers (2020-06-18T14:17:29Z)
- Stochastic Optimization for Performative Prediction
We study the difference between merely updating model parameters and deploying the new model.
We prove rates of convergence for both greedily deploying models after each update and for taking several updates before redeploying.
These results illustrate how, depending on the strength of performative effects, there exists a regime in which either approach outperforms the other.
arXiv Detail & Related papers (2020-06-12T00:31:16Z)