Learning Parametric Distributions from Samples and Preferences
- URL: http://arxiv.org/abs/2505.23557v1
- Date: Thu, 29 May 2025 15:33:43 GMT
- Title: Learning Parametric Distributions from Samples and Preferences
- Authors: Marc Jourdan, Gizem Yüce, Nicolas Flammarion
- Abstract summary: We show that preference-based M-estimators achieve a better asymptotic variance than sample-only M-estimators. We propose an estimator achieving an estimation error scaling of $\mathcal{O}(1/n)$ -- a significant improvement over the $\Theta(1/\sqrt{n})$ rate attainable with samples alone.
- Score: 19.879505582147807
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Recent advances in language modeling have underscored the role of preference feedback in enhancing model performance. This paper investigates the conditions under which preference feedback improves parameter estimation in classes of continuous parametric distributions. In our framework, the learner observes pairs of samples from an unknown distribution along with their relative preferences depending on the same unknown parameter. We show that preference-based M-estimators achieve a better asymptotic variance than sample-only M-estimators, further improved by deterministic preferences. Leveraging the hard constraints revealed by deterministic preferences, we propose an estimator achieving an estimation error scaling of $\mathcal{O}(1/n)$ -- a significant improvement over the $\Theta(1/\sqrt{n})$ rate attainable with samples alone. Next, we establish a lower bound that matches this accelerated rate, up to dimension and problem-dependent constants. While the assumptions underpinning our analysis are restrictive, they are satisfied by notable cases such as Gaussian or Laplace distributions for preferences based on the log-probability reward.
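The accelerated rate is easiest to see in the one-dimensional Gaussian case mentioned at the end of the abstract. Below is a minimal numpy sketch (illustrative only, not the paper's general estimator): for $\mathcal{N}(\theta, 1)$ with the log-probability reward, $x$ is deterministically preferred to $y$ exactly when $x$ is closer to $\theta$, so each pair's midpoint induces a hard half-line constraint on $\theta$; intersecting $n$ such constraints leaves a feasible interval of width $\mathcal{O}(1/n)$, and any point inside it beats the $\Theta(1/\sqrt{n})$ error of the sample mean.

```python
import numpy as np

rng = np.random.default_rng(0)
theta_star, n = 1.3, 10_000

# n i.i.d. pairs from N(theta*, 1); the preference is deterministic:
# x preferred to y  <=>  log p_theta*(x) > log p_theta*(y)  <=>  |x - theta*| < |y - theta*|
x = rng.normal(theta_star, 1.0, n)
y = rng.normal(theta_star, 1.0, n)
x_preferred = np.abs(x - theta_star) < np.abs(y - theta_star)

# Each preference is a hard half-line constraint on theta through the pair midpoint m:
# if the preferred sample is the larger of the two then theta > m, otherwise theta < m.
m = (x + y) / 2
winner_is_larger = (x > y) == x_preferred
lower = np.max(m[winner_is_larger], initial=-np.inf)   # tightest "theta > m" constraint
upper = np.min(m[~winner_is_larger], initial=np.inf)   # tightest "theta < m" constraint
theta_hat = (lower + upper) / 2                        # any point of the feasible interval works

print(f"feasible interval width : {upper - lower:.1e}  (shrinks like 1/n)")
print(f"constraint estimator err: {abs(theta_hat - theta_star):.1e}")
print(f"sample mean err         : {abs(np.concatenate([x, y]).mean() - theta_star):.1e}")
```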
Related papers
- Multivariate root-n-consistent smoothing parameter free matching estimators and estimators of inverse density weighted expectations [51.000851088730684]
We develop novel modifications of nearest-neighbor and matching estimators which converge at the parametric $\sqrt{n}$-rate.
We stress that our estimators do not involve nonparametric function estimators and in particular do not rely on sample-size-dependent smoothing parameters.
arXiv Detail & Related papers (2024-07-11T13:28:34Z)
- Rejection via Learning Density Ratios [50.91522897152437]
Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions.
We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance.
Our framework is tested empirically over clean and noisy datasets.
arXiv Detail & Related papers (2024-05-29T01:32:17Z)
- Nearest Neighbor Sampling for Covariate Shift Adaptation [7.940293148084844]
We propose a new covariate shift adaptation method which avoids estimating the density-ratio weights.
The basic idea is to directly work on unlabeled target data, labeled according to the $k$-nearest neighbors in the source dataset.
Our experiments show that it achieves a drastic reduction in running time with remarkable accuracy.
arXiv Detail & Related papers (2023-12-15T17:28:09Z)
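A rough illustration of the labeling idea in the entry above, assuming a plain majority vote over the $k$ nearest source points (the helper and data below are hypothetical, not the authors' code):

```python
import numpy as np

def knn_transfer_labels(X_src, y_src, X_tgt, k=5):
    """Label each target point by a majority vote of its k nearest source points."""
    d2 = ((X_tgt[:, None, :] - X_src[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    nn = np.argsort(d2, axis=1)[:, :k]                           # k nearest source indices
    return np.array([np.bincount(y_src[row]).argmax() for row in nn])

rng = np.random.default_rng(1)
X_src = rng.normal(0.0, 1.0, (200, 2))
y_src = (X_src.sum(axis=1) > 0).astype(int)
X_tgt = rng.normal(0.5, 1.0, (50, 2))        # covariate-shifted, unlabeled target
y_hat = knn_transfer_labels(X_src, y_src, X_tgt)
# downstream: fit any classifier on (X_tgt, y_hat) -- no density-ratio weights needed
```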
- The Choice of Noninformative Priors for Thompson Sampling in Multiparameter Bandit Models [56.31310344616837]
Thompson sampling (TS) is known for its outstanding empirical performance, supported by theoretical guarantees across various reward models.
This study explores the impact of selecting noninformative priors, offering insights into the performance of TS when dealing with new models that lack theoretical understanding.
arXiv Detail & Related papers (2023-02-28T08:42:42Z)
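For intuition on the role of the prior in Thompson sampling, here is a minimal Beta-Bernoulli sketch with a uniform Beta(1,1) prior, one common noninformative choice; the multiparameter models the entry concerns (e.g., Gaussian rewards with unknown mean and variance) need richer posteriors:

```python
import numpy as np

rng = np.random.default_rng(2)
true_means, T = np.array([0.3, 0.5, 0.7]), 5_000
a, b = np.ones(3), np.ones(3)          # Beta(1, 1): one choice of noninformative prior

for _ in range(T):
    arm = rng.beta(a, b).argmax()      # sample a mean per arm from the posterior, play the best
    reward = rng.random() < true_means[arm]
    a[arm] += reward                   # conjugate Beta-Bernoulli posterior update
    b[arm] += 1 - reward

print("posterior means:", a / (a + b))
```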
- Bayesian Hierarchical Models for Counterfactual Estimation [12.159830463756341]
We propose a probabilistic paradigm to estimate a diverse set of counterfactuals.
We treat the perturbations as random variables endowed with prior distribution functions.
A gradient-based sampler with superior convergence characteristics efficiently computes the posterior samples.
arXiv Detail & Related papers (2023-01-21T00:21:11Z)
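One instance of a "gradient-based sampler" is a Langevin-type scheme; below is a toy unadjusted Langevin sketch on a standard-normal log-posterior (an assumption for illustration; the paper's sampler may differ):

```python
import numpy as np

def grad_log_post(z):
    # toy standard-normal log-posterior: grad of log p(z) is -z
    return -z

rng = np.random.default_rng(3)
eps, z = 0.05, np.zeros(2)
samples = []
for _ in range(5_000):
    # unadjusted Langevin step: gradient drift plus Gaussian noise
    z = z + 0.5 * eps * grad_log_post(z) + np.sqrt(eps) * rng.normal(size=2)
    samples.append(z)
samples = np.array(samples)
print("sample mean ~ 0:", samples[1_000:].mean(axis=0))  # discard burn-in
```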
- Optimization of Annealed Importance Sampling Hyperparameters [77.34726150561087]
Annealed Importance Sampling (AIS) is a popular algorithm used to estimate the intractable marginal likelihood of deep generative models.
We present a parametric AIS process with flexible intermediary distributions and optimize the bridging distributions to use fewer steps for sampling.
We assess the performance of our optimized AIS for marginal likelihood estimation of deep generative models and compare it to other estimators.
arXiv Detail & Related papers (2022-09-27T07:58:25Z)
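A self-contained AIS sketch with the usual fixed geometric bridging path, estimating a toy normalizing constant; the paper above optimizes the bridging distributions rather than fixing them as here:

```python
import numpy as np

rng = np.random.default_rng(4)

def log_f0(x):           # initial distribution: standard normal (normalized)
    return -0.5 * x**2 - 0.5 * np.log(2 * np.pi)

def log_f1(x):           # unnormalized target: N(2, 0.5^2) without its constant
    return -0.5 * ((x - 2.0) / 0.5) ** 2

def log_fb(x, b):        # geometric bridging distributions
    return (1 - b) * log_f0(x) + b * log_f1(x)

n_chains, betas = 2_000, np.linspace(0.0, 1.0, 50)
x = rng.normal(size=n_chains)                      # exact samples from f0
log_w = np.zeros(n_chains)
for b_prev, b in zip(betas[:-1], betas[1:]):
    log_w += log_fb(x, b) - log_fb(x, b_prev)      # incremental AIS weight
    # one Metropolis step leaving f_b invariant
    prop = x + 0.5 * rng.normal(size=n_chains)
    accept = np.log(rng.random(n_chains)) < log_fb(prop, b) - log_fb(x, b)
    x = np.where(accept, prop, x)

Z_hat = np.exp(log_w).mean()                       # estimates the target's normalizing constant
print(Z_hat, "vs true", np.sqrt(2 * np.pi) * 0.5)
```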
- Variational Refinement for Importance Sampling Using the Forward Kullback-Leibler Divergence [77.06203118175335]
Variational Inference (VI) is a popular alternative to exact sampling in Bayesian inference.
Importance sampling (IS) is often used to fine-tune and de-bias the estimates of approximate Bayesian inference procedures.
We propose a novel combination of optimization and sampling techniques for approximate Bayesian inference.
arXiv Detail & Related papers (2021-06-30T11:00:24Z)
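The de-biasing step this entry refers to can be shown with self-normalized importance sampling over a crude stand-in for a fitted variational approximation (all densities below are toy assumptions; the paper's contribution is how the proposal is refined, not this correction itself):

```python
import numpy as np

rng = np.random.default_rng(5)

def log_p_tilde(x):      # unnormalized target: N(1, 1) without its constant
    return -0.5 * (x - 1.0) ** 2

# a deliberately crude "VI" approximation q = N(0, 1.5^2) standing in for a fitted proposal
mu_q, sig_q = 0.0, 1.5
x = rng.normal(mu_q, sig_q, size=20_000)
log_q = -0.5 * ((x - mu_q) / sig_q) ** 2 - np.log(sig_q)

log_w = log_p_tilde(x) - log_q
w = np.exp(log_w - log_w.max())          # numerically stabilized weights
w /= w.sum()                             # self-normalization
print("plain q mean :", mu_q)            # biased naive answer
print("IS-corrected :", (w * x).sum())   # de-biased estimate of E_p[x] = 1
```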
- Reliable Categorical Variational Inference with Mixture of Discrete Normalizing Flows [10.406659081400354]
Variational approximations are increasingly based on gradient-based optimization of expectations estimated by sampling.
Continuous relaxations, such as the Gumbel-Softmax for categorical distributions, enable gradient-based optimization, but do not define a valid probability mass for discrete observations.
In practice, selecting the amount of relaxation is difficult and one needs to optimize an objective that does not align with the desired one.
arXiv Detail & Related papers (2020-06-28T10:39:39Z)
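For context, a minimal sketch of the Gumbel-Softmax relaxation whose drawbacks motivate the entry above; the temperature `tau` controls how close samples are to one-hot, and choosing it is exactly the difficulty the entry describes (names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)

def gumbel_softmax(logits, tau):
    """Continuous relaxation of a categorical sample (Gumbel-Softmax)."""
    g = -np.log(-np.log(rng.random(logits.shape)))   # Gumbel(0, 1) noise
    z = (logits + g) / tau
    e = np.exp(z - z.max())                          # numerically stable softmax
    return e / e.sum()

logits = np.log(np.array([0.2, 0.3, 0.5]))
print(gumbel_softmax(logits, tau=1.0))   # soft sample, far from one-hot
print(gumbel_softmax(logits, tau=0.1))   # low temperature: near one-hot, noisier gradients
```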
- Nonparametric Score Estimators [49.42469547970041]
Estimating the score from a set of samples generated by an unknown distribution is a fundamental task in inference and learning of probabilistic models.
We provide a unifying view of existing score estimators under the framework of regularized nonparametric regression.
We propose score estimators based on iterative regularization that enjoy computational benefits from curl-free kernels and fast convergence.
arXiv Detail & Related papers (2020-05-20T15:01:03Z)
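A simple plug-in score estimator via a Gaussian KDE conveys the task; the entry's estimators instead arise from regularized kernel regression, so treat this as a baseline sketch under assumed bandwidth and data:

```python
import numpy as np

def kde_score(x_query, x_samples, h=0.3):
    """Plug-in estimate of d/dx log p_hat(x) for a Gaussian KDE p_hat."""
    diff = x_samples[None, :] - x_query[:, None]       # (n_query, n_samples)
    k = np.exp(-0.5 * (diff / h) ** 2)                 # unnormalized Gaussian kernel
    # grad p_hat / p_hat, with the shared normalizing constants cancelling
    return (k * diff / h**2).sum(axis=1) / k.sum(axis=1)

rng = np.random.default_rng(7)
xs = rng.normal(0.0, 1.0, size=5_000)                  # samples from N(0, 1)
q = np.array([-1.0, 0.0, 1.0])
print(kde_score(q, xs))                                # true score of N(0,1) is -x: ~[1, 0, -1]
```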
- SUMO: Unbiased Estimation of Log Marginal Probability for Latent Variable Models [80.22609163316459]
We introduce an unbiased estimator of the log marginal likelihood and its gradients for latent variable models based on randomized truncation of infinite series.
We show that models trained using our estimator give better test-set likelihoods than a standard importance-sampling based approach for the same average computational cost.
arXiv Detail & Related papers (2020-04-01T11:49:30Z)
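The randomized-truncation trick behind this entry, shown on a toy geometric series rather than a log marginal likelihood: draw a random truncation level and reweight each kept term by its survival probability, which keeps the estimate of the infinite sum unbiased (a minimal sketch; `roulette_estimate` is a hypothetical helper):

```python
import numpy as np

rng = np.random.default_rng(8)

def roulette_estimate(delta, q=0.4):
    """Unbiased 'Russian roulette' estimate of sum_{k>=1} delta(k):
    draw K ~ Geometric(q), reweight term k by 1 / P(K >= k)."""
    K = rng.geometric(q)                      # P(K = k) = q * (1 - q)**(k - 1)
    ks = np.arange(1, K + 1)
    surv = (1 - q) ** (ks - 1)                # survival probabilities P(K >= k)
    return (delta(ks) / surv).sum()

delta = lambda k: 0.5 ** k                    # geometric terms: sum_{k>=1} 0.5^k = 1
est = np.mean([roulette_estimate(delta) for _ in range(100_000)])
print(est, "vs exact 1.0")                    # unbiased, at the cost of extra variance
```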
- On Low-rank Trace Regression under General Sampling Distribution [9.699586426043885]
We show that cross-validated estimators satisfy near-optimal error bounds under general assumptions.
We also show that the cross-validated estimator outperforms the theory-inspired approach of selecting the regularization parameter.
arXiv Detail & Related papers (2019-04-18T02:56:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.