Always Tell Me The Odds: Fine-grained Conditional Probability Estimation
- URL: http://arxiv.org/abs/2505.01595v1
- Date: Fri, 02 May 2025 21:33:18 GMT
- Title: Always Tell Me The Odds: Fine-grained Conditional Probability Estimation
- Authors: Liaoyaqi Wang, Zhengping Jiang, Anqi Liu, Benjamin Van Durme
- Abstract summary: We present a state-of-the-art model for fine-grained probability estimation of propositions conditioned on context. We show that our approach consistently outperforms existing fine-tuned and prompting-based methods by a large margin.
- Score: 37.950889606305836
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a state-of-the-art model for fine-grained probability estimation of propositions conditioned on context. Recent advances in large language models (LLMs) have significantly enhanced their reasoning capabilities, particularly on well-defined tasks with complete information. However, LLMs continue to struggle with making accurate and well-calibrated probabilistic predictions under uncertainty or partial information. While incorporating uncertainty into model predictions often boosts performance, obtaining reliable estimates of that uncertainty remains understudied. In particular, LLM probability estimates tend to be coarse and biased towards more frequent numbers. Through a combination of human and synthetic data creation and assessment, scaling to larger models, and better supervision, we propose a set of strong and precise probability estimation models. We conduct systematic evaluations across tasks that rely on conditional probability estimation and show that our approach consistently outperforms existing fine-tuned and prompting-based methods by a large margin.
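The abstract's distinction between well-calibrated and fine-grained estimates can be illustrated with two standard metrics. The following is a minimal sketch, not the authors' evaluation code: the function names, the equal-width binning, and the toy data are all assumptions made for illustration, and the abstract does not state which metrics the paper reports.

```python
from typing import Sequence


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def expected_calibration_error(
    probs: Sequence[float], outcomes: Sequence[int], n_bins: int = 10
) -> float:
    """Weighted mean gap between average prediction and observed frequency,
    computed over equal-width probability bins (an illustrative choice)."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and equal length")
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        # Clamp so that p == 1.0 falls into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))
    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_p = sum(p for p, _ in members) / len(members)
        freq = sum(y for _, y in members) / len(members)
        ece += len(members) / total * abs(mean_p - freq)
    return ece


# Hypothetical toy data: four propositions, two true and two false.
outcomes = [1, 0, 1, 0]
coarse = [0.5, 0.5, 0.5, 0.5]  # always answers with a "frequent" number
fine = [0.9, 0.1, 0.8, 0.2]    # distinguishes between the propositions

for name, probs in (("coarse", coarse), ("fine", fine)):
    print(name, brier_score(probs, outcomes),
          expected_calibration_error(probs, outcomes))
```

On this toy data the constant 0.5 predictor has zero calibration error yet a Brier score of 0.25, while the fine-grained predictor has a much lower Brier score (about 0.025). Calibration alone therefore does not reward precise estimates, which is one reason to look at both kinds of metric.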
Related papers
- Generalised Probabilistic Modelling and Improved Uncertainty Estimation in Comparative LLM-as-a-judge [37.84914870036184]
We show that existing Product-of-Experts methods are specific cases of a broader framework, enabling diverse modelling options. We propose improved uncertainty estimates for individual comparisons, enabling more efficient selection and achieving strong performance with fewer evaluations.
arXiv Detail & Related papers (2025-05-21T08:16:18Z)
- Exploring the Potential for Large Language Models to Demonstrate Rational Probabilistic Beliefs [12.489784979345654]
We show that current versions of large language models (LLMs) lack the ability to provide rational and coherent representations of probabilistic beliefs. We apply well-established techniques for uncertainty quantification to measure the ability of LLMs to adhere to fundamental properties of probabilistic reasoning.
arXiv Detail & Related papers (2025-04-18T11:50:30Z)
- A Probabilistic Perspective on Unlearning and Alignment for Large Language Models [48.96686419141881]
We introduce the first formal probabilistic evaluation framework for Large Language Models (LLMs). Namely, we propose novel metrics with high probability guarantees concerning the output distribution of a model. Our metrics are application-independent and allow practitioners to make more reliable estimates about model capabilities before deployment.
arXiv Detail & Related papers (2024-10-04T15:44:23Z)
- BIRD: A Trustworthy Bayesian Inference Framework for Large Language Models [52.46248487458641]
Predictive models often need to work with incomplete information in real-world tasks. Current large language models (LLMs) are insufficient for accurate estimations. We propose BIRD, a novel probabilistic inference framework.
arXiv Detail & Related papers (2024-04-18T20:17:23Z)
- When Rigidity Hurts: Soft Consistency Regularization for Probabilistic Hierarchical Time Series Forecasting [69.30930115236228]
Probabilistic hierarchical time-series forecasting is an important variant of time-series forecasting.
Most methods focus on point predictions and do not provide well-calibrated probabilistic forecast distributions.
We propose PROFHiT, a fully probabilistic hierarchical forecasting model that jointly models the forecast distribution of the entire hierarchy.
arXiv Detail & Related papers (2023-10-17T20:30:16Z)
- Measuring and Modeling Uncertainty Degree for Monocular Depth Estimation [50.920911532133154]
The intrinsic ill-posedness and ordinal-sensitive nature of monocular depth estimation (MDE) models pose major challenges to the estimation of uncertainty degree.
We propose to model the uncertainty of MDE models from the perspective of the inherent probability distributions.
By simply introducing additional training regularization terms, our model, with surprisingly simple formations and without requiring extra modules or multiple inferences, can provide uncertainty estimations with state-of-the-art reliability.
arXiv Detail & Related papers (2023-07-19T12:11:15Z)
- Creating Probabilistic Forecasts from Arbitrary Deterministic Forecasts using Conditional Invertible Neural Networks [0.19573380763700712]
We use a conditional Invertible Neural Network (cINN) to learn the underlying distribution of the data and then combine the uncertainty from this distribution with an arbitrary deterministic forecast.
Our approach enables the simple creation of probabilistic forecasts without complicated statistical loss functions or further assumptions.
arXiv Detail & Related papers (2023-02-03T15:11:39Z)
- When Rigidity Hurts: Soft Consistency Regularization for Probabilistic Hierarchical Time Series Forecasting [69.30930115236228]
Probabilistic hierarchical time-series forecasting is an important variant of time-series forecasting.
Most methods focus on point predictions and do not provide well-calibrated probabilistic forecast distributions.
We propose PROFHiT, a fully probabilistic hierarchical forecasting model that jointly models the forecast distribution of the entire hierarchy.
arXiv Detail & Related papers (2022-06-16T06:13:53Z)
- Uncertainty estimation of pedestrian future trajectory using Bayesian approximation [137.00426219455116]
Under dynamic traffic scenarios, planning based on deterministic predictions is not trustworthy.
The authors propose to use Bayesian approximation to quantify the uncertainty during forecasting, which deterministic approaches fail to capture.
The effect of dropout weights and long-term prediction on future state uncertainty has been studied.
arXiv Detail & Related papers (2022-05-04T04:23:38Z)
- Probabilistic Deep Learning to Quantify Uncertainty in Air Quality Forecasting [5.007231239800297]
This work applies state-of-the-art techniques of uncertainty quantification in a real-world setting of air quality forecasts.
We describe training probabilistic models and evaluate their predictive uncertainties based on empirical performance, reliability of confidence estimates, and practical applicability.
Our experiments demonstrate that the proposed models perform better than previous works in quantifying uncertainty in data-driven air quality forecasts.
arXiv Detail & Related papers (2021-12-05T17:01:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.