Uncertainty-aware Reward Model: Teaching Reward Models to Know What is Unknown
- URL: http://arxiv.org/abs/2410.00847v1
- Date: Tue, 1 Oct 2024 16:29:59 GMT
- Title: Uncertainty-aware Reward Model: Teaching Reward Models to Know What is Unknown
- Authors: Xingzhou Lou, Dong Yan, Wei Shen, Yuzi Yan, Jian Xie, Junge Zhang
- Abstract summary: We propose Uncertainty-aware RM (URM) and Uncertainty-aware RM Ensemble (URME) to incorporate and manage uncertainty in reward modeling.
URM can model the distribution of disentangled attributes within human preferences, while URME quantifies uncertainty through discrepancies in the ensemble.
Experimental results indicate that the proposed URM achieves state-of-the-art performance among models of the same size.
- Score: 20.753374166695494
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reward models (RMs) play a critical role in aligning the generations of large language models (LLMs) with human expectations. However, prevailing RMs fail to capture the stochasticity within human preferences and cannot effectively evaluate the reliability of reward predictions. To address these issues, we propose Uncertainty-aware RM (URM) and Uncertainty-aware RM Ensemble (URME) to incorporate and manage uncertainty in reward modeling. URM can model the distribution of disentangled attributes within human preferences, while URME quantifies uncertainty through discrepancies in the ensemble, thereby identifying a potential lack of knowledge during reward evaluation. Experimental results indicate that the proposed URM achieves state-of-the-art performance among models of the same size, demonstrating the effectiveness of modeling uncertainty within human preferences. Furthermore, empirical results show that, through uncertainty quantification, URM and URME can identify unreliable predictions to improve the quality of reward evaluations.
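The abstract says that URME quantifies uncertainty through discrepancies in the ensemble. The general idea of ensemble disagreement can be sketched as below. This is only an illustration under stated assumptions, not the authors' implementation: the function name `ensemble_reward`, the use of the population standard deviation as the disagreement measure, and the `max_std` threshold are all hypothetical choices made for this example.

```python
from statistics import mean, pstdev


def ensemble_reward(member_rewards, max_std=0.5):
    """Aggregate the rewards that ensemble members assign to one response.

    Hypothetical helper: returns (mean_reward, disagreement, is_reliable),
    where disagreement is the population standard deviation across members
    and is_reliable is True when it does not exceed max_std (an assumed,
    arbitrary threshold).
    """
    if not member_rewards:
        raise ValueError("need at least one ensemble member")
    avg = mean(member_rewards)
    spread = pstdev(member_rewards)
    return avg, spread, spread <= max_std


# Members that agree give zero spread, so the prediction is kept.
agree = ensemble_reward([1.0, 1.0, 1.0])
# Members that disagree give a large spread, so the prediction is flagged.
disagree = ensemble_reward([0.0, 2.0, 4.0])
```

In this sketch a flagged prediction would simply be treated as unreliable by the caller; how the paper itself acts on high-uncertainty rewards is not described in the abstract.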
Related papers
- Beyond RMSE and MAE: Introducing EAUC to unmask hidden bias and unfairness in dyadic regression models [5.336076422485076]
We show that non-uniformity in the observed value distributions of individual entities leads to severely biased predictions in state-of-the-art models.
We introduce Eccentricity-Area Under the Curve (EAUC) as a new metric that can quantify it in all studied models and datasets.
arXiv Detail & Related papers (2024-01-19T13:41:08Z)
- Measuring and Modeling Uncertainty Degree for Monocular Depth Estimation [50.920911532133154]
The intrinsic ill-posedness and ordinal-sensitive nature of monocular depth estimation (MDE) models pose major challenges to the estimation of uncertainty degree.
We propose to model the uncertainty of MDE models from the perspective of the inherent probability distributions.
By simply introducing additional training regularization terms, our model, with surprisingly simple formations and without requiring extra modules or multiple inferences, can provide uncertainty estimations with state-of-the-art reliability.
arXiv Detail & Related papers (2023-07-19T12:11:15Z)
- Training, Architecture, and Prior for Deterministic Uncertainty Methods [33.45069308137142]
This work investigates important design choices in Deterministic Uncertainty Methods (DUMs).
We show that training schemes decoupling the core architecture and the uncertainty head schemes can significantly improve uncertainty performances.
Contrary to other Bayesian models, we show that the prior defined by DUMs does not have a strong effect on the final performance.
arXiv Detail & Related papers (2023-03-10T09:00:52Z)
- Rethinking Missing Data: Aleatoric Uncertainty-Aware Recommendation [59.500347564280204]
We propose a new Aleatoric Uncertainty-aware Recommendation (AUR) framework.
AUR consists of a new uncertainty estimator along with a normal recommender model.
As the chance of mislabeling reflects the potential of a pair, AUR makes recommendations according to the uncertainty.
arXiv Detail & Related papers (2022-09-22T04:32:51Z)
- Uncertainty-Driven Action Quality Assessment [67.20617610820857]
We propose a novel probabilistic model, named Uncertainty-Driven AQA (UD-AQA), to capture the diversity among multiple judge scores.
We generate the estimation of uncertainty for each prediction, which is employed to re-weight AQA regression loss.
Our proposed method achieves competitive results on three benchmarks including the Olympic events MTL-AQA and FineDiving, and the surgical skill JIGSAWS datasets.
arXiv Detail & Related papers (2022-07-29T07:21:15Z)
- Approaching Neural Network Uncertainty Realism [53.308409014122816]
Quantifying or at least upper-bounding uncertainties is vital for safety-critical systems such as autonomous vehicles.
We evaluate uncertainty realism -- a strict quality criterion -- with a Mahalanobis distance-based statistical test.
We adapt it to the automotive domain and show that it significantly improves uncertainty realism compared to a plain encoder-decoder model.
arXiv Detail & Related papers (2021-01-08T11:56:12Z)
- On the model-based stochastic value gradient for continuous reinforcement learning [50.085645237597056]
We show that simple model-based agents can outperform state-of-the-art model-free agents in terms of both sample-efficiency and final reward.
Our findings suggest that model-based policy evaluation deserves closer attention.
arXiv Detail & Related papers (2020-08-28T17:58:29Z)
- Model Uncertainty Quantification for Reliable Deep Vision Structural Health Monitoring [2.5126058470073263]
This paper proposes Bayesian inference for deep vision structural health monitoring models.
Uncertainty can be quantified using Monte Carlo dropout sampling.
Three independent case studies for cracks, local damage identification, and bridge component detection are investigated.
arXiv Detail & Related papers (2020-04-10T17:54:10Z)
- Uncertainty-Gated Stochastic Sequential Model for EHR Mortality Prediction [6.170898159041278]
We present a novel variational recurrent network that estimates the distribution of missing variables, updates hidden states, and predicts the possibility of in-hospital mortality.
It is noteworthy that our model can conduct these procedures in a single stream and learn all network parameters jointly in an end-to-end manner.
arXiv Detail & Related papers (2020-03-02T04:41:28Z)
- Learning to Predict Error for MRI Reconstruction [67.76632988696943]
We demonstrate that predictive uncertainty estimated by current methods does not correlate strongly with prediction error.
We propose a novel method that estimates the target labels and magnitude of the prediction error in two steps.
arXiv Detail & Related papers (2020-02-13T15:55:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.