DecipherPref: Analyzing Influential Factors in Human Preference
Judgments via GPT-4
- URL: http://arxiv.org/abs/2305.14702v3
- Date: Sat, 28 Oct 2023 01:03:15 GMT
- Title: DecipherPref: Analyzing Influential Factors in Human Preference
Judgments via GPT-4
- Authors: Yebowen Hu, Kaiqiang Song, Sangwoo Cho, Xiaoyang Wang, Hassan Foroosh,
Fei Liu
- Abstract summary: We conduct an in-depth examination of a collection of pairwise human judgments released by OpenAI.
We find that the most favored factors vary across tasks and genres, whereas the least favored factors tend to be consistent.
Our findings have implications for the construction of balanced datasets in human preference evaluations.
- Score: 28.661237196238996
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human preference judgments are pivotal in guiding large language models
(LLMs) to produce outputs that align with human values. Human evaluations are
also used in summarization tasks to compare outputs from various systems,
complementing existing automatic metrics. Despite their significance, however,
there has been limited research probing these pairwise or $k$-wise comparisons.
The collective impact and relative importance of factors such as output length,
informativeness, fluency, and factual consistency are still not well
understood. It is also unclear if there are other hidden factors influencing
human judgments. In this paper, we conduct an in-depth examination of a
collection of pairwise human judgments released by OpenAI. Utilizing the
Bradley-Terry-Luce (BTL) model, we reveal the inherent preferences embedded in
these human judgments. We find that the most favored factors vary across tasks
and genres, whereas the least favored factors tend to be consistent, e.g.,
outputs are too brief, contain excessive off-focus content or hallucinated
facts. Our findings have implications for the construction of balanced datasets
in human preference evaluations, which is a crucial step in shaping the
behaviors of future LLMs.
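The abstract attributes the factor analysis to the Bradley-Terry-Luce (BTL) model but gives no implementation details. Under BTL, the probability that factor $i$ is preferred over factor $j$ is $\exp(\theta_i)/(\exp(\theta_i)+\exp(\theta_j))$, so larger fitted scores correspond to more favored factors. The sketch below only illustrates how such scores can be estimated from pairwise preference counts by maximum likelihood; the factor names, win counts, and scipy-based fitting routine are assumptions for the example, not the authors' setup.

```python
# A minimal sketch, not the authors' released code: fitting a
# Bradley-Terry-Luce (BTL) model to pairwise preference counts between
# annotation factors. Factor names and win counts are invented toy data.
import numpy as np
from scipy.optimize import minimize

factors = ["informative", "fluent", "concise", "factually_consistent"]

# wins[i, j] = how many times factor i was preferred over factor j (toy data)
wins = np.array([
    [0, 12,  9,  7],
    [5,  0,  8,  4],
    [6,  7,  0,  3],
    [9, 11, 10,  0],
], dtype=float)

def neg_log_likelihood(free_theta):
    # Anchor the first factor's score at 0: BTL scores are identifiable
    # only up to an additive constant.
    theta = np.concatenate(([0.0], free_theta))
    nll = 0.0
    for i in range(len(theta)):
        for j in range(len(theta)):
            if i != j and wins[i, j] > 0:
                # BTL: P(i preferred over j) = exp(theta_i) / (exp(theta_i) + exp(theta_j))
                p_ij = np.exp(theta[i]) / (np.exp(theta[i]) + np.exp(theta[j]))
                nll -= wins[i, j] * np.log(p_ij)
    return nll

res = minimize(neg_log_likelihood, x0=np.zeros(len(factors) - 1), method="BFGS")
theta = np.concatenate(([0.0], res.x))

# Rank factors from most to least favored under the fitted toy model.
for name, score in sorted(zip(factors, theta), key=lambda t: -t[1]):
    print(f"{name:>22s}  {score:+.3f}")
```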
Related papers
- How Aligned are Generative Models to Humans in High-Stakes Decision-Making? [10.225573060836478]
Large generative models (LMs) are increasingly being considered for high-stakes decision-making.
This work considers how such models compare to humans and predictive AI models on a specific case of recidivism prediction.
arXiv Detail & Related papers (2024-10-20T19:00:59Z)
- Uncovering Factor Level Preferences to Improve Human-Model Alignment [58.50191593880829]
We introduce PROFILE, a framework that uncovers and quantifies the influence of specific factors driving preferences.
PROFILE's factor level analysis explains the 'why' behind human-model alignment and misalignment.
We demonstrate how leveraging factor level insights, including addressing misaligned factors, can improve alignment with human preferences.
arXiv Detail & Related papers (2024-10-09T15:02:34Z)
- Beyond correlation: The impact of human uncertainty in measuring the effectiveness of automatic evaluation and LLM-as-a-judge [51.93909886542317]
We show how relying on a single aggregate correlation score can obscure fundamental differences between human behavior and automatic evaluation methods.
We propose stratifying results by human label uncertainty to provide a more robust analysis of automatic evaluation performance.
arXiv Detail & Related papers (2024-10-03T03:08:29Z)
- AI Can Be Cognitively Biased: An Exploratory Study on Threshold Priming in LLM-Based Batch Relevance Assessment [37.985947029716016]
Large language models (LLMs) have shown advanced understanding capabilities but may inherit human biases from their training data.
We investigated whether LLMs are influenced by the threshold priming effect in relevance judgments.
arXiv Detail & Related papers (2024-09-24T12:23:15Z)
- Investigating Context Effects in Similarity Judgements in Large Language Models [6.421776078858197]
Large Language Models (LLMs) have revolutionised the capability of AI models in comprehending and generating natural language text.
We report an ongoing investigation into the alignment of LLMs with human judgements affected by order bias.
arXiv Detail & Related papers (2024-08-20T10:26:02Z)
- Decoding Susceptibility: Modeling Misbelief to Misinformation Through a Computational Approach [61.04606493712002]
Susceptibility to misinformation describes the degree of belief in unverifiable claims, which is not directly observable.
Existing susceptibility studies heavily rely on self-reported beliefs.
We propose a computational approach to model users' latent susceptibility levels.
arXiv Detail & Related papers (2023-11-16T07:22:56Z)
- AlignDiff: Aligning Diverse Human Preferences via Behavior-Customisable Diffusion Model [69.12623428463573]
AlignDiff is a novel framework that quantifies human preferences, covering their abstractness, and uses them to guide diffusion planning.
It can accurately match user-customized behaviors and efficiently switch from one to another.
We demonstrate its superior performance on preference matching, switching, and covering compared to other baselines.
arXiv Detail & Related papers (2023-10-03T13:53:08Z)
- Human Feedback is not Gold Standard [28.63384327791185]
We critically analyse the use of human feedback for both training and evaluation.
We find that while preference scores have fairly good coverage, they under-represent important aspects like factuality.
arXiv Detail & Related papers (2023-09-28T11:18:20Z)
- Using Natural Language Explanations to Rescale Human Judgments [81.66697572357477]
We propose a method to rescale ordinal annotations and explanations using large language models (LLMs).
We feed annotators' Likert ratings and corresponding explanations into an LLM and prompt it to produce a numeric score anchored in a scoring rubric.
Our method rescales the raw judgments without impacting agreement and brings the scores closer to human judgments grounded in the same scoring rubric.
arXiv Detail & Related papers (2023-05-24T06:19:14Z)
- Perspectives on Large Language Models for Relevance Judgment [56.935731584323996]
Large language models (LLMs) claim that they can assist with relevance judgments.
It is not clear whether automated judgments can reliably be used in evaluations of retrieval systems.
arXiv Detail & Related papers (2023-04-13T13:08:38Z)