No Preference Left Behind: Group Distributional Preference Optimization
- URL: http://arxiv.org/abs/2412.20299v1
- Date: Sat, 28 Dec 2024 23:30:47 GMT
- Title: No Preference Left Behind: Group Distributional Preference Optimization
- Authors: Binwei Yao, Zefan Cai, Yun-Shiuan Chuang, Shanglin Yang, Ming Jiang, Diyi Yang, Junjie Hu
- Abstract summary: Group Distribution Preference Optimization (GDPO) is a novel framework that aligns language models with the distribution of preferences within a group.
GDPO calibrates a language model using statistical estimation of the group's belief distribution.
GDPO consistently reduces the gap to the targeted belief distribution during training.
- Score: 46.98320272443297
- Abstract: Preferences within a group of people are not uniform but follow a distribution. While existing alignment methods like Direct Preference Optimization (DPO) attempt to steer models to reflect human preferences, they struggle to capture the distributional pluralistic preferences within a group. These methods often skew toward dominant preferences, overlooking the diversity of opinions, especially when conflicting preferences arise. To address this issue, we propose Group Distribution Preference Optimization (GDPO), a novel framework that aligns language models with the distribution of preferences within a group by incorporating the concept of beliefs that shape individual preferences. GDPO calibrates a language model using statistical estimation of the group's belief distribution and aligns the model with belief-conditioned preferences, offering a more inclusive alignment framework than traditional methods. In experiments using both synthetic controllable opinion generation and real-world movie review datasets, we show that DPO fails to align with the targeted belief distributions, while GDPO consistently reduces this alignment gap during training. Moreover, our evaluation metrics demonstrate that GDPO outperforms existing approaches in aligning with group distributional preferences, marking a significant advance in pluralistic alignment.
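For intuition, here is a minimal PyTorch sketch of what a GDPO-style objective could look like, based only on the abstract above: a calibration term that matches the model's predicted belief distribution to the statistically estimated group belief distribution, plus a DPO-style term on belief-conditioned preference pairs. The exact loss form, the weighting term lambda_cal, and all function names are illustrative assumptions, not the paper's implementation.

```python
# Sketch of a GDPO-style objective: belief calibration + belief-conditioned DPO.
# All names (belief_calibration_loss, lambda_cal, ...) and the exact combination
# are illustrative assumptions, not the authors' implementation.
import torch
import torch.nn.functional as F

def belief_calibration_loss(belief_logits, group_belief_dist):
    """Cross-entropy between the model's predicted belief distribution and the
    statistically estimated group belief distribution."""
    log_probs = F.log_softmax(belief_logits, dim=-1)
    return -(group_belief_dist * log_probs).sum(dim=-1).mean()

def belief_conditioned_dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Standard DPO loss on pairs whose prompts are conditioned on a sampled belief."""
    margin = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    return -F.logsigmoid(margin).mean()

def gdpo_style_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected,
                    belief_logits, group_belief_dist, lambda_cal=1.0):
    return (belief_conditioned_dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected)
            + lambda_cal * belief_calibration_loss(belief_logits, group_belief_dist))

# Toy usage: 4 preference pairs with precomputed sequence log-probabilities and a
# 3-way group belief distribution estimated from annotator data.
pi_c, pi_r, ref_c, ref_r = (torch.randn(4) for _ in range(4))
belief_logits = torch.randn(4, 3)
group_beliefs = torch.tensor([0.5, 0.3, 0.2]).expand(4, 3)
print(gdpo_style_loss(pi_c, pi_r, ref_c, ref_r, belief_logits, group_beliefs))
```

Under this reading, the calibration term keeps minority beliefs represented in proportion to their estimated prevalence rather than letting the dominant preference swamp the pairwise objective.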
Related papers
- Calibrated Multi-Preference Optimization for Aligning Diffusion Models [92.90660301195396]
Calibrated Preference Optimization (CaPO) is a novel method to align text-to-image (T2I) diffusion models.
CaPO incorporates the general preference from multiple reward models without human annotated data.
Experimental results show that CaPO consistently outperforms prior methods.
arXiv Detail & Related papers (2025-02-04T18:59:23Z)
- VPO: Leveraging the Number of Votes in Preference Optimization [5.200545764106177]
We introduce a technique that leverages user voting data to better align with diverse subjective preferences.
We develop the Vote-based Preference Optimization framework, which incorporates the number of votes on both sides to distinguish between controversial and obvious generation pairs.
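As a rough illustration of how vote counts could modulate preference training (one plausible reading of this summary, not necessarily VPO's actual objective), a soft target derived from the vote split makes near-50/50 (controversial) pairs exert less pressure than lopsided (obvious) ones:

```python
# Illustrative sketch: soften the preference target with the empirical vote split.
# This is one plausible way to use votes, not necessarily VPO's published objective.
import torch
import torch.nn.functional as F

def vote_soft_preference_loss(margin, votes_chosen, votes_rejected):
    """`margin` is a DPO-style implicit reward margin per pair; the target is the
    empirical probability that the chosen response wins the vote."""
    p_win = votes_chosen / (votes_chosen + votes_rejected)
    p_model = torch.sigmoid(margin)
    return F.binary_cross_entropy(p_model, p_win)

# Toy usage: first pair is obvious (90 vs 10 votes), second is controversial (51 vs 49).
margin = torch.tensor([1.5, 1.5])
loss = vote_soft_preference_loss(margin, torch.tensor([90., 51.]), torch.tensor([10., 49.]))
print(loss)
```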
arXiv Detail & Related papers (2024-10-30T10:39:34Z)
- ComPO: Community Preferences for Language Model Personalization [122.54846260663922]
ComPO is a method to personalize preference optimization in language models.
We collect and release ComPRed, a question answering dataset with community-level preferences from Reddit.
arXiv Detail & Related papers (2024-10-21T14:02:40Z)
- Preference Alignment Improves Language Model-Based TTS [76.70693823683091]
Preference alignment algorithms adjust LMs to align with the preferences of reward models, enhancing the desirability of the generated content.
With a 1.15B parameter LM-based TTS model, we demonstrate that preference alignment consistently improves intelligibility, speaker similarity, and proxy subjective evaluation scores.
arXiv Detail & Related papers (2024-09-19T01:58:19Z)
- Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization [75.1240295759264]
We propose an effective framework for Bridging and Modeling Correlations in pairwise data, named BMC.
We increase the consistency and informativeness of the pairwise preference signals through targeted modifications.
We identify that DPO alone is insufficient to model these correlations and capture nuanced variations.
arXiv Detail & Related papers (2024-08-14T11:29:47Z)
- Pareto-Optimal Learning from Preferences with Hidden Context [17.590330740964266]
We propose POPL, which enables pluralistic alignment by framing discrepant group preferences as objectives with potential trade-offs.
Our theoretical and empirical evaluations demonstrate that POPL surpasses baseline methods in learning sets of reward functions and policies.
We illustrate that POPL can also serve as a foundation for techniques optimizing specific notions of group fairness.
arXiv Detail & Related papers (2024-06-21T18:57:38Z)
- Group Robust Preference Optimization in Reward-free RLHF [23.622835830345725]
We propose a novel Group Robust Preference Optimization (GRPO) method to align large language models to individual groups' preferences robustly.
To achieve this, GRPO adaptively and sequentially weights the importance of different groups, prioritizing groups with worse cumulative loss.
GRPO significantly improves performance for the worst-performing groups, reduces loss imbalances across groups, and improves probability accuracies.
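The "prioritize groups with worse cumulative loss" idea can be illustrated with a simple exponential-weights update in the style of group-DRO; the step size and exact update rule here are assumptions, not GRPO's published algorithm:

```python
# Illustrative sketch of adaptive group weighting that upweights groups with higher
# loss. Applied sequentially, weights become proportional to exp(eta * cumulative loss),
# so persistently worse-off groups dominate the objective.
import torch

def update_group_weights(weights, group_losses, eta=0.1):
    """Upweight groups with higher current loss, then renormalize to a distribution."""
    weights = weights * torch.exp(eta * group_losses)
    return weights / weights.sum()

def weighted_objective(group_losses, weights):
    """Overall loss is the weight-averaged per-group preference loss."""
    return (weights * group_losses).sum()

# Toy usage: three groups; group 2 currently has the worst loss and gains weight.
w = torch.full((3,), 1 / 3)
losses = torch.tensor([0.4, 0.5, 1.2])
w = update_group_weights(w, losses)
print(w, weighted_objective(losses, w))
```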
arXiv Detail & Related papers (2024-05-30T17:50:04Z)
- Comparing Bad Apples to Good Oranges: Aligning Large Language Models via Joint Preference Optimization [105.3612692153615]
We propose a new axis based on eliciting preferences jointly over instruction-response pairs.
Joint preferences over instruction and response pairs can significantly enhance the alignment of large language models.
arXiv Detail & Related papers (2024-03-31T02:05:40Z)
- Aligning Crowd Feedback via Distributional Preference Reward Modeling [28.754532173765686]
We propose the Distributional Preference Reward Model (DPRM) to align large language models with diverse human preferences.
Our experiments show that DPRM significantly enhances the alignment of LLMs with population preference, yielding more accurate, unbiased, and contextually appropriate responses.
arXiv Detail & Related papers (2024-02-15T07:29:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.