Multi-objective Large Language Model Alignment with Hierarchical Experts
- URL: http://arxiv.org/abs/2505.20925v1
- Date: Tue, 27 May 2025 09:15:03 GMT
- Title: Multi-objective Large Language Model Alignment with Hierarchical Experts
- Authors: Zhuo Li, Guodong Du, Weiyang Guo, Yigeng Zhou, Xiucheng Li, Wenya Wang, Fangming Liu, Yequan Wang, Deheng Ye, Min Zhang, Jing Li
- Abstract summary: HoE consists of three hierarchical components: LoRA Experts, Router Experts, and Preference Routing. We evaluate HoE across various tasks on 14 objectives and 200 different preferences among 6 benchmarks, demonstrating superior performance over 15 recent baselines.
- Score: 39.14442626829845
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Aligning large language models (LLMs) to simultaneously satisfy multiple objectives remains a significant challenge, especially given the diverse and often conflicting nature of human preferences. Existing alignment methods struggle to balance trade-offs effectively, often requiring costly retraining or yielding suboptimal results across the Pareto frontier of preferences. In this paper, we introduce \textit{HoE} (Hierarchical Mixture-of-Experts), a \textit{lightweight}, \textit{parameter-efficient}, and \textit{plug-and-play} approach that eliminates the need for model training, while enabling LLMs to adapt across the entire Pareto frontier and accommodate diverse user preferences. In particular, \textit{HoE} consists of three hierarchical components: LoRA Experts, Router Experts, and Preference Routing, reaching optimal Pareto frontiers and achieving a trade-off between parameter size, training cost, and performance. We evaluate \textit{HoE} across various tasks on 14 objectives and 200 different preferences among 6 benchmarks, demonstrating superior performance over 15 recent baselines. Code is available in the supplementary materials.
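The abstract names the three components but not how they fit together, so a small, hypothetical sketch may help: a preference-routing stage maps a user preference vector to mixture weights over per-objective LoRA experts, whose low-rank updates are blended into a frozen base layer. Everything below (class names, the router design, the softmax gating) is an assumption for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn


class LoRAExpert(nn.Module):
    """Low-rank adapter specialized for a single alignment objective (e.g. helpfulness)."""

    def __init__(self, d_in: int, d_out: int, rank: int = 8):
        super().__init__()
        self.A = nn.Parameter(torch.randn(d_in, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(rank, d_out))

    def delta(self) -> torch.Tensor:
        # Low-rank weight update added on top of the frozen base weight.
        return self.A @ self.B


class PreferenceRoutedLinear(nn.Module):
    """Frozen base linear layer plus a preference-weighted mixture of LoRA experts."""

    def __init__(self, base: nn.Linear, num_objectives: int, rank: int = 8):
        super().__init__()
        self.base = base.requires_grad_(False)
        self.experts = nn.ModuleList(
            LoRAExpert(base.in_features, base.out_features, rank)
            for _ in range(num_objectives)
        )
        # Hypothetical preference routing: maps a user preference vector
        # (one entry per objective) to mixture weights over the experts.
        self.router = nn.Linear(num_objectives, num_objectives, bias=False)

    def forward(self, x: torch.Tensor, preference: torch.Tensor) -> torch.Tensor:
        gate = torch.softmax(self.router(preference), dim=-1)  # (num_objectives,)
        delta = sum(g * e.delta() for g, e in zip(gate, self.experts))
        return self.base(x) + x @ delta


# Usage: steer one layer toward a 70% "helpfulness" / 30% "harmlessness" preference at inference time.
layer = PreferenceRoutedLinear(nn.Linear(16, 16), num_objectives=2)
out = layer(torch.randn(4, 16), preference=torch.tensor([0.7, 0.3]))
```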
Related papers
- Preference-based Multi-Objective Reinforcement Learning [5.031225669460861]
This paper introduces preference-based MORL (Pb-MORL), which formalizes the integration of preferences into the MORL framework. To guide policy optimization using preferences, our method constructs a multi-objective reward model that aligns with the given preferences. Experiments in benchmark multi-objective tasks, a multi-energy management task, and an autonomous driving task on a multi-line highway show that our method performs competitively.
arXiv Detail & Related papers (2025-07-18T16:43:04Z)
- Interpretability by Design for Efficient Multi-Objective Reinforcement Learning [0.5524804393257919]
Multi-objective reinforcement learning (MORL) aims at optimising several, often conflicting goals in order to improve flexibility and reliability of RL in practical tasks. This can be achieved by finding diverse policies that are optimal for some objective preferences and non-dominated by optimal policies for other preferences so that they form a Pareto front in the multi-objective performance space.
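For reference, the Pareto-front criterion mentioned above reduces to a short dominance check. The snippet below is a generic sketch (not tied to this paper's algorithm), assuming every objective is to be maximized.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a dominates b if it is at least as good on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep only the non-dominated points (higher is better on every objective)."""
    return [p for p in points if not any(dominates(q, p) for q in points if q is not p)]


# Three policies scored on (helpfulness, harmlessness): the middle one is dominated.
print(pareto_front([(0.9, 0.2), (0.5, 0.1), (0.3, 0.8)]))
# -> [(0.9, 0.2), (0.3, 0.8)]
```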
arXiv Detail & Related papers (2025-06-04T14:52:18Z)
- Latent Preference Coding: Aligning Large Language Models via Discrete Latent Codes [54.93980123979578]
We introduce Latent Preference Coding (LPC), a novel framework that models the implicit factors, as well as their combinations, behind holistic preferences. LPC seamlessly integrates with various offline alignment algorithms, automatically inferring the underlying factors and their importance from data.
arXiv Detail & Related papers (2025-05-08T06:59:06Z)
- ParetoHqD: Fast Offline Multiobjective Alignment of Large Language Models using Pareto High-quality Data [20.42976162135529]
Multiobjective alignment algorithms have shown strong performance and efficiency. However, inappropriate preference representations and training with imbalanced reward scores limit the performance of such algorithms.
arXiv Detail & Related papers (2025-04-23T11:35:57Z)
- UC-MOA: Utility-Conditioned Multi-Objective Alignment for Distributional Pareto-Optimality [52.49062565901046]
Reinforcement Learning from Human Feedback (RLHF) has become a cornerstone for aligning large language models with human values. Existing approaches struggle to capture the multi-dimensional, distributional nuances of human preferences. We introduce Utility-Conditioned Multi-Objective Alignment (UC-MOA), a novel framework that overcomes these limitations.
arXiv Detail & Related papers (2025-03-10T09:52:42Z)
- Symbolic Mixture-of-Experts: Adaptive Skill-based Routing for Heterogeneous Reasoning [76.10639521319382]
We propose Symbolic-MoE, a symbolic, text-based, and gradient-free Mixture-of-Experts framework. We show that Symbolic-MoE's instance-level expert selection improves performance by a large margin but, when implemented naively, can introduce a high computational overhead.
arXiv Detail & Related papers (2025-03-07T18:03:13Z)
- Robust Multi-Objective Preference Alignment with Online DPO [6.434799451791957]
Multi-objective preference alignment is critical for developing AI systems that are personalizable, helpful, and safe. Existing approaches are either computationally expensive to train or do not sufficiently steer model behaviors. This paper introduces the Multi-Objective Online DPO algorithm, designed to robustly and efficiently align model behaviors with multiple, potentially conflicting human preferences.
arXiv Detail & Related papers (2025-03-01T02:01:49Z)
- MoSH: Modeling Multi-Objective Tradeoffs with Soft and Hard Bounds [29.347695311801864]
We introduce a novel conceptual framework that operationalizes soft-hard functions (SHFs), which allow the decision maker (DM) to intuitively impose soft and hard bounds on each objective. We show that many practical problems fit within the SHF framework and provide extensive empirical validation on diverse domains. Specifically, for brachytherapy, our approach returns a compact set of points with over 3% greater SHF-defined utility than the next best approach.
arXiv Detail & Related papers (2024-12-09T02:32:20Z)
- MetaAlign: Align Large Language Models with Diverse Preferences during Inference Time [50.41806216615488]
Large Language Models (LLMs) acquire extensive knowledge and remarkable abilities from large text corpora.
To make LLMs more usable, aligning them with human preferences is essential.
We propose an effective method, MetaAlign, which aims to help LLMs dynamically align with various explicit or implicit preferences specified at inference time.
arXiv Detail & Related papers (2024-10-18T05:31:13Z)
- Towards Efficient Pareto Set Approximation via Mixture of Experts Based Model Fusion [53.33473557562837]
Solving multi-objective optimization problems for large deep neural networks is a challenging task due to the complexity of the loss landscape and the high computational cost.
We propose a practical and scalable approach to solve this problem via mixture of experts (MoE) based model fusion.
By ensembling the weights of specialized single-task models, the MoE module can effectively capture the trade-offs between multiple objectives.
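A toy sketch of the weight-ensembling idea (illustrative only; the paper builds a MoE module rather than a single static merge): given checkpoints specialized for individual objectives, a preference vector can linearly interpolate their parameters into one merged model. The function and variable names are placeholders.

```python
import copy
from typing import Dict, Sequence

import torch
import torch.nn as nn


def merge_by_preference(models: Sequence[nn.Module], weights: Sequence[float]) -> nn.Module:
    """Linearly interpolate the parameters of single-task models with preference weights."""
    assert abs(sum(weights) - 1.0) < 1e-6, "preference weights should sum to 1"
    merged = copy.deepcopy(models[0])
    state: Dict[str, torch.Tensor] = {
        name: sum(w * m.state_dict()[name] for w, m in zip(weights, models))
        for name in merged.state_dict()
    }
    merged.load_state_dict(state)
    return merged


# Example: blend a "summarization quality" model and a "faithfulness" model 60/40.
m1, m2 = nn.Linear(8, 8), nn.Linear(8, 8)
blended = merge_by_preference([m1, m2], weights=[0.6, 0.4])
```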
arXiv Detail & Related papers (2024-06-14T07:16:18Z)
- T-REX: Mixture-of-Rank-One-Experts with Semantic-aware Intuition for Multi-task Large Language Model Finetuning [31.276142111455847]
Large language models (LLMs) encounter significant adaptation challenges in diverse multitask finetuning. We design a novel framework, mixTure-of-Rank-onE-eXperts (T-REX). Rank-1 experts enable a mix-and-match mechanism to quadratically expand the vector subspace of experts with linear parameter overheads, achieving approximate error reduction with optimal ...
arXiv Detail & Related papers (2024-04-13T12:14:58Z)
- Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization [76.09576643028362]
We present Multi-Objective Direct Preference Optimization (MODPO) for multiple alignment objectives.
MODPO folds language modeling directly into reward modeling, training language models as implicit collective reward models.
It theoretically yields the same optimal solutions as MORLHF but is practically more stable and efficient.
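To make "implicit collective reward model" more concrete, here is a heavily simplified sketch in the spirit of MODPO (the exact objective and weighting are given in the paper): a DPO-style preference loss on one objective, offset by the preference-weighted reward margins of the remaining objectives. All names and the toy numbers are placeholders.

```python
import torch
import torch.nn.functional as F


def modpo_style_loss(
    logp_chosen: torch.Tensor,    # log pi(y_w|x) - log pi_ref(y_w|x), shape (batch,)
    logp_rejected: torch.Tensor,  # log pi(y_l|x) - log pi_ref(y_l|x), shape (batch,)
    margin_rewards: torch.Tensor, # r_k(x, y_w) - r_k(x, y_l) for the other objectives, (batch, k)
    pref: torch.Tensor,           # preference weights for the other objectives, shape (k,)
    beta: float = 0.1,
) -> torch.Tensor:
    """DPO-style loss where other objectives' reward margins offset the implicit reward margin."""
    implicit_margin = beta * (logp_chosen - logp_rejected)  # implicit reward difference
    explicit_margin = margin_rewards @ pref                  # weighted margin from the other rewards
    return -F.logsigmoid(implicit_margin - explicit_margin).mean()


# Toy batch of 3 preference pairs with 2 auxiliary reward objectives.
loss = modpo_style_loss(
    logp_chosen=torch.tensor([0.2, 0.5, 0.1], requires_grad=True),
    logp_rejected=torch.tensor([-0.1, 0.3, 0.0]),
    margin_rewards=torch.randn(3, 2),
    pref=torch.tensor([0.3, 0.2]),
)
loss.backward()  # in a real setup this would propagate into the policy's log-probabilities
```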
arXiv Detail & Related papers (2023-10-05T17:35:26Z)
- Pareto Manifold Learning: Tackling multiple tasks via ensembles of single-task models [50.33956216274694]
In Multi-Task Learning (MTL), tasks may compete and limit the performance achieved on each other, rather than guiding the optimization to a solution.
We propose Pareto Manifold Learning, an ensembling method in weight space.
arXiv Detail & Related papers (2022-10-18T11:20:54Z)
- Exact Pareto Optimal Search for Multi-Task Learning and Multi-Criteria Decision-Making [10.914300987810128]
We show that EPO Search converges to an EPO solution at a linear rate.
We develop new algorithms: PESA-EPO for approximating the Pareto front (PF) in a posteriori MCDM, and GP-EPO for elicitation in interactive MCDM.
EPO Search scales linearly with the number of variables, which enables its use for deep e-commerce networks.
arXiv Detail & Related papers (2021-08-02T02:13:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.