Online Mixture of Experts: No-Regret Learning for Optimal Collective Decision-Making
- URL: http://arxiv.org/abs/2510.21788v1
- Date: Sun, 19 Oct 2025 11:00:10 GMT
- Title: Online Mixture of Experts: No-Regret Learning for Optimal Collective Decision-Making
- Authors: Larkin Liu, Jalal Etesami
- Abstract summary: We explore the use of expert-guided bandit learning, which we refer to as online mixture-of-experts (OMoE). The first algorithm combines aggregate voting with UCB-driven successive elimination, efficiently pruning suboptimal exploration actions. The second algorithm employs an online weighted-majority-voting mechanism, weighting each expert's vote in proportion to its predictive power.
- Score: 5.4873584968147275
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We explore the use of expert-guided bandit learning, which we refer to as online mixture-of-experts (OMoE). In this setting, given a context, a candidate committee of experts must determine how to aggregate their outputs to achieve optimal aggregate accuracy. We propose two algorithms for this problem. The first combines aggregate voting with UCB-driven successive elimination, efficiently pruning suboptimal exploration actions. The second employs an online weighted-majority-voting mechanism, weighting each expert's vote in proportion to its predictive power. We derive regret guarantees in the bandit setting under ideal conditions, and provide corresponding empirical results. As a modern application, these methods are applied to the online fine-tuning of a set of expert large language models (LLMs): after each response, the generative LLM dynamically reweighs its set of experts and/or selects the optimal committee of experts to generate the most accurate response. Our results introduce new methodologies and no-regret guarantees for combining multiple experts to improve the overall performance of the aggregate model.
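The two algorithms described in the abstract map onto well-known online-learning primitives. Below is a minimal, hypothetical Python sketch of both ideas: a multiplicative-weights voter for the weighted-majority mechanism, and a Hoeffding-style successive-elimination loop over candidate committees. The update rules, confidence bounds, and hyperparameters here are standard textbook choices, assumed for illustration; they are not the paper's exact OMoE algorithms.

```python
import math


class WeightedMajorityVoter:
    """Algorithm 2 (sketch): online weighted-majority voting, with each
    expert's voting power tracking its empirical predictive accuracy."""

    def __init__(self, n_experts: int, eta: float = 0.5):
        self.weights = [1.0] * n_experts
        self.eta = eta  # learning rate; an assumed hyperparameter

    def vote(self, predictions):
        # Sum the weight behind each candidate answer; return the argmax.
        totals = {}
        for w, p in zip(self.weights, predictions):
            totals[p] = totals.get(p, 0.0) + w
        return max(totals, key=totals.get)

    def update(self, predictions, truth):
        # Exponentially down-weight the experts that erred this round.
        for i, p in enumerate(predictions):
            if p != truth:
                self.weights[i] *= math.exp(-self.eta)


def successive_elimination(committees, pull, horizon: int, delta: float = 0.05):
    """Algorithm 1 (sketch): treat each candidate committee as a bandit arm
    and eliminate a committee once its UCB falls below the best LCB.
    `pull(c)` is assumed to return a reward in [0, 1] (e.g. answer accuracy)."""
    active = list(committees)
    stats = {c: [0.0, 0] for c in active}  # committee -> [reward sum, pulls]

    def bounds(c, t):
        s, n = stats[c]
        # Hoeffding-style confidence radius with a crude union bound.
        radius = math.sqrt(math.log(4 * len(stats) * t * t / delta) / (2 * n))
        return s / n - radius, s / n + radius  # (LCB, UCB)

    for t in range(1, horizon + 1):
        for c in active:
            stats[c][0] += pull(c)
            stats[c][1] += 1
        best_lcb = max(bounds(c, t)[0] for c in active)
        active = [c for c in active if bounds(c, t)[1] >= best_lcb]
        if len(active) == 1:
            break
    return max(active, key=lambda c: stats[c][0] / stats[c][1])
```

For instance, with `committees = [(0, 1), (1, 2), (0, 2)]` (tuples of expert indices) and a `pull` function that queries the corresponding committee and scores its aggregated answer, the elimination loop converges to the committee with the highest mean accuracy.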
Related papers
- Imitation Learning for Combinatorial Optimisation under Uncertainty [1.0781866671930855]
This paper introduces a systematic taxonomy of experts for imitation learning (IL) in combinatorial optimisation under uncertainty. Experts are classified along three dimensions: (i) their treatment of uncertainty, including myopic, deterministic, full-information, two-stage, and multi-stage formulations; (ii) their level of optimality, distinguishing task-optimal and approximate experts; and (iii) their interaction mode with the learner, ranging from one-shot supervision to iterative, interactive schemes.
arXiv Detail & Related papers (2026-01-08T21:16:25Z)
- Beyond Majority Voting: LLM Aggregation by Leveraging Higher-Order Information [57.397381631496906]
We develop two new aggregation algorithms called Optimal Weight (OW) and Inverse Surprising Popularity (ISP). Our theoretical analysis shows these methods provably mitigate inherent limitations of majority voting under mild assumptions. We empirically validate our algorithms on synthetic datasets, popular LLM fine-tuning benchmarks such as UltraFeedback and MMLU, and a real-world healthcare setting, ARMMAN.
arXiv Detail & Related papers (2025-10-01T22:21:50Z)
- Conformal Set-based Human-AI Complementarity with Multiple Experts [1.1510009152620668]
This study focuses on the selection of instance-specific experts from a pool of multiple human experts. We introduce a greedy algorithm that utilizes conformal sets to identify the subset of expert predictions that will be used in classifying an instance.
arXiv Detail & Related papers (2025-08-09T14:17:51Z)
- ExpertRAG: Efficient RAG with Mixture of Experts -- Optimizing Context Retrieval for Adaptive LLM Responses [0.0]
ExpertRAG is a novel theoretical framework that integrates Mixture-of-Experts (MoE) architectures with Retrieval Augmented Generation (RAG). We propose a dynamic retrieval gating mechanism coupled with expert routing, enabling the model to selectively consult an external knowledge store or rely on specialized internal experts. We derive formulae to quantify the expected computational cost savings from selective retrieval and the capacity gains from sparse expert utilization.
arXiv Detail & Related papers (2025-03-23T17:26:23Z)
- Evaluating Ensemble Methods for News Recommender Systems [50.90330146667386]
This paper demonstrates how ensemble methods can be used to combine many diverse state-of-the-art algorithms to achieve superior results on the Microsoft News dataset (MIND).
Our findings demonstrate that a combination of NRS algorithms can outperform individual algorithms, provided that the base learners are sufficiently diverse.
arXiv Detail & Related papers (2024-06-23T13:40:50Z)
- Inverse Reinforcement Learning with Sub-optimal Experts [56.553106680769474]
We study the theoretical properties of the class of reward functions that are compatible with a given set of experts.
Our results show that the presence of multiple sub-optimal experts can significantly shrink the set of compatible rewards.
We analyze a uniform sampling algorithm and show that it is minimax optimal whenever the sub-optimal experts' performance level is sufficiently close to that of the optimal agent.
arXiv Detail & Related papers (2024-01-08T12:39:25Z)
- Expert-guided Bayesian Optimisation for Human-in-the-loop Experimental Design of Known Systems [0.0]
We apply high-throughput (batch) Bayesian optimisation alongside anthropological decision theory to enable domain experts to influence the selection of optimal experiments.
Our methodology exploits the hypothesis that humans are better at making discrete choices than continuous ones and enables experts to influence critical early decisions.
arXiv Detail & Related papers (2023-12-05T16:09:31Z)
- Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy [84.11508381847929]
Sparsely activated Mixture-of-Experts (SMoE) has shown promise to scale up the learning capacity of neural networks.
We propose M-SMoE, which leverages routing statistics to guide expert merging.
Our MC-SMoE achieves up to an 80% memory reduction and a 20% FLOPs reduction, with virtually no loss in performance.
arXiv Detail & Related papers (2023-10-02T16:51:32Z)
- Active Ranking of Experts Based on their Performances in Many Tasks [72.96112117037465]
We consider the problem of ranking n experts based on their performances on d tasks.
We make a monotonicity assumption stating that for each pair of experts, one outperforms the other on all tasks.
arXiv Detail & Related papers (2023-06-05T06:55:39Z)
- Optimal Tracking in Prediction with Expert Advice [0.0]
We study the prediction with expert advice setting, where the aim is to produce a decision by combining the decisions generated by a set of experts.
We achieve the min-max optimal dynamic regret under the prediction with expert advice setting.
Our algorithms are the first to produce such universally optimal, adaptive and truly online guarantees with no prior knowledge.
arXiv Detail & Related papers (2022-08-07T12:29:54Z)
- Exploration in two-stage recommender systems [79.50534282841618]
Two-stage recommender systems are widely adopted in industry due to their scalability and maintainability.
A key challenge of this setup is that optimal performance of each stage in isolation does not imply optimal global performance.
We propose a method of synchronising the exploration strategies between the ranker and the nominators.
arXiv Detail & Related papers (2020-09-01T16:52:51Z)