Related papers: Mixture-of-Models: Unifying Heterogeneous Agents via N-Way Self-Evaluating Deliberation

Mixture-of-Models: Unifying Heterogeneous Agents via N-Way Self-Evaluating Deliberation

URL: http://arxiv.org/abs/2601.16863v1
Date: Fri, 23 Jan 2026 16:11:54 GMT
Title: Mixture-of-Models: Unifying Heterogeneous Agents via N-Way Self-Evaluating Deliberation
Authors: Tims Pecerskis, Aivars Smirnovs,
Abstract summary: This paper introduces the N-Way Self-Evaluating Deliberation (NSED) protocol, a Mixture-of-Models (MoM) architecture that constructs emergent composite models from a plurality of distinct expert agents.<n>Unlike traditional Mixture-of-Experts (MoE) which rely on static gating networks, NSED employs a Dynamic Expertise Broker that treats model selection as a variation of the Knapsack Problem.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This paper introduces the N-Way Self-Evaluating Deliberation (NSED) protocol, a Runtime Mixture-of-Models (MoM) architecture that constructs emergent composite models from a plurality of distinct expert agents. Unlike traditional Mixture-of-Experts (MoE) which rely on static gating networks, NSED employs a Dynamic Expertise Broker - a runtime optimization engine that treats model selection as a variation of the Knapsack Problem, binding heterogeneous checkpoints to functional roles based on live telemetry and cost constraints. At the execution layer, we formalize deliberation as a Macro-Scale Recurrent Neural Network (RNN), where the consensus state loops back through a semantic forget gate to enable iterative refinement without proportional VRAM scaling. Key components include an orchestration fabric for trustless N-to-N peer review, a Quadratic Voting activation function for non-linear consensus, and a feedback-driven state update. Empirical validation on challenging benchmarks (AIME 2025, LiveCodeBench) demonstrates that this topology allows ensembles of small (less than 20B) consumer-grade models to match or exceed the performance of state-of-the-art 100B+ parameter models, establishing a new hardware arbitrage efficiency frontier. Furthermore, testing on the DarkBench safety suite reveals intrinsic alignment properties, with peer-mediated correction reducing sycophancy scores below that of any individual agent.

Related papers

Beyond Length Scaling: Synergizing Breadth and Depth for Generative Reward Models [39.290072292743226]
We introduce Mix-GRM, a framework that reconfigures raw rationales into structured B-CoT and D-CoT through a modular synthesis pipeline.<n>Experiments demonstrate that Mix-GRM establishes a new state-of-the-art across five benchmarks, surpassing leading open-source RMs by an average of 8.2%.
arXiv Detail & Related papers (2026-03-02T07:54:29Z)
CREDIT: Certified Ownership Verification of Deep Neural Networks Against Model Extraction Attacks [54.04030169323115]
We introduce CREDIT, a certified ownership verification against Model Extraction Attacks (MEAs)<n>We quantify the similarity between DNN models, propose a practical verification threshold, and provide rigorous theoretical guarantees for ownership verification based on this threshold.<n>We extensively evaluate our approach on several mainstream datasets across different domains and tasks, achieving state-of-the-art performance.
arXiv Detail & Related papers (2026-02-23T23:36:25Z)
Guided Verifier: Collaborative Multimodal Reasoning via Dynamic Process Supervision [11.159231524113764]
Reinforcement Learning (RL) has emerged as a pivotal mechanism for enhancing the complex reasoning capabilities of Multimodal Large Language Models (MLLMs)<n>In this paper, we propose the textbfGuided Verifier framework to address these structural limitations.<n>We develop a specialized data synthesis pipeline targeting multimodal hallucinations, constructing textbfCoRe dataset of process-level negatives and textbfCorrect-guide textbfReasoning trajectories to train the guided verifier.
arXiv Detail & Related papers (2026-02-04T07:38:42Z)
SoliReward: Mitigating Susceptibility to Reward Hacking and Annotation Noise in Video Generation Reward Models [53.19726629537694]
Post-training alignment of video generation models with human preferences is a critical goal.<n>Current data collection paradigms, reliant on in-prompt pairwise annotations, suffer from labeling noise.<n>We propose SoliReward, a systematic framework for video RM training.
arXiv Detail & Related papers (2025-12-17T14:28:23Z)
UniMRSeg: Unified Modality-Relax Segmentation via Hierarchical Self-Supervised Compensation [104.59740403500132]
Multi-modal image segmentation faces real-world deployment challenges from incomplete/corrupted modalities degrading performance.<n>We propose a unified modality-relax segmentation network (UniMRSeg) through hierarchical self-supervised compensation (HSSC)<n>Our approach hierarchically bridges representation gaps between complete and incomplete modalities across input, feature and output levels.
arXiv Detail & Related papers (2025-09-19T17:29:25Z)
Towards Efficient General Feature Prediction in Masked Skeleton Modeling [59.46799426434277]
We propose a novel General Feature Prediction framework (GFP) for efficient mask skeleton modeling.<n>Our key innovation is replacing conventional low-level reconstruction with high-level feature prediction that spans from local motion patterns to global semantic representations.
arXiv Detail & Related papers (2025-09-03T18:05:02Z)
HEAS: Hierarchical Evolutionary Agent Simulation Framework for Cross-Scale Modeling and Multi-Objective Search [4.807104001943257]
Hierarchical Simulation Agent (HEAS) is a Python framework that unifies layered agent-based modeling with evolutionary optimization and tournament evaluation.<n>HEAS represents models as hierarchies of lightweight processes ("streams") scheduled in deterministic layers that read and write a shared context.<n> compact API and CLI-simulate, optimize, evaluate-expose single- and multi-objective evolution.
arXiv Detail & Related papers (2025-08-21T13:35:46Z)
SCoRE: Streamlined Corpus-based Relation Extraction using Multi-Label Contrastive Learning and Bayesian kNN [0.2812395851874055]
We introduce SCoRE, a modular and cost-effective sentence-level relation extraction system.<n>SCoRE enables easy PLM switching, requires no finetuning, and adapts smoothly to diverse corpora and KGs.<n>We show that SCoRE matches or surpasses state-of-the-art methods while significantly reducing energy consumption.
arXiv Detail & Related papers (2025-07-09T14:33:07Z)
MINGLE: Mixture of Null-Space Gated Low-Rank Experts for Test-Time Continual Model Merging [29.58798660724693]
Continual model merging integrates independently fine-tuned models sequentially without access to the original training data.<n>We propose MINGLE, a novel framework for Test-Time Continual Model Merging.<n> MINGLE achieves robust generalization, significantly reduces forgetting, and consistently surpasses previous state-of-the-art methods by 7-9% on average.
arXiv Detail & Related papers (2025-05-17T07:24:22Z)
Revisiting SMoE Language Models by Evaluating Inefficiencies with Task Specific Expert Pruning [78.72226641279863]
Sparse Mixture of Expert (SMoE) models have emerged as a scalable alternative to dense models in language modeling. Our research explores task-specific model pruning to inform decisions about designing SMoE architectures. We introduce an adaptive task-aware pruning technique UNCURL to reduce the number of experts per MoE layer in an offline manner post-training.
arXiv Detail & Related papers (2024-09-02T22:35:03Z)
SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models [85.67096251281191]
We present an innovative approach to model fusion called zero-shot Sparse MIxture of Low-rank Experts (SMILE) construction. SMILE allows for the upscaling of source models into an MoE model without extra data or further training. We conduct extensive experiments across diverse scenarios, such as image classification and text generation tasks, using full fine-tuning and LoRA fine-tuning.
arXiv Detail & Related papers (2024-08-19T17:32:15Z)
Towards Stable Co-saliency Detection and Object Co-segmentation [12.979401244603661]
We present a novel model for simultaneous stable co-saliency detection (CoSOD) and object co-segmentation (CoSEG) We first propose a multi-path stable recurrent unit (MSRU), containing dummy orders mechanisms (DOM) and recurrent unit (RU) Our proposed MSRU not only helps CoSOD (CoSEG) model captures robust inter-image relations, but also reduces order-sensitivity, resulting in a more stable inference and training process.
arXiv Detail & Related papers (2022-09-25T03:58:49Z)

This list is automatically generated from the titles and abstracts of the papers in this site.