Power and Limitations of Aggregation in Compound AI Systems
- URL: http://arxiv.org/abs/2602.21556v1
- Date: Wed, 25 Feb 2026 04:23:50 GMT
- Title: Power and Limitations of Aggregation in Compound AI Systems
- Authors: Nivasini Ananthakrishnan, Meena Jagadeesan,
- Abstract summary: We investigate the power and limitations of aggregation within a stylized principal-agent framework. Our analysis uncovers three natural mechanisms -- feasibility expansion, support expansion, and binding set contraction. Our results take a step towards characterizing when compound AI systems can overcome limitations in model capabilities and in prompt engineering.
- Score: 10.867699486308197
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: When designing compound AI systems, a common approach is to query multiple copies of the same model and aggregate the responses to produce a synthesized output. Given the homogeneity of these models, this raises the question of whether aggregation unlocks access to a greater set of outputs than querying a single model. In this work, we investigate the power and limitations of aggregation within a stylized principal-agent framework. This framework models how the system designer can partially steer each agent's output through its reward function specification, but still faces limitations due to prompt engineering ability and model capabilities. Our analysis uncovers three natural mechanisms -- feasibility expansion, support expansion, and binding set contraction -- through which aggregation expands the set of outputs that are elicitable by the system designer. We prove that any aggregation operation must implement one of these mechanisms in order to be elicitability-expanding, and that strengthened versions of these mechanisms provide necessary and sufficient conditions that fully characterize elicitability-expansion. Finally, we provide an empirical illustration of our findings for LLMs deployed in a toy reference-generation task. Altogether, our results take a step towards characterizing when compound AI systems can overcome limitations in model capabilities and in prompt engineering.
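The setup the abstract describes can be sketched in a few lines: query several copies of the same model and synthesize one output from their responses. This is a minimal illustration under assumptions of my own, not the paper's framework; `toy_model`, `compound_system`, and majority-vote aggregation are hypothetical stand-ins (a real system would call an LLM API and might aggregate differently).

```python
from collections import Counter

def aggregate_majority(responses):
    """Synthesize one output from homogeneous agents by majority vote."""
    return Counter(responses).most_common(1)[0][0]

def compound_system(query_model, prompt, n_copies=5):
    """Query n copies of the same model and aggregate the responses."""
    responses = [query_model(prompt) for _ in range(n_copies)]
    return aggregate_majority(responses)

# Toy deterministic stand-in for a model call.
def toy_model(prompt):
    return "B" if len(prompt) % 2 else "A"

print(compound_system(toy_model, "rank these references"))  # prints B
```

The paper's question is precisely whether such an aggregation step can elicit outputs that no single query to `toy_model` could produce.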
Related papers
- Quantifying Model Uniqueness in Heterogeneous AI Ecosystems [1.1162481475388237]
We introduce a statistical framework for auditing model uniqueness based on In-Silico Quasi-Experimental Design. By enforcing matched interventions across models, we isolate intrinsic model identity and quantify uniqueness as the Peer-Inexpressible Residual (PIER). These results move trustworthy AI beyond explaining single models.
arXiv Detail & Related papers (2026-01-30T13:41:53Z) - Multi-Agent Constraint Factorization Reveals Latent Invariant Solution Structure [0.0]
Multi-agent systems (MAS) composed of large language models often exhibit improved problem-solving performance despite operating on identical information. We model each agent as enforcing a distinct family of validity constraints on a shared solution state, and show that a MAS implements a factorized composition of constraint-enforcement operators. We extend this result from exact constraint enforcement to soft constraints via proximal operators, and apply the formalism to contemporary text-based dialog systems.
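The "factorized composition of constraint-enforcement operators" idea can be illustrated with the simplest such operator, projection onto an interval: composing each agent's projection repeatedly drives a shared state into the intersection of all constraints. This is a toy sketch of alternating projections under my own assumptions, not the paper's formalism; `project_interval` and `compose_agents` are hypothetical names.

```python
def project_interval(x, lo, hi):
    """Simplest constraint-enforcement operator: projection onto [lo, hi]."""
    return max(lo, min(hi, x))

def compose_agents(x, constraints, rounds=50):
    """Alternating projections: each 'agent' repeatedly enforces its constraint."""
    for _ in range(rounds):
        for lo, hi in constraints:
            x = project_interval(x, lo, hi)
    return x

# Two agents with overlapping validity constraints; the composition lands
# in the intersection [3, 4] even though the start point violates both.
print(compose_agents(10.0, [(0, 4), (3, 8)]))  # prints 4.0
```

Soft constraints would replace the hard projection with a proximal operator that trades constraint violation against distance moved.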
arXiv Detail & Related papers (2026-01-21T15:23:04Z) - The Law of Multi-Model Collaboration: Scaling Limits of Model Ensembling for Large Language Models [54.51795784459866]
We propose a theoretical framework of performance scaling for multi-model collaboration. We show that multi-model systems follow a power-law scaling with respect to the total parameter count. Ensembles of heterogeneous model families achieve better performance scaling than those formed within a single model family.
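A power law in total parameter count has a simple numerical consequence: each doubling of parameters cuts loss by the same fixed factor. The form and constants below are illustrative assumptions of mine (`alpha`, `c`, and `ensemble_loss` are hypothetical), not the law fitted in the paper.

```python
def ensemble_loss(total_params, alpha=0.3, c=10.0):
    """Illustrative power law: loss decays as (total parameter count) ** -alpha."""
    return c * total_params ** (-alpha)

# Doubling total parameters shrinks loss by the constant factor 2 ** -alpha,
# whether the parameters sit in one model or are spread across an ensemble.
for n in (1e9, 2e9, 4e9):
    print(f"{n:.0e} params -> loss {ensemble_loss(n):.5f}")
```

Under such a law, the interesting question is exactly the one the paper raises: when does splitting a fixed parameter budget across heterogeneous models bend the curve favorably?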
arXiv Detail & Related papers (2025-12-29T09:55:12Z) - Understanding and Harnessing Sparsity in Unified Multimodal Models [32.09095929575726]
Large multimodal models have achieved remarkable progress in both understanding and generation. Recent efforts pursue unified multimodal models that integrate heterogeneous components to support both capabilities within a single framework. Yet, a systematic understanding of how these inefficiencies manifest across different components remains limited.
arXiv Detail & Related papers (2025-12-02T02:47:29Z) - From monoliths to modules: Decomposing transducers for efficient world modelling [74.41506965793417]
We develop a framework for decomposing complex world models represented by transducers. Our results clarify how to invert this process, deriving sub-transducers operating on distinct input-output subspaces.
arXiv Detail & Related papers (2025-12-01T20:37:43Z) - Experts are all you need: A Composable Framework for Large Language Model Inference [8.747592414164687]
Large Language Models (LLMs) have achieved state-of-the-art accuracies in a variety of natural language processing (NLP) tasks. Mixture-of-Experts (MoE) models overcome this bottleneck by decoupling model capacity from computation, activating only a subset of parameters, or "experts".
arXiv Detail & Related papers (2025-11-28T08:00:16Z) - An Integrated Fusion Framework for Ensemble Learning Leveraging Gradient Boosting and Fuzzy Rule-Based Models [59.13182819190547]
Fuzzy rule-based models excel in interpretability and have seen widespread application across diverse fields. They face challenges such as complex design specifications and scalability issues with large datasets. This paper proposes an Integrated Fusion Framework that merges the strengths of both paradigms to enhance model performance and interpretability.
arXiv Detail & Related papers (2025-11-11T10:28:23Z) - NExT-OMNI: Towards Any-to-Any Omnimodal Foundation Models with Discrete Flow Matching [64.10695425442164]
We introduce NExT-OMNI, an open-source omnimodal foundation model that achieves unified modeling through discrete flow paradigms. Trained on large-scale interleaved text, image, video, and audio data, NExT-OMNI delivers competitive performance on multimodal generation and understanding benchmarks. To advance further research, we release training details, data protocols, and open-source both the code and model checkpoints.
arXiv Detail & Related papers (2025-10-15T16:25:18Z) - Tractable Asymmetric Verification for Large Language Models via Deterministic Replicability [0.6117371161379209]
The landscape of Large Language Models (LLMs) is shifting rapidly towards dynamic, multi-agent systems. This paper proposes a verification framework that achieves tractable asymmetric effort. We show that targeted verification can be over 12 times faster than full regeneration.
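The asymmetry rests on deterministic replicability: if each generation step depends only on the previous state, a verifier can replay a handful of sampled steps instead of regenerating the whole transcript. The sketch below uses a hash chain as a stand-in for seeded generation; `deterministic_generate` and `spot_verify` are hypothetical names of mine, not the paper's protocol.

```python
import hashlib

def deterministic_generate(seed, n_steps):
    """Stand-in for seeded, replicable generation: each step hashes the last state."""
    out, state = [], str(seed)
    for _ in range(n_steps):
        state = hashlib.sha256(state.encode()).hexdigest()
        out.append(state)
    return out

def spot_verify(seed, transcript, indices):
    """Asymmetric check: replay only the sampled steps, not the whole transcript."""
    for i in indices:
        prev = str(seed) if i == 0 else transcript[i - 1]
        if transcript[i] != hashlib.sha256(prev.encode()).hexdigest():
            return False
    return True

transcript = deterministic_generate(42, 100)
print(spot_verify(42, transcript, [0, 17, 99]))  # True
transcript[50] = "tampered"
print(spot_verify(42, transcript, [50]))         # False
```

Checking three steps costs three hashes regardless of transcript length, which is where the verifier's large speedup over full regeneration comes from.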
arXiv Detail & Related papers (2025-09-14T03:30:06Z) - Assemble Your Crew: Automatic Multi-agent Communication Topology Design via Autoregressive Graph Generation [91.17994756436259]
Multi-agent systems (MAS) based on large language models (LLMs) have emerged as a powerful solution for dealing with complex problems across diverse domains. Existing approaches are fundamentally constrained by their reliance on a template graph modification paradigm with a predefined set of agents and hard-coded interaction structures. We propose ARG-Designer, a novel autoregressive model that operationalizes this paradigm by constructing the collaboration graph from scratch.
arXiv Detail & Related papers (2025-07-24T09:17:41Z) - SE-Merging: A Self-Enhanced Approach for Dynamic Model Merging [60.83635006372403]
SE-Merging is a self-enhanced model merging framework. We show that SE-Merging achieves dynamic model merging without additional training.
arXiv Detail & Related papers (2025-06-22T18:38:41Z) - HeterRec: Heterogeneous Information Transformer for Scalable Sequential Recommendation [21.435064492654494]
HeterRec is a sequential recommendation model that integrates item-side heterogeneous features. HeterRec incorporates a Heterogeneous Token Flatten Layer (HTFL) and a Hierarchical Causal Transformer Layer (HCT). Extensive experiments on both offline and online datasets show that the HeterRec model achieves superior performance.
arXiv Detail & Related papers (2025-03-03T12:23:54Z) - Attention Mechanisms Don't Learn Additive Models: Rethinking Feature Importance for Transformers [12.986126243018452]
Transformers are structurally incapable of representing the linear or additive surrogate models used for feature attribution. We introduce the Softmax-Linked Additive Log Odds Model (SLALOM), a novel surrogate model specifically designed to align with the transformer framework. We highlight SLALOM's unique efficiency-quality curve by showing that SLALOM can produce explanations with substantially higher fidelity than competing surrogate models.
arXiv Detail & Related papers (2024-05-22T11:14:00Z) - Probabilistic ML Verification via Weighted Model Integration [11.812078181471634]
Probabilistic formal verification (PFV) of machine learning models is in its infancy.
We propose a unifying framework for the PFV of ML systems based on Weighted Model Integration (WMI).
We show how successful scaling techniques in the ML verification literature can be generalized beyond their original scope.
arXiv Detail & Related papers (2024-02-07T14:24:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.