DynaSwarm: Dynamically Graph Structure Selection for LLM-based Multi-agent System
- URL: http://arxiv.org/abs/2507.23261v1
- Date: Thu, 31 Jul 2025 05:52:30 GMT
- Title: DynaSwarm: Dynamically Graph Structure Selection for LLM-based Multi-agent System
- Authors: Hui Yi Leong, Yuqing Wu
- Abstract summary: DynaSwarm is a dynamic framework that enhances multi-agent systems. It uses an actor-critic reinforcement learning mechanism to optimize graph structures. It also has a dynamic graph selector that adaptively chooses the optimal graph structure for each input sample.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Current multi-agent systems (MAS) frameworks often rely on manually designed, static collaboration graph structures, limiting adaptability and performance. To address these limitations, we propose DynaSwarm, a dynamic framework that enhances LLM-based MAS through three key innovations: (1) an actor-critic reinforcement learning (A2C) mechanism that optimizes graph structures with improved stability over prior RL methods; (2) a dynamic graph selector that adaptively chooses the optimal graph structure for each input sample via parameter-efficient LLM fine-tuning; and (3) a fine-tuned demonstration retriever that fully exploits the power of in-context learning (ICL). DynaSwarm eliminates the need for rigid, one-size-fits-all graph architectures, instead leveraging sample-specific idiosyncrasies to dynamically route queries through specialized agent networks. Extensive experiments on question answering, mathematical reasoning, and coding tasks demonstrate that DynaSwarm consistently outperforms state-of-the-art single-agent and MAS baselines across multiple LLM backbones. Our findings highlight the importance of sample-aware structural flexibility in LLM MAS designs.
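The abstract describes per-sample routing across candidate collaboration graphs without giving code. Below is a minimal, hypothetical sketch of that idea: a selector scores a small set of candidate agent graphs for each query and routes the input through the top-scoring one. All names here (AgentGraph, CANDIDATE_GRAPHS, select_graph, run_graph) are illustrative assumptions, not part of any DynaSwarm release, and the hand-coded graphs and generic score_fn stand in for the paper's learned A2C-optimized structures and fine-tuned LLM selector.

```python
# Hypothetical sketch of per-sample graph selection; not DynaSwarm's actual code.
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class AgentGraph:
    name: str
    agents: List[str]            # topological order of agents
    edges: Dict[str, List[str]]  # agent -> parent agents whose outputs it sees

# Assumed candidate structures (a critique chain and a debate); the paper
# optimizes such structures with actor-critic RL rather than hand-coding them.
CANDIDATE_GRAPHS = [
    AgentGraph("chain", ["solver", "critic", "refiner"],
               {"critic": ["solver"], "refiner": ["critic"]}),
    AgentGraph("debate", ["pro", "con", "judge"],
               {"judge": ["pro", "con"]}),
]

def select_graph(query: str,
                 score_fn: Callable[[str, AgentGraph], float]) -> AgentGraph:
    """Pick the highest-scoring graph for this query.

    In the paper the scorer is a parameter-efficiently fine-tuned LLM;
    here score_fn is any stand-in callable.
    """
    return max(CANDIDATE_GRAPHS, key=lambda g: score_fn(query, g))

def run_graph(query: str, graph: AgentGraph,
              call_agent: Callable[[str, str], str]) -> str:
    """Execute agents in order, feeding each its parents' outputs."""
    outputs: Dict[str, str] = {}
    for agent in graph.agents:
        context = "\n".join(outputs[p] for p in graph.edges.get(agent, []))
        outputs[agent] = call_agent(agent, f"{query}\n{context}".strip())
    return outputs[graph.agents[-1]]  # final agent's answer
```

With stub callables (e.g., a score_fn returning a heuristic score and a call_agent wrapping an LLM API), `run_graph(q, select_graph(q, score_fn), call_agent)` exercises the full route-then-execute path the abstract describes.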
Related papers
- ElasticMM: Efficient Multimodal LLMs Serving with Elastic Multimodal Parallelism [9.93378263858092]
Multimodal large language models (MLLMs) handle images, videos, and audio by incorporating feature extractors and projection modules. Current tightly coupled serving architectures struggle to distinguish between mixed request types. We propose Elastic Multimodal Parallelism (EMP), a new serving paradigm that elastically adapts to resource heterogeneity.
arXiv Detail & Related papers (2025-07-14T08:53:48Z)
- Towards Efficient Multi-LLM Inference: Characterization and Analysis of LLM Routing and Hierarchical Techniques [14.892995952768352]
Language Models (LMs) have excelled at tasks like text generation, summarization, and question answering. Their inference remains computationally expensive and energy intensive in settings with limited hardware, power, or bandwidth. Recent approaches have introduced multi-LLM intelligent model selection strategies that dynamically allocate computational resources based on query complexity.
arXiv Detail & Related papers (2025-06-06T23:13:08Z)
- SkipGPT: Dynamic Layer Pruning Reinvented with Token Awareness and Module Decoupling [16.742839354514512]
We introduce SkipGPT, a dynamic layer pruning framework to optimize large language models. We show that SkipGPT reduces over 40% of model parameters while matching or exceeding the performance of the original dense model.
arXiv Detail & Related papers (2025-06-04T17:26:31Z)
- Reinforcement Learning Tuning for VideoLLMs: Reward Design and Data Efficiency [56.475612147721264]
We propose a dual-reward formulation that supervises both semantic and temporal reasoning through discrete and continuous reward signals. We evaluate our approach across eight representative video understanding tasks, including VideoQA, Temporal Video Grounding, and Grounded VideoQA. Results underscore the importance of reward design and data selection in advancing reasoning-centric video understanding with MLLMs.
arXiv Detail & Related papers (2025-06-02T17:28:26Z)
- Pangu Embedded: An Efficient Dual-system LLM Reasoner with Metacognition [95.54406667705999]
Pangu Embedded is an efficient Large Language Model (LLM) reasoner developed on Ascend Neural Processing Units (NPUs). It addresses the significant computational costs and inference latency challenges prevalent in existing reasoning-optimized LLMs. It delivers rapid responses and state-of-the-art reasoning quality within a single, unified model architecture.
arXiv Detail & Related papers (2025-05-28T14:03:02Z)
- SEKI: Self-Evolution and Knowledge Inspiration based Neural Architecture Search via Large Language Models [11.670056503731905]
We introduce SEKI, a novel large language model (LLM)-based neural architecture search (NAS) method. Inspired by the chain-of-thought (CoT) paradigm in modern LLMs, SEKI operates in two key stages: self-evolution and knowledge distillation.
arXiv Detail & Related papers (2025-02-27T09:17:49Z)
- Heterogeneous Swarms: Jointly Optimizing Model Roles and Weights for Multi-LLM Systems [102.36545569092777]
We propose Heterogeneous Swarms, an algorithm to design multi-LLM systems by jointly optimizing model roles and weights. Experiments demonstrate that Heterogeneous Swarms outperforms 15 role- and/or weight-based baselines by 18.5% on average across 12 tasks.
arXiv Detail & Related papers (2025-02-06T21:27:11Z)
- Mamba-FSCIL: Dynamic Adaptation with Selective State Space Model for Few-Shot Class-Incremental Learning [115.79349923044663]
Few-shot class-incremental learning (FSCIL) aims to incrementally learn novel classes from limited examples. Existing methods face a critical dilemma: static architectures rely on a fixed parameter space to learn from data that arrive sequentially, making them prone to overfitting to the current session. In this study, we explore the potential of Selective State Space Models (SSMs) for FSCIL.
arXiv Detail & Related papers (2024-07-08T17:09:39Z)
- Flextron: Many-in-One Flexible Large Language Model [85.93260172698398]
We introduce Flextron, a network architecture and post-training model optimization framework supporting flexible model deployment.
We present a sample-efficient training method and associated routing algorithms for transforming an existing trained LLM into a Flextron model.
We demonstrate superior performance over multiple end-to-end trained variants and other state-of-the-art elastic networks, all with a single pretraining run that consumes a mere 7.63% of the tokens used in the original pretraining.
arXiv Detail & Related papers (2024-06-11T01:16:10Z)
- CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion [58.15403987979496]
CREMA is a generalizable, highly efficient, and modular modality-fusion framework for video reasoning. We propose a novel progressive multimodal fusion design supported by a lightweight fusion module and modality-sequential training strategy. We validate our method on 7 video-language reasoning tasks assisted by diverse modalities, including VideoQA and Video-Audio/3D/Touch/Thermal QA.
arXiv Detail & Related papers (2024-02-08T18:27:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.