Predicting Multi-Agent Specialization via Task Parallelizability
- URL: http://arxiv.org/abs/2503.15703v2
- Date: Wed, 17 Sep 2025 19:17:12 GMT
- Title: Predicting Multi-Agent Specialization via Task Parallelizability
- Authors: Elizabeth Mieczkowski, Ruaridh Mon-Williams, Neil Bramley, Christopher G. Lucas, Natalia Velez, Thomas L. Griffiths
- Abstract summary: We present a closed-form bound that predicts when specialization improves performance, depending on task regime and team size. We validate our model on two standard MARL benchmarks that represent opposite regimes. Three follow-up experiments in Overcooked-AI demonstrate that the model works in environments with more complex spatial and resource bottlenecks.
- Score: 8.465921582175426
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: When should we encourage specialization in multi-agent systems versus train generalists that perform the entire task independently? We propose that specialization largely depends on task parallelizability: the potential for multiple agents to execute task components concurrently. Drawing inspiration from Amdahl's Law in distributed systems, we present a closed-form bound that predicts when specialization improves performance, depending only on task concurrency and team size. We validate our model on two standard MARL benchmarks that represent opposite regimes -- StarCraft Multi-Agent Challenge (SMAC, unlimited concurrency) and Multi-Particle Environment (MPE, unit-capacity bottlenecks) -- and observe close alignment between the bound at each extreme and an empirical measure of specialization. Three follow-up experiments in Overcooked-AI demonstrate that the model works in environments with more complex spatial and resource bottlenecks that allow for a range of strategies. Beyond prediction, the bound also serves as a diagnostic tool, highlighting biases in MARL training algorithms that cause sub-optimal convergence to specialist strategies with larger state spaces.
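The abstract does not spell out the bound itself, but it says the model draws on Amdahl's Law and depends only on task concurrency and team size. As an illustration only (not the paper's exact closed form), the classic Amdahl speedup can be sketched as:

```python
def amdahl_speedup(parallel_fraction: float, team_size: int) -> float:
    """Amdahl's Law: maximum speedup achievable when only a fraction of
    the task can be executed concurrently by `team_size` agents."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / team_size)

# A fully serial task gains nothing from extra agents...
print(amdahl_speedup(0.0, 8))            # 1.0
# ...while a fully parallelizable task approaches linear speedup.
print(amdahl_speedup(1.0, 8))            # 8.0
# A 90%-parallelizable task with 8 agents sits well below 8x.
print(round(amdahl_speedup(0.9, 8), 2))  # 4.71
```

The intuition matches the two benchmark regimes above: SMAC (unlimited concurrency) behaves like a high `parallel_fraction`, while MPE's unit-capacity bottlenecks push the effective `parallel_fraction` toward zero, where adding specialists yields little benefit.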
Related papers
- TSEmbed: Unlocking Task Scaling in Universal Multimodal Embeddings [26.532942920392376]
TSEmbed is a universal multimodal embedding framework that combines Mixture-of-Experts (MoE) with Low-Rank Adaptation (LoRA). We introduce Expert-Aware Negative Sampling (EANS), a novel strategy that leverages expert routing distributions as an intrinsic proxy for semantic similarity.
arXiv Detail & Related papers (2026-03-05T03:43:52Z) - SMES: Towards Scalable Multi-Task Recommendation via Expert Sparsity [47.79376327982703]
Industrial recommender systems rely on multi-task learning to estimate diverse user feedback signals and aggregate them for ranking. Recent advances in model scaling have shown promising gains in recommendation. This mismatch between uniform parameter scaling and heterogeneous task capacity demands poses a fundamental challenge for scalable multi-task recommendation.
arXiv Detail & Related papers (2026-02-10T03:56:12Z) - MARTI-MARS$^2$: Scaling Multi-Agent Self-Search via Reinforcement Learning for Code Generation [64.2621682259008]
We propose a Multi-Agent Reinforced Training and Inference Framework with Self-Search Scaling (MARTI-MARS2) to integrate policy learning with multi-agent tree search. We show that MARTI-MARS2 achieves 77.7%, outperforming strong baselines like GPT-5.1 on challenging code generation benchmarks.
arXiv Detail & Related papers (2026-02-08T07:28:44Z) - OFA-MAS: One-for-All Multi-Agent System Topology Design based on Mixture-of-Experts Graph Generative Models [57.94189874119267]
Multi-Agent Systems (MAS) offer a powerful paradigm for solving complex problems. Current graph learning-based design methodologies often adhere to a "one-for-one" paradigm. We propose OFA-TAD, a one-for-all framework that generates adaptive collaboration graphs for any task described in natural language.
arXiv Detail & Related papers (2026-01-19T12:23:44Z) - Parameter Aware Mamba Model for Multi-task Dense Prediction [69.94454603308196]
We introduce a novel decoder-based framework, the Parameter Aware Mamba Model (PAMM), specifically designed for dense prediction in multi-task learning settings. It features dual state space parameter experts that integrate and set task-specific parameter priors, capturing the intrinsic properties of each task. We employ the Multi-Directional Hilbert Scanning method to construct multi-angle feature sequences, thereby enhancing the sequence model's perceptual capabilities for 2D data.
arXiv Detail & Related papers (2025-11-18T13:48:00Z) - Multi-Agent Deep Research: Training Multi-Agent Systems with M-GRPO [24.532870400949424]
Current training methods train a unified large language model for all agents in the system. This may limit performance due to the different underlying distributions of different agents. We propose M-GRPO, a hierarchical extension of Group Relative Policy Optimization for vertical multi-agent systems.
arXiv Detail & Related papers (2025-11-17T12:06:30Z) - ThanoRA: Task Heterogeneity-Aware Multi-Task Low-Rank Adaptation [96.86211867758652]
Low-Rank Adaptation (LoRA) is widely adopted for downstream fine-tuning of foundation models. We propose ThanoRA, a Task Heterogeneity-Aware Multi-Task Low-Rank Adaptation framework.
arXiv Detail & Related papers (2025-05-24T11:01:45Z) - MisoDICE: Multi-Agent Imitation from Unlabeled Mixed-Quality Demonstrations [5.4482836906033585]
We study offline imitation learning (IL) in cooperative multi-agent settings, where demonstrations have unlabeled mixed quality. Our proposed solution is structured in two stages: trajectory labeling and multi-agent imitation learning. We introduce MisoDICE, a novel multi-agent IL algorithm that leverages these labels to learn robust policies.
arXiv Detail & Related papers (2025-05-24T08:43:42Z) - HALO: Hierarchical Autonomous Logic-Oriented Orchestration for Multi-Agent LLM Systems [1.1930434318557155]
We introduce HALO, a multi-agent collaboration framework based on a hierarchical reasoning architecture. Specifically, we incorporate a high-level planning agent for task decomposition, mid-level role-design agents for subtask-specific agent instantiation, and low-level inference agents for subtask execution. As the majority of users lack expertise in prompt engineering, we leverage an Adaptive Prompt Refinement module to transform raw queries into task-specific prompts.
arXiv Detail & Related papers (2025-05-17T04:14:03Z) - Symbolic Mixture-of-Experts: Adaptive Skill-based Routing for Heterogeneous Reasoning [76.10639521319382]
We propose Symbolic-MoE, a symbolic, text-based, and gradient-free Mixture-of-Experts framework.
We show that Symbolic-MoE's instance-level expert selection improves performance by a large margin but -- when implemented naively -- can introduce a high computational overhead.
arXiv Detail & Related papers (2025-03-07T18:03:13Z) - Towards more Contextual Agents: An extractor-Generator Optimization Framework [0.0]
Large Language Model (LLM)-based agents have demonstrated remarkable success in solving complex tasks across a wide range of general-purpose applications. However, their performance often degrades in context-specific scenarios, such as specialized industries or research domains. To address this challenge, our work introduces a systematic approach to enhance the contextual adaptability of LLM-based agents.
arXiv Detail & Related papers (2025-02-18T15:07:06Z) - From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise rewards to optimize the agent's reinforcement learning process. We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z) - CoPS: Empowering LLM Agents with Provable Cross-Task Experience Sharing [70.25689961697523]
We propose a generalizable algorithm that enhances sequential reasoning by cross-task experience sharing and selection.
Our work bridges the gap between existing sequential reasoning paradigms and validates the effectiveness of leveraging cross-task experiences.
arXiv Detail & Related papers (2024-10-22T03:59:53Z) - Inverse Reinforcement Learning with Sub-optimal Experts [56.553106680769474]
We study the theoretical properties of the class of reward functions that are compatible with a given set of experts.
Our results show that the presence of multiple sub-optimal experts can significantly shrink the set of compatible rewards.
We analyze a uniform sampling algorithm that is minimax optimal whenever the sub-optimal experts' performance level is sufficiently close to that of the optimal agent.
arXiv Detail & Related papers (2024-01-08T12:39:25Z) - Harnessing Pre-trained Generalist Agents for Software Engineering Tasks [13.733085206098258]
Deep reinforcement learning (DRL) has been successfully used for automation in complex tasks such as game testing and solving the job-shop scheduling problem.
Specialist DRL agents suffer from a lack of generalizability to other tasks and need substantial time to be developed and re-trained effectively.
Recently, DRL researchers have begun to develop generalist agents, able to learn a policy from various environments and capable of achieving performances similar to or better than specialist agents in new tasks.
arXiv Detail & Related papers (2023-12-24T18:39:58Z) - ALMA: Hierarchical Learning for Composite Multi-Agent Tasks [21.556661319375255]
We introduce ALMA, a general learning method for taking advantage of structured tasks.
ALMA simultaneously learns a high-level subtask allocation policy and low-level agent policies.
We demonstrate that ALMA learns sophisticated coordination behavior in a number of challenging environments.
arXiv Detail & Related papers (2022-05-27T19:12:23Z) - LDSA: Learning Dynamic Subtask Assignment in Cooperative Multi-Agent
Reinforcement Learning [122.47938710284784]
We propose a novel framework for learning dynamic subtask assignment (LDSA) in cooperative MARL.
To reasonably assign agents to different subtasks, we propose an ability-based subtask selection strategy.
We show that LDSA learns reasonable and effective subtask assignment for better collaboration.
arXiv Detail & Related papers (2022-05-05T10:46:16Z) - Reward Machines for Cooperative Multi-Agent Reinforcement Learning [30.84689303706561]
In cooperative multi-agent reinforcement learning, a collection of agents learns to interact in a shared environment to achieve a common goal.
We propose the use of reward machines (RM) -- Mealy machines used as structured representations of reward functions -- to encode the team's task.
The proposed novel interpretation of RMs in the multi-agent setting explicitly encodes required teammate interdependencies, allowing the team-level task to be decomposed into sub-tasks for individual agents.
arXiv Detail & Related papers (2020-07-03T23:08:14Z) - Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning [59.62721526353915]
Multi-agent settings in the real world often involve tasks with varying types and quantities of agents and non-agent entities.
Our method aims to leverage these commonalities by asking the question: "What is the expected utility of each agent when only considering a randomly selected sub-group of its observed entities?"
arXiv Detail & Related papers (2020-06-07T18:28:41Z) - Individual specialization in multi-task environments with multiagent reinforcement learners [0.0]
There is a growing interest in Multi-Agent Reinforcement Learning (MARL) as the first steps towards building general intelligent agents.
Previous results point towards conditions for increased coordination, efficiency/fairness, and common-pool resource sharing.
We further study coordination in multi-task environments where several rewarding tasks can be performed; agents thus do not necessarily need to perform well in all tasks, and under certain conditions may specialize.
arXiv Detail & Related papers (2019-12-29T15:20:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.