CoME: Empowering Channel-of-Mobile-Experts with Informative Hybrid-Capabilities Reasoning
- URL: http://arxiv.org/abs/2602.24142v1
- Date: Fri, 27 Feb 2026 16:19:45 GMT
- Title: CoME: Empowering Channel-of-Mobile-Experts with Informative Hybrid-Capabilities Reasoning
- Authors: Yuxuan Liu, Weikai Xu, Kun Huang, Changyu Chen, Jiankun Zhao, Pengzhi Gao, Wei Liu, Jian Luan, Shuo Shang, Bo Du, Ji-Rong Wen, Rui Yan
- Abstract summary: Mobile Agents can autonomously execute user instructions, which requires hybrid-capabilities reasoning. We propose Channel-of-Mobile-Experts (CoME), a novel agent architecture consisting of four distinct experts. Experiments show that CoME outperforms dense mobile agents and MoE methods on both AITZ and AMEX datasets.
- Score: 97.4254365377865
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Mobile Agents can autonomously execute user instructions, which requires hybrid-capabilities reasoning, including screen summary, subtask planning, action decision, and action function. However, existing agents struggle to achieve both decoupled enhancement and balanced integration of these capabilities. To address these challenges, we propose Channel-of-Mobile-Experts (CoME), a novel agent architecture consisting of four distinct experts, each aligned with a specific reasoning stage. Via output-oriented activation, CoME activates the corresponding expert to generate the output tokens of each reasoning stage. To empower CoME with hybrid-capabilities reasoning, we introduce a progressive training strategy: Expert-FT enables the decoupling and enhancement of the different experts' capabilities; Router-FT aligns expert activation with the different reasoning stages; CoT-FT facilitates seamless collaboration and balanced optimization across multiple capabilities. To mitigate error propagation in hybrid-capabilities reasoning, we propose InfoGain-Driven DPO (Info-DPO), which uses information gain to evaluate the contribution of each intermediate step, thereby guiding CoME toward more informative reasoning. Comprehensive experiments show that CoME outperforms dense mobile agents and MoE methods on both AITZ and AMEX datasets.
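To make the architecture concrete, below is a minimal sketch of the stage-conditioned, output-oriented activation idea described in the abstract. It is not the authors' implementation: all names (`Stage`, `ExpertFFN`, `CoMELayer`), the dimensions, and the hard top-1 routing are assumptions made for illustration.

```python
# Hypothetical sketch of CoME-style output-oriented activation (not official code).
from enum import Enum
from typing import Optional

import torch
import torch.nn as nn


class Stage(Enum):
    """The four reasoning stages named in the abstract."""
    SCREEN_SUMMARY = 0
    SUBTASK_PLANNING = 1
    ACTION_DECISION = 2
    ACTION_FUNCTION = 3


class ExpertFFN(nn.Module):
    """One expert per reasoning stage: a standard feed-forward block."""
    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


class CoMELayer(nn.Module):
    """Activates the expert matching the reasoning stage of each output token."""
    def __init__(self, d_model: int = 768, d_ff: int = 3072, n_stages: int = 4):
        super().__init__()
        self.experts = nn.ModuleList(ExpertFFN(d_model, d_ff) for _ in range(n_stages))
        self.router = nn.Linear(d_model, n_stages)  # fitted during Router-FT

    def forward(self, hidden: torch.Tensor, stage: Optional[Stage] = None) -> torch.Tensor:
        # hidden: (batch, seq, d_model)
        if stage is not None:
            # Teacher-forced routing, e.g. during Expert-FT: one expert per stage.
            return self.experts[stage.value](hidden)
        # Learned routing after Router-FT: hard top-1 expert choice per token.
        choice = self.router(hidden).argmax(dim=-1)   # (batch, seq)
        out = torch.zeros_like(hidden)
        for s, expert in enumerate(self.experts):
            mask = (choice == s).unsqueeze(-1)        # (batch, seq, 1)
            out = out + mask * expert(hidden)         # dense for clarity, not speed
        return out
```

In this reading, Expert-FT would train each `ExpertFFN` on its stage's supervision in isolation, Router-FT would fit `self.router` to dispatch tokens to the correct stage, and CoT-FT would fine-tune the whole chain end to end.

A similarly hedged reading of the Info-DPO signal: the abstract says information gain evaluates each intermediate step's contribution, which can be interpreted as the change in the model's log-likelihood of the reference final action as each step is appended. The helper below illustrates that interpretation; `log_prob` is a hypothetical callable, and how the gains enter the DPO objective is not specified in the abstract.

```python
def step_information_gain(log_prob, instruction, steps, final_action):
    """Per-step gain: log P(action | steps[:i+1]) - log P(action | steps[:i]).

    log_prob(context, target) is assumed to return the policy model's
    log-likelihood of `target` given `context`.
    """
    gains = []
    prev = log_prob(instruction, final_action)  # baseline: no intermediate steps
    context = instruction
    for step in steps:
        context = context + "\n" + step
        cur = log_prob(context, final_action)
        gains.append(cur - prev)  # positive gain => the step was informative
        prev = cur
    return gains
```

Steps with negative gain would be natural candidates for the dispreferred side of a DPO pair, which matches the abstract's goal of steering CoME toward more informative reasoning; the pairing scheme itself is a guess.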
Related papers
- Evolving Interdependent Operators with Large Language Models for Multi-Objective Combinatorial Optimization [21.70371026963599]
Multi-operator optimization in MOEAs is formulated as a Markov decision process. E2OC consistently outperforms state-of-the-art AHD and other multi-heuristic co-design frameworks.
arXiv Detail & Related papers (2026-01-25T16:31:07Z)
- Agentic AI Reasoning for Mobile Edge General Intelligence: Fundamentals, Approaches, and Directions [74.35421055079655]
Large language models (LLMs) have enabled the emergence of agentic artificial intelligence (AI) with powerful reasoning and autonomous decision-making capabilities. Mobile Edge General Intelligence (MEGI) brings real-time, privacy-preserving reasoning to the network edge. We propose a joint optimization framework for efficient LLM reasoning deployment in MEGI.
arXiv Detail & Related papers (2025-09-27T10:53:48Z)
- MoIIE: Mixture of Intra- and Inter-Modality Experts for Large Vision Language Models [52.876185634349575]
We propose to incorporate Mixture of Intra- and Inter-Modality Experts (MoIIE) into Large Vision-Language Models (LVLMs). For each token, expert routing is guided by its modality, directing tokens to their respective intra-modality experts as well as a shared pool of inter-modality experts. Our MoIIE models with 5.5B and 11.3B activated parameters match or even surpass the performance of existing advanced open-source MoE-LLM-based multi-modal models.
arXiv Detail & Related papers (2025-08-13T13:00:05Z)
- Multi-Task Dense Prediction Fine-Tuning with Mixture of Fine-Grained Experts [22.936728143586443]
Multi-task learning (MTL) for dense prediction has shown promising results but still faces challenges in balancing shared representations with task-specific specialization. We introduce a novel Fine-Grained Mixture of Experts architecture that explores MoE-based MTL models through a combination of three key innovations and fine-tuning.
arXiv Detail & Related papers (2025-07-25T08:59:30Z)
- Co-AttenDWG: Co-Attentive Dimension-Wise Gating and Expert Fusion for Multi-Modal Offensive Content Detection [0.0]
Multi-modal learning has emerged as a crucial research direction. Existing approaches often suffer from insufficient cross-modal interactions and rigid fusion strategies. We propose Co-AttenDWG, which combines co-attention with dimension-wise gating and expert fusion. We show that Co-AttenDWG achieves state-of-the-art performance and superior cross-modal alignment.
arXiv Detail & Related papers (2025-05-25T07:26:00Z)
- CoMoE: Contrastive Representation for Mixture-of-Experts in Parameter-Efficient Fine-tuning [10.215751315734018]
We propose Contrastive Representation for MoE (CoMoE) to promote modularization and specialization in MoE. Experiments on several benchmarks and in multi-task settings demonstrate that CoMoE can consistently enhance MoE's capacity and promote modularization among the experts.
arXiv Detail & Related papers (2025-05-23T06:58:44Z)
- FactorLLM: Factorizing Knowledge via Mixture of Experts for Large Language Models [50.331708897857574]
We introduce FactorLLM, a novel approach that decomposes well-trained dense FFNs into sparse sub-networks without requiring any further modifications.
FactorLLM achieves performance comparable to the source model, securing up to 85% of its performance while obtaining over a 30% increase in inference speed.
arXiv Detail & Related papers (2024-08-15T16:45:16Z)
- Modality Prompts for Arbitrary Modality Salient Object Detection [57.610000247519196]
This paper delves into the task of arbitrary modality salient object detection (AM SOD), which aims to detect salient objects from arbitrary modalities, e.g., RGB images, RGB-D images, and RGB-D-T images.
A novel modality-adaptive Transformer (MAT) is proposed to investigate two fundamental challenges of AM SOD.
arXiv Detail & Related papers (2024-05-06T11:02:02Z)
- Investigate-Consolidate-Exploit: A General Strategy for Inter-Task Agent Self-Evolution [92.84441068115517]
Investigate-Consolidate-Exploit (ICE) is a novel strategy for enhancing the adaptability and flexibility of AI agents.
ICE promotes the transfer of knowledge between tasks for genuine self-evolution.
Our experiments on the XAgent framework demonstrate ICE's effectiveness, reducing API calls by as much as 80%.
arXiv Detail & Related papers (2024-01-25T07:47:49Z)