Drift-aware Collaborative Assistance Mixture of Experts for Heterogeneous Multistream Learning
- URL: http://arxiv.org/abs/2508.01598v1
- Date: Sun, 03 Aug 2025 05:35:34 GMT
- Title: Drift-aware Collaborative Assistance Mixture of Experts for Heterogeneous Multistream Learning
- Authors: En Yu, Jie Lu, Kun Wang, Xiaoyu Yang, Guangquan Zhang
- Abstract summary: Learning from multiple data streams in real-world scenarios is fundamentally challenging due to intrinsic heterogeneity and unpredictable concept drifts. Existing methods typically assume homogeneous streams and employ static architectures with indiscriminate knowledge fusion. We propose CAMEL, a framework that assigns each stream an independent system with a dedicated feature extractor and task-specific head. Furthermore, we propose an Autonomous Expert Tuner (AET) strategy, which dynamically manages expert lifecycles in response to drift.
- Score: 31.877595633244734
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning from multiple data streams in real-world scenarios is fundamentally challenging due to intrinsic heterogeneity and unpredictable concept drifts. Existing methods typically assume homogeneous streams and employ static architectures with indiscriminate knowledge fusion, limiting generalizability in complex dynamic environments. To tackle this gap, we propose CAMEL, a dynamic Collaborative Assistance Mixture of Experts Learning framework. It addresses heterogeneity by assigning each stream an independent system with a dedicated feature extractor and task-specific head. Meanwhile, a dynamic pool of specialized private experts captures stream-specific idiosyncratic patterns. Crucially, collaboration across these heterogeneous streams is enabled by a dedicated assistance expert. This expert employs a multi-head attention mechanism to distill and integrate relevant context autonomously from all other concurrent streams. It facilitates targeted knowledge transfer while inherently mitigating negative transfer from irrelevant sources. Furthermore, we propose an Autonomous Expert Tuner (AET) strategy, which dynamically manages expert lifecycles in response to drift. It instantiates new experts for emerging concepts (freezing prior ones to prevent catastrophic forgetting) and prunes obsolete ones. This expert-level plasticity provides a robust and efficient mechanism for online model capacity adaptation. Extensive experiments demonstrate CAMEL's superior generalizability across diverse multistreams and exceptional resilience against complex concept drifts.
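To make the described architecture concrete, below is a minimal, hypothetical PyTorch sketch of per-stream systems and an attention-based assistance expert that distills context from the other concurrent streams. All module names, layer sizes, the averaging over private experts, the additive fusion of the assistance context, and the freeze-and-append drift handler are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (assumptions throughout): each stream gets its own feature extractor,
# private expert pool, and task head; a shared assistance expert attends over the
# representations of the other concurrent streams. Not the authors' implementation.
import torch
import torch.nn as nn


class StreamSystem(nn.Module):
    """Independent system for one stream: extractor + private experts + task head."""

    def __init__(self, in_dim, hid_dim, out_dim, n_private=2):
        super().__init__()
        self.extractor = nn.Sequential(nn.Linear(in_dim, hid_dim), nn.ReLU())
        # Dynamic pool of private experts; an AET-style tuner would grow/freeze/prune this.
        self.private_experts = nn.ModuleList(
            [nn.Linear(hid_dim, hid_dim) for _ in range(n_private)]
        )
        self.head = nn.Linear(hid_dim, out_dim)

    def add_expert_on_drift(self):
        # Crude stand-in for the AET idea: freeze existing experts, append a fresh one.
        for expert in self.private_experts:
            for p in expert.parameters():
                p.requires_grad_(False)
        self.private_experts.append(nn.Linear(self.head.in_features, self.head.in_features))

    def extract(self, x):
        return self.extractor(x)

    def predict(self, h, assist_ctx=None):
        z = torch.stack([e(h) for e in self.private_experts]).mean(dim=0)
        if assist_ctx is not None:
            z = z + assist_ctx  # additive fusion of the assistance context is an assumption
        return self.head(z)


class AssistanceExpert(nn.Module):
    """Distills context for one stream from the other concurrent streams via attention."""

    def __init__(self, hid_dim, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(hid_dim, n_heads, batch_first=True)

    def forward(self, query_h, other_hs):
        q = query_h.unsqueeze(1)            # (B, 1, D) target-stream features as the query
        kv = torch.stack(other_hs, dim=1)   # (B, S-1, D) features of the other streams
        ctx, _ = self.attn(q, kv, kv)       # attend over concurrent streams
        return ctx.squeeze(1)               # (B, D) distilled cross-stream context


# Illustrative forward pass over three heterogeneous streams with different input widths.
systems = [StreamSystem(d, 32, 2) for d in (8, 12, 16)]
assist = AssistanceExpert(32)
xs = [torch.randn(4, d) for d in (8, 12, 16)]
hs = [s.extract(x) for s, x in zip(systems, xs)]
ctx0 = assist(hs[0], hs[1:])                # assistance context for stream 0
y0 = systems[0].predict(hs[0], ctx0)        # logits for stream 0's task
```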
Related papers
- DriftMoE: A Mixture of Experts Approach to Handle Concept Drifts [1.2487037582320804]
This paper introduces DriftMoE, an online Mixture-of-Experts (MoE) architecture that addresses limitations through a novel co-training framework. DriftMoE features a compact neural router that is co-trained alongside a pool of incremental Hoeffding tree experts. We evaluate DriftMoE's performance across nine state-of-the-art data stream learning benchmarks spanning abrupt, gradual, and real-world drifts.
arXiv Detail & Related papers (2025-07-24T14:39:20Z)
- FindRec: Stein-Guided Entropic Flow for Multi-Modal Sequential Recommendation [50.438552588818]
We propose FindRec (Flexible unified information disentanglement for multi-modal sequential Recommendation). A Stein kernel-based Integrated Information Coordination Module (IICM) theoretically guarantees distribution consistency between multimodal features and ID streams. A cross-modal expert routing mechanism adaptively filters and combines multimodal features based on their contextual relevance.
arXiv Detail & Related papers (2025-07-07T04:09:45Z)
- Hecto: Modular Sparse Experts for Adaptive and Interpretable Reasoning [0.0]
Hecto is a lightweight MoE architecture that combines a GRU expert for temporal reasoning and an FFNN expert for static abstraction under a sparse Top-1 gating mechanism. Hecto matches or closely trails homogeneous baselines in performance despite receiving isolated input representations. Hecto establishes itself as a new benchmark for conditional computation.
arXiv Detail & Related papers (2025-06-28T15:03:43Z)
- Cooperation of Experts: Fusing Heterogeneous Information with Large Margin [11.522412489437702]
The Cooperation of Experts (CoE) framework encodes multi-typed information into unified heterogeneous multiplex networks. In this framework, dedicated encoders act as domain-specific experts, each specializing in learning distinct relational patterns in specific semantic spaces.
arXiv Detail & Related papers (2025-05-27T08:04:32Z)
- CoCoAFusE: Beyond Mixtures of Experts via Model Fusion [3.501882879116058]
CoCoAFusE builds on the philosophy behind Mixtures of Experts (MoEs). Our formulation extends that of a classical Mixture of Experts by contemplating the fusion of the experts' distributions. This new approach is showcased extensively on a suite of motivating numerical examples and a collection of real-data ones.
arXiv Detail & Related papers (2025-05-02T08:35:04Z)
- An Efficient and Mixed Heterogeneous Model for Image Restoration [71.85124734060665]
Current mainstream approaches are based on three architectural paradigms: CNNs, Transformers, and Mambas. We propose RestorMixer, an efficient and general-purpose IR model based on mixed-architecture fusion.
arXiv Detail & Related papers (2025-04-15T08:19:12Z)
- Deep Reinforcement Learning with Hybrid Intrinsic Reward Model [50.53705050673944]
Intrinsic reward shaping has emerged as a prevalent approach to solving hard-exploration and sparse-rewards environments. We introduce HIRE (Hybrid Intrinsic REward), a framework for creating hybrid intrinsic rewards through deliberate fusion strategies.
arXiv Detail & Related papers (2025-01-22T04:22:13Z)
- Complexity Experts are Task-Discriminative Learners for Any Image Restoration [80.46313715427928]
We introduce "complexity experts": flexible expert blocks with varying computational complexity and receptive fields. This preference effectively drives task-specific allocation, assigning tasks to experts with the appropriate complexity. The proposed MoCE-IR model outperforms state-of-the-art methods, affirming its efficiency and practical applicability.
arXiv Detail & Related papers (2024-11-27T15:58:07Z)
- DeepInteraction++: Multi-Modality Interaction for Autonomous Driving [80.8837864849534]
We introduce a novel modality interaction strategy that allows individual per-modality representations to be learned and maintained throughout. DeepInteraction++ is a multi-modal interaction framework characterized by a multi-modal representational interaction encoder and a multi-modal predictive interaction decoder. Experiments demonstrate the superior performance of the proposed framework on both 3D object detection and end-to-end autonomous driving tasks.
arXiv Detail & Related papers (2024-08-09T14:04:21Z)
- Flexible and Adaptable Summarization via Expertise Separation [59.26639426529827]
A proficient summarization model should exhibit both flexibility and adaptability.
We propose MoeSumm, a Mixture-of-Expert Summarization architecture.
Our model's distinct separation of general and domain-specific summarization abilities grants it notable flexibility and adaptability.
arXiv Detail & Related papers (2024-06-08T05:31:19Z)
- Generalization Error Analysis for Sparse Mixture-of-Experts: A Preliminary Study [65.11303133775857]
Mixture-of-Experts (MoE) computation amalgamates predictions from several specialized sub-models (referred to as experts).
Sparse MoE selectively engages only a limited number, or even just one expert, significantly reducing overhead while empirically preserving, and sometimes even enhancing, performance.
arXiv Detail & Related papers (2024-03-26T05:48:02Z)