Cooperation of Experts: Fusing Heterogeneous Information with Large Margin
- URL: http://arxiv.org/abs/2505.20853v2
- Date: Wed, 28 May 2025 04:31:56 GMT
- Title: Cooperation of Experts: Fusing Heterogeneous Information with Large Margin
- Authors: Shuo Wang, Shunyang Huang, Jinghui Yuan, Zhixiang Shen, Zhao Kang
- Abstract summary: The Cooperation of Experts (CoE) framework encodes multi-typed information into unified heterogeneous multiplex networks. In our framework, dedicated encoders act as domain-specific experts, each specializing in learning distinct relational patterns in specific semantic spaces.
- Score: 11.522412489437702
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Fusing heterogeneous information remains a persistent challenge in modern data analysis. While significant progress has been made, existing approaches often fail to account for the inherent heterogeneity of object patterns across different semantic spaces. To address this limitation, we propose the Cooperation of Experts (CoE) framework, which encodes multi-typed information into unified heterogeneous multiplex networks. By overcoming modality and connection differences, CoE provides a powerful and flexible model for capturing the intricate structures of real-world complex data. In our framework, dedicated encoders act as domain-specific experts, each specializing in learning distinct relational patterns in specific semantic spaces. To enhance robustness and extract complementary knowledge, these experts collaborate through a novel large margin mechanism supported by a tailored optimization strategy. Rigorous theoretical analyses guarantee the framework's feasibility and stability, while extensive experiments across diverse benchmarks demonstrate its superior performance and broad applicability. Our code is available at https://github.com/strangeAlan/CoE.
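To make the setup concrete, here is a minimal sketch, assuming linear per-space experts, uniform cooperation weights, and a multiclass hinge objective, of how expert scores might cooperate under a large-margin criterion. All names and choices below are illustrative assumptions, not CoE's actual encoders or tailored optimization strategy.

```python
# Illustrative sketch only: CoE's real encoders, fusion rule, and optimization
# strategy are specified in the paper; everything here is invented for clarity.
import numpy as np

rng = np.random.default_rng(0)

def expert_scores(x, W):
    """One 'expert': a linear scorer for its own semantic space (a stand-in
    for a dedicated encoder on one relation type of the multiplex network)."""
    return x @ W

def large_margin_loss(fused, y, margin=1.0):
    """Multiclass hinge loss: the fused score of the true class should
    exceed every other class score by at least `margin`."""
    true = fused[np.arange(len(y)), y][:, None]
    slack = np.maximum(0.0, margin + fused - true)
    slack[np.arange(len(y)), y] = 0.0          # no margin term for the true class
    return slack.sum(axis=1).mean()

# Toy setup: 3 experts (relation types), 4 classes, 16-dim features.
n, d, c, k = 32, 16, 4, 3
x = rng.normal(size=(n, d))
y = rng.integers(0, c, size=n)
experts = [rng.normal(scale=0.1, size=(d, c)) for _ in range(k)]
alpha = np.ones(k) / k                          # cooperation weights (learned in CoE)

fused = sum(a * expert_scores(x, W) for a, W in zip(alpha, experts))
print("large-margin loss:", large_margin_loss(fused, y))
```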
Related papers
- NEXT: Multi-Grained Mixture of Experts via Text-Modulation for Multi-Modal Object Re-ID [21.162847644106435]
We propose a reliable multi-modal caption generation method based on attribute confidence. We also propose NEXT, a novel ReID framework: the Multi-grained Mixture of Experts via Text-Modulation for Multi-modal Object Re-Identification.
arXiv Detail & Related papers (2025-05-26T13:52:28Z)
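As a rough illustration of text-modulated expert mixing: a caption embedding steers the gate over modality experts. The shapes, names, and gating form below are invented for the sketch, not NEXT's architecture.

```python
# Hypothetical sketch, not NEXT's actual design: text decides expert weights.
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(1)
d_feat, d_text, n_experts = 8, 6, 3
img_feats = rng.normal(size=(5, n_experts, d_feat))   # one feature per modality expert
caption = rng.normal(size=(5, d_text))                # placeholder caption embedding
W_gate = rng.normal(size=(d_text, n_experts))

gates = softmax(caption @ W_gate)                     # text-modulated gate weights
fused = np.einsum('be,bed->bd', gates, img_feats)     # weighted sum of expert features
print(fused.shape)  # (5, 8)
```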
- CoCoAFusE: Beyond Mixtures of Experts via Model Fusion [3.501882879116058]
CoCoAFusE builds on the philosophy behind Mixtures of Experts (MoEs). Our formulation extends that of a classical Mixture of Experts by considering the fusion of the experts' distributions. This new approach is showcased extensively on a suite of motivating numerical examples and on a collection of real-data ones.
arXiv Detail & Related papers (2025-05-02T08:35:04Z)
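A minimal sketch of the mixture-versus-fusion distinction, assuming Gaussian experts and precision-weighted pooling as the fusion rule; CoCoAFusE's actual fusion scheme is specified in the paper and may differ.

```python
# A mixture sums densities (keeps between-expert spread); a fusion collapses
# the experts into one sharper distribution. Illustration only.
import numpy as np

mu = np.array([1.0, 3.0])          # expert means
var = np.array([0.5, 2.0])         # expert variances
w = np.array([0.6, 0.4])           # gating weights

# Mixture: weighted mean, plus between-expert variance.
mix_mean = np.sum(w * mu)
mix_var = np.sum(w * (var + mu**2)) - mix_mean**2

# Fusion (weighted product of Gaussians): precision-weighted pooling.
prec = w / var
fus_var = 1.0 / prec.sum()
fus_mean = fus_var * np.sum(prec * mu)

print(f"mixture: mean={mix_mean:.3f} var={mix_var:.3f}")
print(f"fusion:  mean={fus_mean:.3f} var={fus_var:.3f}")
```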
- UniversalRAG: Retrieval-Augmented Generation over Corpora of Diverse Modalities and Granularities [53.76854299076118]
UniversalRAG is a novel RAG framework designed to retrieve and integrate knowledge from heterogeneous sources with diverse modalities and granularities. We propose a modality-aware routing mechanism that dynamically identifies the most appropriate modality-specific corpus and performs targeted retrieval within it. We validate UniversalRAG on 8 benchmarks spanning multiple modalities, showing its superiority over various modality-specific and unified baselines.
arXiv Detail & Related papers (2025-04-29T13:18:58Z)
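A minimal routing sketch, assuming a prototype-based router and placeholder corpora; UniversalRAG's trained router and retrievers are more elaborate.

```python
# Route the query to one corpus (by modality/granularity), then retrieve
# only inside it. Corpus names and the router are invented placeholders.
import numpy as np

rng = np.random.default_rng(2)
corpora = {
    "text_paragraphs": rng.normal(size=(100, 16)),
    "text_documents":  rng.normal(size=(40, 16)),
    "images":          rng.normal(size=(60, 16)),
    "video_clips":     rng.normal(size=(30, 16)),
}

def route_and_retrieve(query_vec, router_weights, top_k=3):
    names = list(corpora)
    scores = router_weights @ query_vec          # one score per corpus
    chosen = names[int(np.argmax(scores))]       # pick the best-matching corpus
    sims = corpora[chosen] @ query_vec           # targeted retrieval within it
    idx = np.argsort(-sims)[:top_k]
    return chosen, idx

router_weights = rng.normal(size=(len(corpora), 16))  # stand-in for a trained router
chosen, idx = route_and_retrieve(rng.normal(size=16), router_weights)
print(chosen, idx)
```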
- Convergence Rates for Softmax Gating Mixture of Experts [78.3687645289918]
Mixture of experts (MoE) has emerged as an effective framework to advance the efficiency and scalability of machine learning models. Central to the success of MoE is an adaptive softmax gating mechanism that determines the relevance of each expert to a given input and dynamically assigns experts their respective weights. We perform a convergence analysis of parameter estimation and expert estimation under MoE equipped with the standard softmax gating or its variants, including dense-to-sparse gating and hierarchical softmax gating.
arXiv Detail & Related papers (2025-03-05T06:11:24Z)
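For reference, the standard (dense) softmax gate analyzed in this line of work takes the following form, with $K$ experts $f_1,\dots,f_K$:

$$
g_i(x) = \frac{\exp(w_i^\top x + b_i)}{\sum_{j=1}^{K} \exp(w_j^\top x + b_j)},
\qquad
\hat{y}(x) = \sum_{i=1}^{K} g_i(x)\, f_i(x).
$$

The dense-to-sparse and hierarchical variants mentioned above change how the weights $g_i$ are computed, not this basic mixture structure.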
- GSSF: Generalized Structural Sparse Function for Deep Cross-modal Metric Learning [51.677086019209554]
We propose a Generalized Structural Sparse Function to capture powerful relationships across modalities for pair-wise similarity learning.
The distance metric encapsulates two term formats: diagonal and block-diagonal.
Experiments on cross-modal and two extra uni-modal retrieval tasks have validated its superiority and flexibility.
arXiv Detail & Related papers (2024-10-20T03:45:50Z)
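A hedged sketch of a distance combining diagonal and block-diagonal terms, the two formats named above; GSSF's actual parameterization is learned end-to-end and differs in detail.

```python
# Structured squared distance: per-feature diagonal weights plus a quadratic
# form per feature block. Illustration of the two term formats only.
import numpy as np

rng = np.random.default_rng(3)
d, block = 8, 4                     # feature dim, block size

def structured_distance(u, v, diag_w, block_w):
    """d(u,v) = sum_i diag_w[i]*(u_i - v_i)^2
              + sum_b (u_b - v_b)^T block_w[b] (u_b - v_b) over feature blocks."""
    diff = u - v
    d_diag = np.sum(diag_w * diff**2)
    d_block = 0.0
    for b, Wb in enumerate(block_w):
        seg = diff[b * block:(b + 1) * block]
        d_block += seg @ Wb @ seg
    return d_diag + d_block

diag_w = np.abs(rng.normal(size=d))                   # nonnegative diagonal terms
block_w = [np.eye(block) for _ in range(d // block)]  # PSD block-diagonal terms
u, v = rng.normal(size=d), rng.normal(size=d)
print(structured_distance(u, v, diag_w, block_w))
```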
- Multi-Source Knowledge Pruning for Retrieval-Augmented Generation: A Benchmark and Empirical Study [46.55831783809377]
Retrieval-augmented generation (RAG) is increasingly recognized as an effective approach to mitigating the hallucination of large language models (LLMs). We develop PruningRAG, a plug-and-play RAG framework that uses multi-granularity pruning strategies to more effectively incorporate relevant context and mitigate the negative impact of misleading information.
arXiv Detail & Related papers (2024-09-03T03:31:37Z)
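A toy sketch of pruning at two granularities, first whole documents, then sentences inside the survivors; the threshold scorers below are placeholders rather than PruningRAG's strategies.

```python
# Coarse-then-fine pruning of retrieved context by relevance score.
# Scorers and thresholds are invented for illustration.
import numpy as np

rng = np.random.default_rng(4)

def relevance(query_vec, emb):
    return emb @ query_vec

def prune(query_vec, docs, doc_tau=0.0, sent_tau=0.0):
    kept = []
    for doc in docs:
        if relevance(query_vec, doc["emb"]) < doc_tau:
            continue                                   # coarse-grained prune
        sents = [s for s in doc["sents"]
                 if relevance(query_vec, s["emb"]) >= sent_tau]  # fine-grained prune
        if sents:
            kept.append({"id": doc["id"], "sents": sents})
    return kept

docs = [{"id": i, "emb": rng.normal(size=8),
         "sents": [{"emb": rng.normal(size=8)} for _ in range(5)]}
        for i in range(3)]
print(len(prune(rng.normal(size=8), docs)))
```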
- Discovering Common Information in Multi-view Data [35.37807004353416]
We introduce an innovative and mathematically rigorous definition for computing common information from multi-view data.
We develop a novel supervised multi-view learning framework to capture both common and unique information.
arXiv Detail & Related papers (2024-06-21T10:47:06Z)
- Generalization Error Analysis for Sparse Mixture-of-Experts: A Preliminary Study [65.11303133775857]
Mixture-of-Experts (MoE) computation amalgamates predictions from several specialized sub-models (referred to as experts).
Sparse MoE selectively engages only a limited number of experts, or even just one, significantly reducing overhead while empirically preserving, and sometimes even enhancing, performance.
arXiv Detail & Related papers (2024-03-26T05:48:02Z)
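The generic sparse top-k gating mechanism, in simplified form; the paper's analysis covers this family, though the implementation details here are illustrative.

```python
# Sparse top-k gating: only the k highest-scoring experts run; their
# softmax-renormalized weights mix the outputs.
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def sparse_moe(x, experts, W_gate, k=1):
    logits = W_gate @ x                        # one gate score per expert
    top = np.argsort(-logits)[:k]              # indices of the k best experts
    gates = softmax(logits[top])               # renormalize over the selected experts
    # Only the selected experts are evaluated; the rest are skipped entirely.
    return sum(g * experts[i](x) for g, i in zip(gates, top))

rng = np.random.default_rng(5)
d, n_exp = 8, 4
Ws = [rng.normal(size=(d, d)) for _ in range(n_exp)]
experts = [lambda x, W=W: W @ x for W in Ws]
W_gate = rng.normal(size=(n_exp, d))
print(sparse_moe(rng.normal(size=d), experts, W_gate, k=2).shape)
```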
- Enhancing Human-like Multi-Modal Reasoning: A New Challenging Dataset and Comprehensive Framework [51.44863255495668]
Multimodal reasoning is a critical component in the pursuit of artificial intelligence systems that exhibit human-like intelligence.
We present the Multi-Modal Reasoning (COCO-MMR) dataset, a novel dataset that encompasses an extensive collection of open-ended questions.
We propose innovative techniques, including multi-hop cross-modal attention and sentence-level contrastive learning, to enhance the image and text encoders.
arXiv Detail & Related papers (2023-07-24T08:58:25Z)
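A minimal InfoNCE-style sketch of sentence-level contrastive learning between image and sentence embeddings; the paper's exact objective and encoders may differ.

```python
# Generic contrastive objective: matched image/sentence pairs are positives,
# all other pairs in the batch are negatives. Illustration only.
import numpy as np

def info_nce(img, txt, temperature=0.1):
    img = img / np.linalg.norm(img, axis=1, keepdims=True)
    txt = txt / np.linalg.norm(txt, axis=1, keepdims=True)
    logits = img @ txt.T / temperature           # (B, B) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))          # diagonal = matched pairs

rng = np.random.default_rng(6)
img, txt = rng.normal(size=(8, 16)), rng.normal(size=(8, 16))
print(info_nce(img, txt))
```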
- Modeling Multiple Views via Implicitly Preserving Global Consistency and Local Complementarity [61.05259660910437]
We propose a global consistency and complementarity network (CoCoNet) to learn representations from multiple views.
On the global stage, we posit that crucial knowledge is implicitly shared among views, and that enhancing the encoder to capture such knowledge can improve the discriminability of the learned representations.
Lastly, on the local stage, we propose a complementarity factor that joins cross-view discriminative knowledge and guides the encoders to learn not only view-wise discriminability but also cross-view complementary information.
arXiv Detail & Related papers (2022-09-16T09:24:00Z)
- Feature Clustering for Support Identification in Extreme Regions [5.6928413790238865]
A common characterization of extremes' dependence structure is the angular measure.
The present paper develops a novel optimization-based approach to assess the dependence structure of extremes.
arXiv Detail & Related papers (2020-08-13T15:51:53Z)
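For context, under multivariate regular variation with standardized margins, the angular measure $\Phi$ on the unit sphere $\mathbb{S}^{d-1}$ can be written, up to normalization conventions, as

$$
\Phi(B) = \lim_{t \to \infty} t\, \mathbb{P}\!\left( \lVert X \rVert > t,\ \frac{X}{\lVert X \rVert} \in B \right), \qquad B \subseteq \mathbb{S}^{d-1}.
$$

Mass of $\Phi$ concentrating on low-dimensional faces of the sphere indicates which groups of features can be large simultaneously, which is the support-identification question the paper addresses.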
This list is automatically generated from the titles and abstracts of the papers on this site.