Related papers: Mitigating Modality Quantity and Quality Imbalance in Multimodal Online Federated Learning

URL: http://arxiv.org/abs/2508.11159v1
Date: Fri, 15 Aug 2025 02:13:39 GMT
Title: Mitigating Modality Quantity and Quality Imbalance in Multimodal Online Federated Learning
Authors: Heqiang Wang, Weihong Yang, Xiaoxiong Zhong, Jia Zhou, Fangming Liu, Weizhe Zhang
Abstract summary: The Internet of Things (IoT) ecosystem produces massive volumes of multimodal data from diverse sources, including sensors, cameras, and microphones. With advances in edge intelligence, IoT devices have evolved from simple data acquisition units into computationally capable nodes, enabling localized processing of heterogeneous multimodal data. The continuous nature of data generation and the limited storage capacity of edge devices demand an online learning framework. Multimodal Online Federated Learning (MMO-FL) has emerged as a promising approach to meet these requirements. We propose the Modality Quantity and Quality Rebalanced (QQR) algorithm, a prototype learning based method designed to operate in parallel with the training process.
Score: 17.228105810116762
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The Internet of Things (IoT) ecosystem produces massive volumes of multimodal data from diverse sources, including sensors, cameras, and microphones. With advances in edge intelligence, IoT devices have evolved from simple data acquisition units into computationally capable nodes, enabling localized processing of heterogeneous multimodal data. This evolution necessitates distributed learning paradigms that can efficiently handle such data. Furthermore, the continuous nature of data generation and the limited storage capacity of edge devices demand an online learning framework. Multimodal Online Federated Learning (MMO-FL) has emerged as a promising approach to meet these requirements. However, MMO-FL faces new challenges due to the inherent instability of IoT devices, which often results in modality quantity and quality imbalance (QQI) during data collection. In this work, we systematically investigate the impact of QQI within the MMO-FL framework and present a comprehensive theoretical analysis quantifying how both types of imbalance degrade learning performance. To address these challenges, we propose the Modality Quantity and Quality Rebalanced (QQR) algorithm, a prototype learning based method designed to operate in parallel with the training process. Extensive experiments on two real-world multimodal datasets show that the proposed QQR algorithm consistently outperforms benchmarks under modality imbalance conditions with promising learning performance.

Related papers

Dissecting Multimodal In-Context Learning: Modality Asymmetries and Circuit Dynamics in modern Transformers [59.472505916020936]
We investigate how transformers learn to associate information across modalities from in-context examples. We revisit core principles of unimodal ICL in modern transformers. Mechanistic analysis shows that both settings rely on an induction-style mechanism that copies labels from matching in-context exemplars.
arXiv Detail & Related papers (2026-01-28T17:37:28Z)
SERM: Self-Evolving Relevance Model with Agent-Driven Learning from Massive Query Streams [53.78257200138774]
We propose a Self-Evolving Relevance Model approach (SERM), which comprises two complementary multi-agent modules. We evaluate SERM in a large-scale industrial setting, which serves billions of user requests daily.
arXiv Detail & Related papers (2026-01-14T14:31:16Z)
Towards Heterogeneous Quantum Federated Learning: Challenges and Solutions [47.08625631041616]
Quantum federated learning (QFL) combines quantum computing and federated learning to enable decentralized model training while maintaining data privacy. Existing QFL frameworks largely focus on homogeneity among quantum clients, and they do not account for real-world variances in quantum data distributions, encoding techniques, hardware noise levels, and computational capacity. These differences can create instability during training, slow convergence, and reduce overall model performance.
arXiv Detail & Related papers (2025-11-27T06:35:45Z)
Communication Efficient Adaptive Model-Driven Quantum Federated Learning [13.782852293291493]
Training with huge datasets and a large number of participating devices leads to bottlenecks in federated learning (FL). We introduce a model-driven quantum federated learning algorithm (mdQFL) to tackle these challenges. Our results demonstrate a nearly 50% decrease in total communication costs while maintaining or, in some cases, exceeding the accuracy of the final model.
arXiv Detail & Related papers (2025-06-05T01:48:00Z)
Multimodal Online Federated Learning with Modality Missing in Internet of Things [22.814768356671276]
The Internet of Things (IoT) ecosystem generates vast amounts of multimodal data from heterogeneous sources such as sensors, cameras, and microphones. As edge intelligence continues to evolve, IoT devices have progressed from simple data collection units to nodes capable of executing complex computational tasks. We introduce the concept of Multimodal Online Federated Learning (MMO-FL), a novel framework designed for dynamic and decentralized multimodal learning in IoT environments.
arXiv Detail & Related papers (2025-05-22T02:31:37Z)
A Low-Complexity Plug-and-Play Deep Learning Model for Massive MIMO Precoding Across Sites [5.896656636095934]
Massive MIMO (mMIMO) technology has transformed wireless communication by enhancing spectral efficiency and network capacity. This paper proposes a novel deep learning-based mMIMO precoder to tackle the complexity challenges of existing approaches.
arXiv Detail & Related papers (2025-02-12T20:02:36Z)
PAL: Prompting Analytic Learning with Missing Modality for Multi-Modal Class-Incremental Learning [42.00851701431368]
Multi-modal class-incremental learning (MMCIL) seeks to leverage multi-modal data, such as audio-visual and image-text pairs. A critical challenge remains: the issue of missing modalities during incremental learning phases. We propose PAL, a novel exemplar-free framework tailored to MMCIL under missing-modality scenarios.
arXiv Detail & Related papers (2025-01-16T08:04:04Z)
Multi-QuAD: Multi-Level Quality-Adaptive Dynamic Network for Reliable Multimodal Classification [57.08108545219043]
Existing reliable multimodal classification methods fail to provide robust estimation of data quality. A new framework for reliable classification, termed Multi-level Quality-Adaptive Dynamic multimodal network (Multi-QuAD), is proposed. Experiments conducted on four datasets demonstrate that Multi-QuAD significantly outperforms state-of-the-art methods in classification performance and reliability.
arXiv Detail & Related papers (2024-12-19T03:26:51Z)
MetaTrading: An Immersion-Aware Model Trading Framework for Vehicular Metaverse Services [94.61039892220037]
We propose an immersion-aware model trading framework that facilitates data provision for services while ensuring privacy through federated learning (FL). We design an incentive mechanism to incentivize metaverse users (MUs) to contribute high-value models under resource constraints. We develop a fully distributed dynamic reward algorithm based on deep reinforcement learning, without accessing any private information about MUs and other MSPs.
arXiv Detail & Related papers (2024-10-25T16:20:46Z)
Multimodal deep representation learning for quantum cross-platform verification [60.01590250213637]
Cross-platform verification, a critical undertaking in the realm of early-stage quantum computing, endeavors to characterize the similarity of two imperfect quantum devices executing identical algorithms. We introduce an innovative multimodal learning approach, recognizing that the formalism of data in this task embodies two distinct modalities. We devise a multimodal neural network to independently extract knowledge from these modalities, followed by a fusion operation to create a comprehensive data representation.
arXiv Detail & Related papers (2023-11-07T04:35:03Z)
Differentiable Multi-Fidelity Fusion: Efficient Learning of Physics Simulations with Neural Architecture Search and Transfer Learning [1.0024450637989093]
We propose the differentiable multi-fidelity (DMF) model, which leverages neural architecture search (NAS) to automatically search for a suitable model architecture for different problems. DMF can efficiently learn physics simulations with only a few high-fidelity training samples, and outperforms state-of-the-art methods by a significant margin.
arXiv Detail & Related papers (2023-06-12T07:18:13Z)
Adaptive Anomaly Detection for Internet of Things in Hierarchical Edge Computing: A Contextual-Bandit Approach [81.5261621619557]
We propose an adaptive anomaly detection scheme with hierarchical edge computing (HEC). We first construct multiple anomaly detection DNN models with increasing complexity, and associate each of them with a corresponding HEC layer. Then, we design an adaptive model selection scheme that is formulated as a contextual-bandit problem and solved by using a reinforcement learning policy network.
arXiv Detail & Related papers (2021-08-09T08:45:47Z)
Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems. Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for MFC. We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z)
Ternary Compression for Communication-Efficient Federated Learning [17.97683428517896]
Federated learning provides a potential solution to privacy-preserving and secure machine learning. We propose a ternary federated averaging protocol (T-FedAvg) to reduce the upstream and downstream communication of federated learning systems. Our results show that the proposed T-FedAvg is effective in reducing communication costs and can even achieve slightly better performance on non-IID data.
arXiv Detail & Related papers (2020-03-07T11:55:34Z)

This list is automatically generated from the titles and abstracts of the papers on this site.