FedMobile: Enabling Knowledge Contribution-aware Multi-modal Federated Learning with Incomplete Modalities
- URL: http://arxiv.org/abs/2502.15839v1
- Date: Thu, 20 Feb 2025 15:10:43 GMT
- Title: FedMobile: Enabling Knowledge Contribution-aware Multi-modal Federated Learning with Incomplete Modalities
- Authors: Yi Liu, Cong Wang, Xingliang Yuan
- Abstract summary: A key challenge in mobile sensing systems using multimodal FL is modality incompleteness. Current multimodal FL frameworks typically train multiple unimodal FL subsystems or apply interpolation techniques on the node side to approximate missing modalities. We present FedMobile, a new knowledge contribution-aware multimodal FL framework designed for robust learning despite missing modalities.
- Score: 15.771749047384535
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Web of Things (WoT) enhances interoperability across web-based and ubiquitous computing platforms while complementing existing IoT standards. The multimodal Federated Learning (FL) paradigm has been introduced to enhance WoT by enabling the fusion of multi-source mobile sensing data while preserving privacy. However, a key challenge in mobile sensing systems using multimodal FL is modality incompleteness, where some modalities may be unavailable or only partially captured, potentially degrading the system's performance and reliability. Current multimodal FL frameworks typically train multiple unimodal FL subsystems or apply interpolation techniques on the node side to approximate missing modalities. However, these approaches overlook the shared latent feature space among incomplete modalities across different nodes and fail to discriminate against low-quality nodes. To address this gap, we present FedMobile, a new knowledge contribution-aware multimodal FL framework designed for robust learning despite missing modalities. FedMobile prioritizes local-to-global knowledge transfer, leveraging cross-node multimodal feature information to reconstruct missing features. It also enhances system performance and resilience to modality heterogeneity through rigorous node contribution assessments and knowledge contribution-aware aggregation rules. Empirical evaluations on five widely recognized multimodal benchmark datasets demonstrate that FedMobile maintains robust learning even when up to 90% of modality information is missing or when data from two modalities are randomly missing, outperforming state-of-the-art baselines.
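The abstract's second mechanism, weighting nodes by assessed contribution, can be pictured with a minimal PyTorch sketch. The cosine-similarity scoring against a reference update direction below is our illustrative assumption, not FedMobile's actual scoring rule.

```python
from typing import List

import torch
import torch.nn.functional as F

def contribution_weights(updates: List[torch.Tensor],
                         reference: torch.Tensor) -> torch.Tensor:
    # Score each flattened client update by its agreement with a reference
    # direction (hypothetically, the previous round's global update).
    sims = torch.stack([F.cosine_similarity(u, reference, dim=0) for u in updates])
    # Clamp negative scores so low-quality nodes are effectively discarded,
    # mirroring the paper's stated goal of discriminating against such nodes.
    w = torch.clamp(sims, min=0.0)
    return w / w.sum().clamp(min=1e-8)

def aggregate(global_flat: torch.Tensor,
              updates: List[torch.Tensor],
              reference: torch.Tensor) -> torch.Tensor:
    # Contribution-aware aggregation: a weighted sum of client updates.
    w = contribution_weights(updates, reference)
    return global_flat + (w.unsqueeze(1) * torch.stack(updates)).sum(dim=0)
```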
Related papers
- NativE: Multi-modal Knowledge Graph Completion in the Wild [51.80447197290866]
We propose NativE, a comprehensive framework to achieve multi-modal knowledge graph completion (MMKGC) in the wild.
NativE proposes a relation-guided dual adaptive fusion module that enables adaptive fusion for any modalities.
We construct a new benchmark called WildKGC with five datasets to evaluate our method.
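As a rough picture of what a relation-guided adaptive fusion module might look like, here is a hedged PyTorch sketch in which a gate computed from the relation embedding adaptively weights each modality's entity embedding; the gating form and dimensions are illustrative assumptions, not NativE's actual design.

```python
import torch
import torch.nn as nn

class RelationGuidedFusion(nn.Module):
    def __init__(self, dim: int, num_modalities: int):
        super().__init__()
        # A linear gate maps the relation embedding to per-modality weights.
        self.gate = nn.Linear(dim, num_modalities)

    def forward(self, modal_embs: torch.Tensor, rel_emb: torch.Tensor) -> torch.Tensor:
        # modal_embs: (B, M, D) entity embeddings, one per modality
        # rel_emb:    (B, D) relation embedding guiding the fusion
        weights = torch.softmax(self.gate(rel_emb), dim=-1)      # (B, M)
        return (weights.unsqueeze(-1) * modal_embs).sum(dim=1)   # (B, D)
```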
arXiv Detail & Related papers (2024-03-28T03:04:00Z) - Cross-Modal Prototype based Multimodal Federated Learning under Severely Missing Modality [28.90486547668949]
Multimodal Federated Cross Prototype Learning (MFCPL) is a novel approach for multimodal federated learning (MFL) under severely missing modalities.
MFCPL provides diverse modality knowledge at the modality-shared level through cross-modal regularization and at the modality-specific level through a cross-modal contrastive mechanism.
Our approach introduces the cross-modal alignment to provide regularization for modality-specific features, thereby enhancing the overall performance.
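A minimal sketch of a prototype-based cross-modal regularizer in this spirit: each modality's embeddings are pulled toward shared class prototypes and pushed from the others via a softmax over prototype similarities. The prototype handling and temperature are assumptions, not MFCPL's exact formulation.

```python
import torch
import torch.nn.functional as F

def prototype_contrastive_loss(feats: torch.Tensor,       # (B, D) one modality's embeddings
                               labels: torch.Tensor,      # (B,) class labels
                               prototypes: torch.Tensor,  # (C, D) prototypes shared across modalities
                               temperature: float = 0.1) -> torch.Tensor:
    # Normalize so similarities are cosine-based.
    feats = F.normalize(feats, dim=1)
    protos = F.normalize(prototypes, dim=1)
    # Each embedding is attracted to its own class prototype and repelled
    # from the rest via the softmax in cross-entropy.
    logits = feats @ protos.t() / temperature  # (B, C)
    return F.cross_entropy(logits, labels)
```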
arXiv Detail & Related papers (2024-01-25T02:25:23Z) - Multimodal Informative ViT: Information Aggregation and Distribution for
Hyperspectral and LiDAR Classification [25.254816993934746]
Multimodal Informative ViT (MIViT) is a system with an innovative information aggregate-distributing mechanism.
MIViT reduces redundancy in the empirical distribution of each modality's separate and fused features.
Our results show that MIViT's bidirectional aggregate-distributing mechanism is highly effective.
arXiv Detail & Related papers (2024-01-06T09:53:33Z) - Balanced Multi-modal Federated Learning via Cross-Modal Infiltration [19.513099949266156]
Federated learning (FL) underpins advancements in privacy-preserving distributed computing.
We propose a novel Cross-Modal Infiltration Federated Learning (FedCMI) framework.
arXiv Detail & Related papers (2023-12-31T05:50:15Z) - FedMFS: Federated Multimodal Fusion Learning with Selective Modality Communication [11.254610576923204]
We propose Federated Multimodal Fusion learning with Selective modality communication (FedMFS).
The key idea is a modality selection criterion for each device, which weighs (i) the impact of the modality, gauged by Shapley value analysis, against (ii) the modality model size as a gauge for communication overhead.
Experiments on the real-world ActionSense dataset demonstrate the ability of FedMFS to achieve comparable accuracy to several baselines while reducing the communication overhead by over 4x.
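The selection criterion above can be illustrated with a toy Python function that trades an estimated per-modality utility (a stand-in for the Shapley value) against upload size; the scoring formula and the lambda trade-off are assumptions, not FedMFS's exact rule.

```python
def select_modality(shapley: dict, model_bytes: dict, lam: float = 1e-7) -> str:
    """Pick the modality whose estimated contribution best trades off
    against communication cost.

    shapley:     modality -> estimated contribution (Shapley-style score)
    model_bytes: modality -> size of that modality's model in bytes
    lam:         hypothetical trade-off coefficient
    """
    return max(shapley, key=lambda m: shapley[m] - lam * model_bytes[m])

# Example: audio contributes slightly more but is far costlier to upload,
# so the cheaper IMU modality wins under this toy scoring.
print(select_modality({"imu": 0.12, "audio": 0.15},
                      {"imu": 2_000_000, "audio": 40_000_000}))  # -> "imu"
```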
arXiv Detail & Related papers (2023-10-10T22:23:27Z) - FedMultimodal: A Benchmark For Multimodal Federated Learning [45.0258566979478]
Federated Learning (FL) has become an emerging machine learning technique to tackle data privacy challenges.
Despite significant efforts devoted to FL in fields like computer vision, audio, and natural language processing, FL applications utilizing multimodal data streams remain largely unexplored.
We introduce FedMultimodal, the first FL benchmark for multimodal learning covering five representative multimodal applications from ten commonly used datasets.
arXiv Detail & Related papers (2023-06-15T20:31:26Z) - FedWon: Triumphing Multi-domain Federated Learning Without Normalization [50.49210227068574]
Federated learning (FL) enhances data privacy with collaborative in-situ training on decentralized clients.
However, FL encounters challenges due to non-independent and identically distributed (non-i.i.d.) data.
We propose a novel method called Federated learning Without normalizations (FedWon) to address the multi-domain problem in FL.
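One common way to train competitive networks without normalization layers is to re-parameterize convolutions via weight standardization; the sketch below shows such a normalization-free block as a stand-in, not necessarily FedWon's exact layer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WSConv2d(nn.Conv2d):
    """Conv2d whose weights are standardized per output channel,
    removing the need for BatchNorm (which is problematic in FL because
    batch statistics differ across clients and domains)."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.weight
        mean = w.mean(dim=(1, 2, 3), keepdim=True)
        std = w.std(dim=(1, 2, 3), keepdim=True) + 1e-5
        return F.conv2d(x, (w - mean) / std, self.bias,
                        self.stride, self.padding, self.dilation, self.groups)

# A normalization-free block: no BatchNorm between conv and activation.
block = nn.Sequential(WSConv2d(3, 16, kernel_size=3, padding=1), nn.ReLU())
```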
arXiv Detail & Related papers (2023-06-09T13:18:50Z) - Learning Multimodal Data Augmentation in Feature Space [65.54623807628536]
LeMDA is an easy-to-use method that automatically learns to jointly augment multimodal data in feature space.
We show that LeMDA can profoundly improve the performance of multimodal deep learning architectures.
arXiv Detail & Related papers (2022-12-29T20:39:36Z) - Exploiting modality-invariant feature for robust multimodal emotion
recognition with missing modalities [76.08541852988536]
We propose to use invariant features for a missing modality imagination network (IF-MMIN).
We show that the proposed model outperforms all baselines and invariantly improves the overall emotion recognition performance under uncertain missing-modality conditions.
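A minimal sketch of the "imagination" idea: a small regressor maps the modality-invariant representation of the available modalities into the feature space of the absent one. The architecture here is an illustrative assumption, not IF-MMIN's exact network.

```python
import torch
import torch.nn as nn

class MissingModalityImaginer(nn.Module):
    def __init__(self, invariant_dim: int, target_dim: int):
        super().__init__()
        # Hypothetical two-layer regressor from invariant features to the
        # missing modality's feature space.
        self.net = nn.Sequential(
            nn.Linear(invariant_dim, 256), nn.ReLU(),
            nn.Linear(256, target_dim),
        )

    def forward(self, invariant_feat: torch.Tensor) -> torch.Tensor:
        # Returns an "imagined" feature standing in for the absent modality.
        return self.net(invariant_feat)
```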
arXiv Detail & Related papers (2022-10-27T12:16:25Z) - Bi-Bimodal Modality Fusion for Correlation-Controlled Multimodal
Sentiment Analysis [96.46952672172021]
Bi-Bimodal Fusion Network (BBFN) is a novel end-to-end network that performs fusion on pairwise modality representations.
The model takes two bimodal pairs as input due to the known information imbalance among modalities.
arXiv Detail & Related papers (2021-07-28T23:33:42Z)