FedFusion: Manifold Driven Federated Learning for Multi-satellite and
Multi-modality Fusion
- URL: http://arxiv.org/abs/2311.09540v1
- Date: Thu, 16 Nov 2023 03:29:19 GMT
- Title: FedFusion: Manifold Driven Federated Learning for Multi-satellite and
Multi-modality Fusion
- Authors: DaiXun Li, Weiying Xie, Yunsong Li, Leyuan Fang
- Abstract summary: This paper proposes FedFusion, a manifold-driven multi-modality fusion framework that randomly samples local data on each client to jointly estimate the prominent manifold structure of that client's shallow features.
Considering the physical space limitations of the satellite constellation, we developed a multimodal federated learning module designed specifically for manifold data in a deep latent space.
The proposed framework surpasses existing methods on three multimodal datasets, achieving an average classification accuracy of 94.35% while compressing communication costs by a factor of 4.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-satellite, multi-modality in-orbit fusion is a challenging task, as it explores the fusion representation of complex high-dimensional data under limited computational resources. Deep neural networks can reveal the underlying distribution of multi-modal remote sensing data, but in-orbit fusion of multimodal data is more difficult because of the limitations of different sensor imaging characteristics, especially when the multimodal data is non-independent and identically distributed (Non-IID). To address this problem while maintaining classification performance, this paper proposes FedFusion, a manifold-driven multi-modality fusion framework that randomly samples local data on each client to jointly estimate the prominent manifold structure of that client's shallow features and explicitly compresses the feature matrices into a low-rank subspace through cascading and additive approaches; the compressed features then serve as the input to the subsequent classifier. Considering the physical space limitations of the satellite constellation, we developed a multimodal federated learning module designed specifically for manifold data in a deep latent space. This module iteratively updates each client's sub-network parameters through global weighted averaging, yielding a framework that learns a compact representation for each client. The proposed framework surpasses existing methods on three multimodal datasets, achieving an average classification accuracy of 94.35% while compressing communication costs by a factor of 4. Furthermore, extensive numerical evaluations of real-world satellite images were conducted on an orbiting edge-computing architecture based on Jetson TX2 industrial modules, which demonstrated that FedFusion significantly reduced training time by 48.4 minutes (15.18%) while optimizing accuracy.
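The abstract names two core mechanisms without implementation detail. The first is the manifold-driven compression of each client's shallow features into a low-rank subspace estimated from a random sample of local data. The sketch below is a minimal illustration, not the authors' code: truncated SVD stands in for the paper's cascading and additive compression, and the function names (`estimate_subspace`, `compress_features`), the rank, and the sample size are all hypothetical choices.

```python
# Minimal sketch (not the authors' implementation): estimate a low-rank basis
# from a random sample of a client's shallow features, then project all
# features into that subspace before they feed the classifier.
import numpy as np

def estimate_subspace(features: np.ndarray, rank: int, sample_size: int,
                      rng: np.random.Generator) -> np.ndarray:
    """Estimate a rank-`rank` orthonormal basis from a random subset of rows."""
    idx = rng.choice(features.shape[0],
                     size=min(sample_size, features.shape[0]),
                     replace=False)
    sample = features[idx]                      # (sample_size, d) shallow features
    _, _, vt = np.linalg.svd(sample, full_matrices=False)
    return vt[:rank].T                          # (d, rank) basis of the subspace

def compress_features(features: np.ndarray, basis: np.ndarray) -> np.ndarray:
    """Project features into the low-rank subspace used as classifier input."""
    return features @ basis                     # (n, rank) compact representation

rng = np.random.default_rng(0)
client_feats = rng.normal(size=(1024, 256))     # toy stand-in for shallow features
basis = estimate_subspace(client_feats, rank=32, sample_size=256, rng=rng)
compact = compress_features(client_feats, basis)
print(compact.shape)                            # (1024, 32)
```

Transmitting only the rank-k basis and projected features, rather than full feature matrices, is also what makes a several-fold reduction in communication volume plausible, in the spirit of the reported 4x compression.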
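The second mechanism is the federated aggregation step: each client's sub-network parameters are iteratively combined by global weighted averaging. The following FedAvg-style sketch assumes clients are weighted by their local sample counts; the paper's exact weighting scheme may differ, and `weighted_average` is a hypothetical helper.

```python
# Minimal sketch of global weighted averaging of client sub-network parameters.
# Weighting by local data size is a standard FedAvg-style assumption, not
# necessarily FedFusion's exact scheme.
from typing import Dict, List
import numpy as np

def weighted_average(client_params: List[Dict[str, np.ndarray]],
                     client_sizes: List[int]) -> Dict[str, np.ndarray]:
    """Average each named parameter tensor, weighted by client data size."""
    total = float(sum(client_sizes))
    global_params = {}
    for name in client_params[0]:
        global_params[name] = sum(
            (size / total) * params[name]
            for params, size in zip(client_params, client_sizes)
        )
    return global_params

# Toy round: two clients share a 4x4 weight matrix and a bias vector.
rng = np.random.default_rng(1)
clients = [{"w": rng.normal(size=(4, 4)), "b": np.zeros(4)} for _ in range(2)]
aggregated = weighted_average(clients, client_sizes=[600, 400])
print(aggregated["w"].shape)  # (4, 4) -- broadcast back to clients next round
```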
Related papers
- Multimodality Helps Few-Shot 3D Point Cloud Semantic Segmentation [61.91492500828508]
Few-shot 3D point cloud segmentation (FS-PCS) aims at generalizing models to segment novel categories with minimal support samples.
We introduce a cost-free multimodal FS-PCS setup, utilizing textual labels and the potentially available 2D image modality.
We propose a simple yet effective Test-time Adaptive Cross-modal Seg (TACC) technique to mitigate training bias.
arXiv Detail & Related papers (2024-10-29T19:28:41Z)
- Uni$^2$Det: Unified and Universal Framework for Prompt-Guided Multi-dataset 3D Detection [64.08296187555095]
Uni$^2$Det is a framework for unified and universal multi-dataset training on 3D detection.
We introduce multi-stage prompting modules for multi-dataset 3D detection.
Results on zero-shot cross-dataset transfer validate the generalization capability of our proposed method.
arXiv Detail & Related papers (2024-09-30T17:57:50Z)
- Accelerated Multi-Contrast MRI Reconstruction via Frequency and Spatial Mutual Learning [50.74383395813782]
We propose a novel Frequency and Spatial Mutual Learning Network (FSMNet) to explore global dependencies across different modalities.
The proposed FSMNet achieves state-of-the-art performance for the Multi-Contrast MR Reconstruction task with different acceleration factors.
arXiv Detail & Related papers (2024-09-21T12:02:47Z)
- Hierarchical Attention and Parallel Filter Fusion Network for Multi-Source Data Classification [33.26466989592473]
We propose a hierarchical attention and parallel filter fusion network for multi-source data classification.
Our proposed method achieves 91.44% and 80.51% of overall accuracy (OA) on the respective datasets.
arXiv Detail & Related papers (2024-08-22T23:14:22Z)
- LMFNet: An Efficient Multimodal Fusion Approach for Semantic Segmentation in High-Resolution Remote Sensing [25.016421338677816]
Current methods often process only two types of data, missing out on the rich information that additional modalities can provide.
We propose a novel Lightweight Multimodal data Fusion Network (LMFNet).
LMFNet accommodates various data types simultaneously, including RGB, NirRG, and DSM, through a weight-sharing, multi-branch vision transformer.
arXiv Detail & Related papers (2024-04-21T13:29:42Z)
- Multi-view Aggregation Network for Dichotomous Image Segmentation [76.75904424539543]
Dichotomous Image Segmentation (DIS) has recently emerged as a task aiming at high-precision object segmentation from high-resolution natural images.
Existing methods rely on tedious multiple encoder-decoder streams and stages to gradually complete the global localization and local refinement.
Inspired by it, we model DIS as a multi-view object perception problem and provide a parsimonious multi-view aggregation network (MVANet)
Experiments on the popular DIS-5K dataset show that our MVANet significantly outperforms state-of-the-art methods in both accuracy and speed.
arXiv Detail & Related papers (2024-04-11T03:00:00Z)
- PoIFusion: Multi-Modal 3D Object Detection via Fusion at Points of Interest [65.48057241587398]
PoIFusion is a framework to fuse information of RGB images and LiDAR point clouds at the points of interest (PoIs)
Our approach maintains the view of each modality and obtains multi-modal features through computation-friendly projection.
We conducted extensive experiments on nuScenes and Argoverse2 datasets to evaluate our approach.
arXiv Detail & Related papers (2024-03-14T09:28:12Z)
- FedDiff: Diffusion Model Driven Federated Learning for Multi-Modal and Multi-Clients [32.59184269562571]
We propose a multi-modal collaborative diffusion federated learning framework called FedDiff.
Our framework establishes a dual-branch diffusion model feature extraction setup, where the two modal data are inputted into separate branches of the encoder.
Considering the challenge of private and efficient communication between multiple clients, we embed the diffusion model into the federated learning communication structure.
arXiv Detail & Related papers (2023-11-16T02:29:37Z)
- Efficient and Effective Deep Multi-view Subspace Clustering [9.6753782215283]
We propose a novel deep framework, termed Efficient and Effective deep Multi-View Subspace Clustering (E$^2$MVSC).
Instead of a parameterized FC layer, we design a Relation-Metric Net that decouples network parameter scale from sample numbers for greater computational efficiency.
E$^2$MVSC yields comparable results to existing methods and achieves state-of-the-art performance in various types of multi-view datasets.
arXiv Detail & Related papers (2023-10-15T03:08:25Z)
- General-Purpose Multimodal Transformer meets Remote Sensing Semantic Segmentation [35.100738362291416]
Multimodal AI seeks to exploit complementary data sources, particularly for complex tasks like semantic segmentation.
Recent trends in general-purpose multimodal networks have shown great potential to achieve state-of-the-art performance.
We propose a UNet-inspired module that employs 3D convolution to encode vital local information and learn cross-modal features simultaneously.
arXiv Detail & Related papers (2023-07-07T04:58:34Z)
- BIMS-PU: Bi-Directional and Multi-Scale Point Cloud Upsampling [60.257912103351394]
We develop a new point cloud upsampling pipeline called BIMS-PU.
We decompose the up/downsampling procedure into several up/downsampling sub-steps by breaking the target sampling factor into smaller factors.
We show that our method achieves superior results to state-of-the-art approaches.
arXiv Detail & Related papers (2022-06-25T13:13:37Z)