Spatio-Temporal Representation Decoupling and Enhancement for Federated Instrument Segmentation in Surgical Videos
- URL: http://arxiv.org/abs/2506.23759v1
- Date: Mon, 30 Jun 2025 12:08:02 GMT
- Title: Spatio-Temporal Representation Decoupling and Enhancement for Federated Instrument Segmentation in Surgical Videos
- Authors: Zheng Fang, Xiaoming Qi, Chun-Mei Feng, Jialun Pei, Weixin Si, Yueming Jin,
- Abstract summary: We propose a novel Personalized FL scheme, Spatio-Temporal Representation Decoupling and Enhancement (FedST)<n>FedST wisely leverages surgical domain knowledge during both local-site and global-server training to boost segmentation.<n>Our model embraces a Representation Separation and Cooperation (RSC) mechanism in local-site training, which decouples the query embedding layer to be trained privately, to encode respective backgrounds.
- Score: 17.212640372275466
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Surgical instrument segmentation under Federated Learning (FL) is a promising direction, which enables multiple surgical sites to collaboratively train the model without centralizing datasets. However, there exist very limited FL works in surgical data science, and FL methods for other modalities do not consider inherent characteristics in surgical domain: i) different scenarios show diverse anatomical backgrounds while highly similar instrument representation; ii) there exist surgical simulators which promote large-scale synthetic data generation with minimal efforts. In this paper, we propose a novel Personalized FL scheme, Spatio-Temporal Representation Decoupling and Enhancement (FedST), which wisely leverages surgical domain knowledge during both local-site and global-server training to boost segmentation. Concretely, our model embraces a Representation Separation and Cooperation (RSC) mechanism in local-site training, which decouples the query embedding layer to be trained privately, to encode respective backgrounds. Meanwhile, other parameters are optimized globally to capture the consistent representations of instruments, including the temporal layer to capture similar motion patterns. A textual-guided channel selection is further designed to highlight site-specific features, facilitating model adapta tion to each site. Moreover, in global-server training, we propose Synthesis-based Explicit Representation Quantification (SERQ), which defines an explicit representation target based on synthetic data to synchronize the model convergence during fusion for improving model generalization.
Related papers
- Generative Autoregressive Transformers for Model-Agnostic Federated MRI Reconstruction [5.519160766363227]
FedGAT is a model-agnostic FL technique based on generative autoregressive transformers.<n>Controllably synthesizes realistic MR images from desired sites via autoregressive prediction.<n>Each site then trains its own reconstruction model on a hybrid dataset.
arXiv Detail & Related papers (2025-02-06T21:45:16Z) - Stragglers-Aware Low-Latency Synchronous Federated Learning via Layer-Wise Model Updates [71.81037644563217]
Synchronous federated learning (FL) is a popular paradigm for collaborative edge learning.
As some of the devices may have limited computational resources and varying availability, FL latency is highly sensitive to stragglers.
We propose straggler-aware layer-wise federated learning (SALF) that leverages the optimization procedure of NNs via backpropagation to update the global model in a layer-wise fashion.
arXiv Detail & Related papers (2024-03-27T09:14:36Z) - Unifying and Personalizing Weakly-supervised Federated Medical Image
Segmentation via Adaptive Representation and Aggregation [1.121358474059223]
Federated learning (FL) enables multiple sites to collaboratively train powerful deep models without compromising data privacy and security.
Weakly supervised segmentation, which uses sparsely-grained supervision, is increasingly being paid attention to due to its great potential of reducing annotation costs.
We propose a novel personalized FL framework for medical image segmentation, named FedICRA, which uniformly leverages heterogeneous weak supervision.
arXiv Detail & Related papers (2023-04-12T06:32:08Z) - FedDM: Iterative Distribution Matching for Communication-Efficient
Federated Learning [87.08902493524556]
Federated learning(FL) has recently attracted increasing attention from academia and industry.
We propose FedDM to build the global training objective from multiple local surrogate functions.
In detail, we construct synthetic sets of data on each client to locally match the loss landscape from original data.
arXiv Detail & Related papers (2022-07-20T04:55:18Z) - One Model to Unite Them All: Personalized Federated Learning of
Multi-Contrast MRI Synthesis [5.3963856146595095]
Learning-based MRI translation involves a synthesis model that maps a source-contrast onto a target-contrast image.
Here we introduce the first personalized FL method for MRI Synthesis (pFL Synth)
pFL Synth is based on an adversarial model equipped with a mapper that produces latents specific to individual sites and source-target contrasts.
arXiv Detail & Related papers (2022-07-13T20:14:16Z) - Personalizing Federated Medical Image Segmentation via Local Calibration [9.171482226385551]
Using a single model to adapt to various data distributions from different sites is extremely challenging.
We propose a personalized federated framework with textbfLocal textbfCalibration (LC-Fed) to leverage the inter-site in-consistencies.
Our method consistently shows superior performance to the state-of-the-art personalized FL methods.
arXiv Detail & Related papers (2022-07-11T06:30:31Z) - Style-Hallucinated Dual Consistency Learning for Domain Generalized
Semantic Segmentation [117.3856882511919]
We propose the Style-HAllucinated Dual consistEncy learning (SHADE) framework to handle domain shift.
Our SHADE yields significant improvement and outperforms state-of-the-art methods by 5.07% and 8.35% on the average mIoU of three real-world datasets.
arXiv Detail & Related papers (2022-04-06T02:49:06Z) - Federated and Generalized Person Re-identification through Domain and
Feature Hallucinating [88.77196261300699]
We study the problem of federated domain generalization (FedDG) for person re-identification (re-ID)
We propose a novel method, called "Domain and Feature Hallucinating (DFH)", to produce diverse features for learning generalized local and global models.
Our method achieves the state-of-the-art performance for FedDG on four large-scale re-ID benchmarks.
arXiv Detail & Related papers (2022-03-05T09:15:13Z) - Multimodal Semantic Scene Graphs for Holistic Modeling of Surgical
Procedures [70.69948035469467]
We take advantage of the latest computer vision methodologies for generating 3D graphs from camera views.
We then introduce the Multimodal Semantic Graph Scene (MSSG) which aims at providing unified symbolic and semantic representation of surgical procedures.
arXiv Detail & Related papers (2021-06-09T14:35:44Z) - Edge-assisted Democratized Learning Towards Federated Analytics [67.44078999945722]
We show the hierarchical learning structure of the proposed edge-assisted democratized learning mechanism, namely Edge-DemLearn.
We also validate Edge-DemLearn as a flexible model training mechanism to build a distributed control and aggregation methodology in regions.
arXiv Detail & Related papers (2020-12-01T11:46:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.