Knowledge Transfer and Domain Adaptation for Fine-Grained Remote Sensing Image Segmentation
- URL: http://arxiv.org/abs/2412.06664v3
- Date: Tue, 14 Jan 2025 08:33:08 GMT
- Title: Knowledge Transfer and Domain Adaptation for Fine-Grained Remote Sensing Image Segmentation
- Authors: Shun Zhang, Xuechao Zou, Kai Li, Congyan Lang, Shiying Wang, Pin Tao, Tengfei Cao
- Abstract summary: We introduce a novel end-to-end learning paradigm combining knowledge guidance with domain refinement to enhance performance.
We present two key components: the Feature Alignment Module (FAM) and the Feature Modulation Module (FMM).
Experiments show that our method achieves a significant improvement of 2.57 mIoU on the grass dataset and 3.73 mIoU on the cloud dataset.
- Score: 11.268182306510802
- Abstract: Fine-grained remote sensing image segmentation is essential for accurately identifying detailed objects in remote sensing images. Recently, vision transformer models (VTMs) pre-trained on large-scale datasets have demonstrated strong zero-shot generalization. However, directly applying them to specific tasks may lead to domain shift. We introduce a novel end-to-end learning paradigm combining knowledge guidance with domain refinement to enhance performance. We present two key components: the Feature Alignment Module (FAM) and the Feature Modulation Module (FMM). FAM aligns features from a CNN-based backbone with those from the pretrained VTM's encoder using channel transformation and spatial interpolation, and transfers knowledge via KL divergence and L2 normalization constraint. FMM further adapts the knowledge to the specific domain to address domain shift. We also introduce a fine-grained grass segmentation dataset and demonstrate, through experiments on two datasets, that our method achieves a significant improvement of 2.57 mIoU on the grass dataset and 3.73 mIoU on the cloud dataset. The results highlight the potential of combining knowledge transfer and domain adaptation to overcome domain-related challenges and data limitations. The project page is available at https://xavierjiezou.github.io/KTDA/.
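The abstract describes FAM as transferring knowledge via a KL divergence term plus an L2 normalization constraint between CNN and pretrained VTM features. The paper's exact formulation is not given here, so the following is only a minimal, hypothetical sketch of such a combined alignment loss on a single feature vector; the function names, weights, and the use of a softmax over channels are all illustrative assumptions, not the authors' implementation.

```python
import math

def softmax(xs):
    """Turn raw channel activations into a probability distribution."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def l2_normalize(xs, eps=1e-12):
    """Scale a feature vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in xs)) + eps
    return [x / norm for x in xs]

def alignment_loss(student_feat, teacher_feat, kl_weight=1.0, l2_weight=1.0):
    """Hypothetical FAM-style transfer loss: a KL term between the
    teacher (pretrained VTM) and student (CNN backbone) channel
    distributions, plus an L2 distance between the normalized features."""
    p = softmax(teacher_feat)
    q = softmax(student_feat)
    kl = kl_divergence(p, q)
    s = l2_normalize(student_feat)
    t = l2_normalize(teacher_feat)
    l2 = sum((a - b) ** 2 for a, b in zip(s, t))
    return kl_weight * kl + l2_weight * l2
```

In practice the paper applies this after channel transformation and spatial interpolation bring the two feature maps to the same shape; the sketch assumes that alignment has already been done and operates on one matched vector pair.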
Related papers
- SiamSeg: Self-Training with Contrastive Learning for Unsupervised Domain Adaptation Semantic Segmentation in Remote Sensing [13.549403813487022]
Unsupervised domain adaptation (UDA) enables models to learn from unlabeled target domain data while leveraging labeled source domain data.
We propose integrating contrastive learning into UDA, enhancing the model's ability to capture semantic information in the target domain.
Our method, SiamSeg, outperforms existing approaches, achieving state-of-the-art results.
arXiv Detail & Related papers (2024-10-17T11:59:39Z) - Compositional Semantic Mix for Domain Adaptation in Point Cloud Segmentation [65.78246406460305]
Compositional semantic mixing represents the first unsupervised domain adaptation technique for point cloud segmentation.
We present a two-branch symmetric network architecture capable of concurrently processing point clouds from a source domain (e.g. synthetic) and point clouds from a target domain (e.g. real-world).
arXiv Detail & Related papers (2023-08-28T14:43:36Z) - MDViT: Multi-domain Vision Transformer for Small Medical Image Segmentation Datasets [19.44142290594537]
Vision transformers (ViTs) have emerged as a promising solution to improve medical image segmentation (MIS).
ViTs are typically trained using a single source of data, which overlooks the valuable knowledge that could be leveraged from other available datasets.
In this paper, we propose MDViT, the first multi-domain ViT that includes domain adapters to mitigate data-hunger and combat NKT.
arXiv Detail & Related papers (2023-07-05T08:19:29Z) - Source-Free Domain Adaptation for RGB-D Semantic Segmentation with Vision Transformers [11.13182313760599]
We propose MISFIT: MultImodal Source-Free Information fusion Transformer, a depth-aware framework for source-free semantic segmentation.
Our framework, which is also the first approach using RGB-D vision transformers for source-free semantic segmentation, shows noticeable performance improvements.
arXiv Detail & Related papers (2023-05-23T17:20:47Z) - Fake it, Mix it, Segment it: Bridging the Domain Gap Between Lidar Sensors [0.966840768820136]
Best-performing neural networks for lidar segmentation are fine-tuned to specific datasets. Switching the lidar sensor without retraining on a large set of annotated data from the new sensor creates a domain shift.
We propose a new method for lidar domain adaption, in which we use annotated panoptic lidar datasets and recreate the recorded scenes in the structure of a different lidar sensor.
arXiv Detail & Related papers (2022-12-19T14:57:13Z) - Dual Swin-Transformer based Mutual Interactive Network for RGB-D Salient Object Detection [67.33924278729903]
In this work, we propose Dual Swin-Transformer based Mutual Interactive Network.
We adopt Swin-Transformer as the feature extractor for both RGB and depth modality to model the long-range dependencies in visual inputs.
Comprehensive experiments on five standard RGB-D SOD benchmark datasets demonstrate the superiority of the proposed DTMINet method.
arXiv Detail & Related papers (2022-06-07T08:35:41Z) - Stagewise Unsupervised Domain Adaptation with Adversarial Self-Training
for Road Segmentation of Remote Sensing Images [93.50240389540252]
Road segmentation from remote sensing images is a challenging task with wide ranges of application potentials.
We propose a novel stagewise domain adaptation model called RoadDA to address the domain shift (DS) issue in this field.
Experiment results on two benchmarks demonstrate that RoadDA can efficiently reduce the domain gap and outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-08-28T09:29:14Z) - Boosting Few-shot Semantic Segmentation with Transformers [81.43459055197435]
We propose a TRansformer-based Few-shot Semantic segmentation method (TRFS).
Our model consists of two modules: Global Enhancement Module (GEM) and Local Enhancement Module (LEM).
arXiv Detail & Related papers (2021-08-04T20:09:21Z) - Unsupervised Domain Adaptation for Video Semantic Segmentation [91.30558794056054]
Unsupervised Domain Adaptation for semantic segmentation has gained immense popularity since it can transfer knowledge from simulation to real.
In this work, we present a new video extension of this task, namely Unsupervised Domain Adaptation for Video Semantic Segmentation.
We show that our proposals significantly outperform previous image-based UDA methods both on image-level (mIoU) and video-level (VPQ) evaluation metrics.
arXiv Detail & Related papers (2021-07-23T07:18:20Z) - Domain Adaptive SiamRPN++ for Object Tracking in the Wild [10.61438063305309]
We introduce a Domain Adaptive SiamRPN++ to improve the cross-domain transferability and robustness of a tracker.
Inspired by A-distance theory, we present two domain adaptive modules, Pixel Domain Adaptation (PDA) and Semantic Domain Adaptation (SDA).
The PDA module aligns the feature maps of template and search region images to eliminate the pixel-level domain shift.
The SDA module aligns the feature representations of the tracking target's appearance to eliminate the semantic-level domain shift.
arXiv Detail & Related papers (2021-06-15T03:40:53Z) - Supervised Domain Adaptation using Graph Embedding [86.3361797111839]
Domain adaptation methods assume that distributions between the two domains are shifted and attempt to realign them.
We propose a generic framework based on graph embedding.
We show that the proposed approach leads to a powerful Domain Adaptation framework.
arXiv Detail & Related papers (2020-03-09T12:25:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.