DI3CL: Contrastive Learning With Dynamic Instances and Contour Consistency for SAR Land-Cover Classification Foundation Model
- URL: http://arxiv.org/abs/2511.07808v2
- Date: Thu, 13 Nov 2025 01:22:59 GMT
- Title: DI3CL: Contrastive Learning With Dynamic Instances and Contour Consistency for SAR Land-Cover Classification Foundation Model
- Authors: Zhongle Ren, Hui Ding, Kai Wang, Biao Hou, Xingyu Luo, Weibin Li, Licheng Jiao
- Abstract summary: This paper develops a general-purpose foundation model for SAR land-cover classification. It incorporates a Dynamic Instance and Contour Consistency Contrastive Learning (DI3CL) pre-training framework. The results consistently demonstrate that the proposed DI3CL outperforms existing methods.
- Score: 47.09803039926004
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although significant advances have been achieved in SAR land-cover classification, recent methods remain predominantly focused on supervised learning, which relies heavily on extensive labeled datasets. This dependency not only limits scalability and generalization but also restricts adaptability to diverse application scenarios. In this paper, a general-purpose foundation model for SAR land-cover classification is developed, serving as a robust cornerstone to accelerate the development and deployment of various downstream models. Specifically, a Dynamic Instance and Contour Consistency Contrastive Learning (DI3CL) pre-training framework is presented, which incorporates a Dynamic Instance (DI) module and a Contour Consistency (CC) module. The DI module enhances global contextual awareness by enforcing local consistency across different views of the same region. The CC module leverages shallow feature maps to guide the model to focus on the geometric contours of SAR land-cover objects, thereby improving structural discrimination. Additionally, to enhance robustness and generalization during pre-training, a large-scale and diverse dataset named SARSense, comprising 460,532 SAR images, is constructed to enable the model to capture comprehensive and representative features. To evaluate the generalization capability of our foundation model, we conducted extensive experiments across a variety of SAR land-cover classification tasks, including SAR land-cover mapping, water body detection, and road extraction. The results consistently demonstrate that the proposed DI3CL outperforms existing methods. Our code and pre-trained weights are publicly available at: https://github.com/SARpre-train/DI3CL.
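The abstract describes consistency-based contrastive pre-training over different views of the same region. As a rough illustration only (a generic symmetric InfoNCE-style loss, not the authors' DI3CL implementation; the function name and temperature value are assumptions), the core objective can be sketched as:

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.1):
    """Symmetric InfoNCE-style loss for two views of the same batch.

    z1, z2: (N, D) embeddings of two augmented views; row i of z1 and
    row i of z2 come from the same SAR region (the positive pair).
    """
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature           # (N, N) pairwise similarities
    # Row-wise log-softmax; positive pairs sit on the diagonal.
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    loss_12 = -np.mean(np.diag(log_prob))
    # Symmetrize: also classify view-2 rows against view-1 columns.
    log_prob_t = logits.T - np.log(np.exp(logits.T).sum(axis=1, keepdims=True))
    loss_21 = -np.mean(np.diag(log_prob_t))
    return 0.5 * (loss_12 + loss_21)
```

Minimizing this pulls embeddings of the same region's views together while pushing other regions in the batch apart; the paper's DI and CC modules add local and contour-level consistency constraints on top of this kind of objective.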
Related papers
- The Geometry of Transfer: Unlocking Medical Vision Manifolds for Training-Free Model Ranking [31.961181244685932]
We propose a novel Topology-Driven Transferability Estimation framework that evaluates manifold tractability rather than statistical overlap. Our approach significantly outperforms state-of-the-art baselines by around a 31% relative improvement in weighted Kendall's tau.
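The weighted Kendall metric used here compares a transferability estimator's model ranking against the true fine-tuned ranking, counting disagreements among top-ranked models more heavily. A minimal sketch (the hyperbolic rank weights below are a common convention, assumed here rather than taken from this paper):

```python
import numpy as np

def weighted_kendall(est, truth):
    """Rank correlation where pairs involving highly ranked (by truth)
    items contribute more, via hyperbolic weights w_i = 1 / (1 + rank_i)."""
    est, truth = np.asarray(est, float), np.asarray(truth, float)
    rank = np.argsort(np.argsort(-truth))      # 0 = best model under truth
    w = 1.0 / (1.0 + rank)
    num = den = 0.0
    n = len(est)
    for i in range(n):
        for j in range(i + 1, n):
            pair_w = w[i] + w[j]               # weight of this pair
            num += pair_w * np.sign((est[i] - est[j]) * (truth[i] - truth[j]))
            den += pair_w
    return num / den
```

A perfectly concordant ranking scores 1.0 and a fully reversed one -1.0, as with plain Kendall's tau, but a mistake between the two best models now costs more than one between the two worst.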
arXiv Detail & Related papers (2026-02-27T11:04:15Z)
- FUSAR-KLIP: Towards Multimodal Foundation Models for Remote Sensing [16.948824707021412]
Cross-modal artificial intelligence has garnered widespread attention in recent years, achieving significant progress in the study of natural images. Existing methods are mostly designed for RGB imagery, leaving a significant gap in modeling synthetic aperture radar (SAR) imagery. This paper proposes FUSAR-KLIP, the first universal SAR multimodal foundational model, along with reusable data and evaluation baselines.
arXiv Detail & Related papers (2025-09-28T15:03:25Z)
- Topology-Aware Modeling for Unsupervised Simulation-to-Reality Point Cloud Recognition [63.55828203989405]
We introduce a novel Topology-Aware Modeling (TAM) framework for Sim2Real UDA on object point clouds. Our approach mitigates the domain gap by leveraging global spatial topology, characterized by low-level, high-frequency 3D structures. We propose an advanced self-training strategy that combines cross-domain contrastive learning with self-training.
arXiv Detail & Related papers (2025-06-26T11:53:59Z)
- Towards Scalable and Generalizable Earth Observation Data Mining via Foundation Model Composition [0.0]
We investigate whether foundation models pretrained on remote sensing and general vision datasets can be effectively combined to improve performance. The results show that feature-level ensembling of smaller pretrained models can match or exceed the performance of much larger models. The study highlights the potential of applying knowledge distillation to transfer the strengths of ensembles into more compact models.
arXiv Detail & Related papers (2025-06-25T07:02:42Z)
- SASep: Saliency-Aware Structured Separation of Geometry and Feature for Open Set Learning on Point Clouds [22.753452376062565]
We present Saliency-Aware Structured Separation (SASep) for 3D object recognition. SASep includes (i) a tunable semantic decomposition (TSD) module to semantically decompose objects into important and unimportant parts, (ii) a geometric strategy (GSS) to generate pseudo-unknown objects, and (iii) a synth-aided margin separation (SMS) module to enhance feature-level separation. Experimental results show that SASep achieves superior performance in 3D OSR, outperforming existing state-of-the-art methods.
arXiv Detail & Related papers (2025-06-16T08:22:11Z)
- GCE-Pose: Global Context Enhancement for Category-level Object Pose Estimation [52.910282443646864]
A key challenge in model-free category-level pose estimation is the extraction of contextual object features that generalize across varying instances within a specific category. We present GCE-Pose, a method that enhances pose estimation for novel instances by integrating a category-level global context prior.
arXiv Detail & Related papers (2025-02-06T18:35:13Z)
- High-Performance Few-Shot Segmentation with Foundation Models: An Empirical Study [64.06777376676513]
We develop a few-shot segmentation (FSS) framework based on foundation models.
To be specific, we propose a simple approach to extract implicit knowledge from foundation models to construct coarse correspondence.
Experiments on two widely used datasets demonstrate the effectiveness of our approach.
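The "coarse correspondence" idea above can be illustrated generically: patch features extracted by a frozen foundation model from a support image are matched to the most similar patches of a query image by cosine similarity. This is a sketch under that assumption, not the paper's exact procedure:

```python
import numpy as np

def coarse_correspondence(support_feats, query_feats):
    """For each support patch feature (Ns, D), return the index of the most
    similar query patch (Nq, D) under cosine similarity, plus the score."""
    s = support_feats / np.linalg.norm(support_feats, axis=1, keepdims=True)
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    sim = s @ q.T                              # (Ns, Nq) cosine similarities
    return sim.argmax(axis=1), sim.max(axis=1)
```

The resulting index map gives a coarse support-to-query alignment that a segmentation head can refine.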
arXiv Detail & Related papers (2024-09-10T08:04:11Z)
- Seismic Fault SAM: Adapting SAM with Lightweight Modules and 2.5D Strategy for Fault Detection [11.868792440783054]
This paper proposes Seismic Fault SAM, which applies the general pre-training foundation model-Segment Anything Model (SAM)-to seismic fault interpretation.
Our innovative points include designing lightweight Adapter modules, freezing most of the pre-training weights, and only updating a small number of parameters.
Experimental results on the largest publicly available seismic dataset, Thebe, show that our method surpasses existing 3D models on both OIS and ODS metrics.
arXiv Detail & Related papers (2024-07-19T08:38:48Z)
- Style-Hallucinated Dual Consistency Learning: A Unified Framework for Visual Domain Generalization [113.03189252044773]
We propose a unified framework, Style-HAllucinated Dual consistEncy learning (SHADE), to handle domain shift in various visual tasks.
Our versatile SHADE can significantly enhance the generalization in various visual recognition tasks, including image classification, semantic segmentation and object detection.
arXiv Detail & Related papers (2022-12-18T11:42:51Z)
- Learning to Explore using Active Neural SLAM [99.42064696897533]
This work presents a modular and hierarchical approach to learn policies for exploring 3D environments.
The proposed model can also be easily transferred to the PointGoal task and was the winning entry of the CVPR 2019 Habitat PointGoal Navigation Challenge.
arXiv Detail & Related papers (2020-04-10T17:57:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.