3D-SSM: A Novel 3D Selective Scan Module for Remote Sensing Change Detection
- URL: http://arxiv.org/abs/2506.19263v1
- Date: Tue, 24 Jun 2025 02:46:31 GMT
- Title: 3D-SSM: A Novel 3D Selective Scan Module for Remote Sensing Change Detection
- Authors: Rui Huang, Jincheng Zeng, Sen Gao, Yan Xing
- Abstract summary: We propose a 3D selective scan module (3D-SSM) that captures global information from both spatial plane and channel perspectives. Based on the 3D-SSM, we present two key components: a spatiotemporal interaction module (SIM) and a multi-branch feature extraction module (MBFEM). Our proposed method demonstrates favourable performance compared to state-of-the-art change detection methods on five benchmark datasets.
- Score: 6.142826091422512
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing Mamba-based approaches in remote sensing change detection have enhanced scanning models, yet remain limited by their inability to capture long-range dependencies between image channels effectively, which restricts their feature representation capabilities. To address this limitation, we propose a 3D selective scan module (3D-SSM) that captures global information from both the spatial plane and channel perspectives, enabling a more comprehensive understanding of the data. Based on the 3D-SSM, we present two key components: a spatiotemporal interaction module (SIM) and a multi-branch feature extraction module (MBFEM). The SIM facilitates bi-temporal feature integration by enabling interactions between global and local features across images from different time points, thereby enhancing the detection of subtle changes. Meanwhile, the MBFEM combines features from the frequency domain, spatial domain, and 3D-SSM to provide a rich representation of contextual information within the image. Our proposed method demonstrates favourable performance compared to state-of-the-art change detection methods on five benchmark datasets through extensive experiments. Code is available at https://github.com/VerdantMist/3D-SSM
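As a rough illustration of how a selective scan might cover both the spatial plane and the channel axis, the PyTorch sketch below runs a toy diagonal state-space recurrence along each view of a feature map and fuses the results. This is a minimal sketch under simplifying assumptions (a plain recurrence in place of Mamba's selective scan, illustrative module names), not the authors' implementation; see the linked repository for the real code.

```python
import torch
import torch.nn as nn


class SimpleScan(nn.Module):
    """Toy diagonal state-space recurrence: h_t = a*h_{t-1} + b*x_t, y_t = c*h_t."""

    def __init__(self, dim: int):
        super().__init__()
        self.a = nn.Parameter(torch.full((dim,), 0.9))  # per-feature decay
        self.b = nn.Parameter(torch.ones(dim))          # input gain
        self.c = nn.Parameter(torch.ones(dim))          # output gain

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim); a slow loop stands in for a parallel scan kernel
        h = x.new_zeros(x.size(0), x.size(2))
        ys = []
        for t in range(x.size(1)):
            h = self.a * h + self.b * x[:, t]
            ys.append(self.c * h)
        return torch.stack(ys, dim=1)


class Toy3DSSM(nn.Module):
    """Scans a feature map along the flattened spatial axis and along the
    channel axis, then fuses the two global views."""

    def __init__(self, channels: int, height: int, width: int):
        super().__init__()
        self.spatial_scan = SimpleScan(channels)        # sequence over H*W positions
        self.channel_scan = SimpleScan(height * width)  # sequence over C channels
        self.fuse = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        spatial = self.spatial_scan(x.flatten(2).transpose(1, 2))  # (b, h*w, c)
        channel = self.channel_scan(x.flatten(2))                  # (b, c, h*w)
        spatial = spatial.transpose(1, 2).reshape(b, c, h, w)
        channel = channel.reshape(b, c, h, w)
        return self.fuse(spatial + channel)


feats = torch.randn(2, 8, 16, 16)        # one time point; bi-temporal features are processed alike
print(Toy3DSSM(8, 16, 16)(feats).shape)  # torch.Size([2, 8, 16, 16])
```

Per the abstract, the outputs of such scans would then feed the SIM for bi-temporal interaction and the MBFEM alongside frequency- and spatial-domain branches.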
Related papers
- SSLFusion: Scale & Space Aligned Latent Fusion Model for Multimodal 3D Object Detection [24.367371441506116]
Multimodal 3D object detection based on deep neural networks has made significant progress. However, it still faces challenges due to the misalignment of scale and spatial information between features extracted from 2D images and those derived from 3D point clouds. We present SSLFusion, a novel Scale & Space Aligned Latent Fusion Model consisting of a scale-aligned fusion strategy, a 3D-to-2D space alignment module, and a latent cross-modal fusion module.
arXiv Detail & Related papers (2025-04-07T15:15:06Z)
- Boosting 3D Object Detection with Semantic-Aware Multi-Branch Framework [44.44329455757931]
In autonomous driving, LiDAR sensors are vital for acquiring 3D point clouds, providing reliable geometric information. However, traditional preprocessing sampling methods often ignore semantic features, leading to detail loss and ground-point interference. We propose a multi-branch two-stage 3D object detection framework using a Semantic-aware Multi-branch Sampling (SMS) module and multi-view constraints.
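As a loose illustration of semantic-aware sampling (a simplification of the SMS idea; the function name and scoring are assumptions, not the paper's design), the sketch below keeps point-cloud points with probability weighted by a per-point semantic foreground score instead of sampling uniformly:

```python
import torch

def semantic_sample(points: torch.Tensor,
                    scores: torch.Tensor,
                    num_keep: int) -> torch.Tensor:
    """points: (N, 3) xyz; scores: (N,) semantic foreground scores in [0, 1]."""
    probs = scores.clamp_min(1e-6)                              # avoid zero-weight points
    idx = torch.multinomial(probs, num_keep, replacement=False) # score-weighted selection
    return points[idx]

pts = torch.rand(5000, 3)
scores = torch.rand(5000)
print(semantic_sample(pts, scores, 1024).shape)  # torch.Size([1024, 3])
```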
arXiv Detail & Related papers (2024-07-08T09:25:45Z)
- Multi-Modal 3D Object Detection by Box Matching [109.43430123791684]
We propose a novel Fusion network by Box Matching (FBMNet) for multi-modal 3D detection.
With the learned assignments between 3D and 2D object proposals, fusion for detection can be performed effectively by combining their ROI features.
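A toy sketch of this fusion-by-matching step, assuming soft assignment scores between 3D and 2D proposals (shapes and names are illustrative, not FBMNet's actual interface):

```python
import torch

def fuse_by_matching(feat3d: torch.Tensor,
                     feat2d: torch.Tensor,
                     assign_logits: torch.Tensor) -> torch.Tensor:
    """feat3d: (N3, D); feat2d: (N2, D); assign_logits: (N3, N2)."""
    assign = torch.softmax(assign_logits, dim=1)   # soft 3D-to-2D assignment
    matched2d = assign @ feat2d                    # (N3, D) weighted 2D features
    return torch.cat([feat3d, matched2d], dim=1)   # (N3, 2D) fused ROI features

f3d, f2d = torch.randn(10, 128), torch.randn(25, 128)
print(fuse_by_matching(f3d, f2d, torch.randn(10, 25)).shape)  # torch.Size([10, 256])
```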
arXiv Detail & Related papers (2023-05-12T18:08:51Z)
- STNet: Spatial and Temporal feature fusion network for change detection in remote sensing images [5.258365841490956]
We propose STNet, a remote sensing change detection network based on spatial and temporal feature fusions.
Experimental results on three benchmark datasets for RSCD demonstrate that the proposed method achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-04-22T14:40:41Z)
- Adjacent-Level Feature Cross-Fusion With 3-D CNN for Remote Sensing Image Change Detection [20.776673215108815]
We propose a novel adjacent-level feature cross-fusion network with 3D convolution (AFCF3D-Net).
The proposed AFCF3D-Net has been validated on three challenging remote sensing CD datasets.
arXiv Detail & Related papers (2023-02-10T08:21:01Z)
- Ret3D: Rethinking Object Relations for Efficient 3D Object Detection in Driving Scenes [82.4186966781934]
We introduce a simple, efficient, and effective two-stage detector, termed Ret3D.
At the core of Ret3D is the utilization of novel intra-frame and inter-frame relation modules.
With negligible extra overhead, Ret3D achieves state-of-the-art performance.
arXiv Detail & Related papers (2022-08-18T03:48:58Z)
- Exploring Optical-Flow-Guided Motion and Detection-Based Appearance for Temporal Sentence Grounding [61.57847727651068]
Temporal sentence grounding aims to semantically localize a target segment in an untrimmed video according to a given sentence query.
Most previous works focus on learning frame-level features of whole frames across the entire video and directly match them with the textual information.
We propose a novel Motion- and Appearance-guided 3D Semantic Reasoning Network (MA3SRN), which incorporates optical-flow-guided motion-aware, detection-based appearance-aware, and 3D-aware object-level features.
arXiv Detail & Related papers (2022-03-06T13:57:09Z)
- Dense Deep Unfolding Network with 3D-CNN Prior for Snapshot Compressive Imaging [6.289143409131908]
Snapshot compressive imaging (SCI) aims to record three-dimensional signals via a two-dimensional camera.
We present a novel dense deep unfolding network (DUN) with 3D-CNN prior for SCI.
In order to promote network adaption, we propose a dense feature map adaption (DFMA) module.
arXiv Detail & Related papers (2021-09-14T09:42:42Z)
- RGB-D Salient Object Detection with Cross-Modality Modulation and Selection [126.4462739820643]
We present an effective method to progressively integrate and refine the cross-modality complementarities for RGB-D salient object detection (SOD).
The proposed network mainly addresses two challenging issues: 1) how to effectively integrate the complementary information from an RGB image and its corresponding depth map, and 2) how to adaptively select more saliency-related features.
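A minimal sketch of what such cross-modality modulation and channel selection could look like, assuming same-shape RGB and depth feature maps; this is generic attention gating, not the paper's exact design:

```python
import torch
import torch.nn as nn

class CrossModalGate(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.modulate = nn.Conv2d(channels, channels, kernel_size=1)
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),            # global channel statistics
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),                       # per-channel selection weights
        )

    def forward(self, rgb: torch.Tensor, depth: torch.Tensor) -> torch.Tensor:
        modulated = rgb * torch.sigmoid(self.modulate(depth))  # depth modulates RGB
        weights = self.gate(torch.cat([rgb, depth], dim=1))    # (B, C, 1, 1)
        return modulated * weights                             # keep salient channels

rgb, depth = torch.randn(2, 32, 64, 64), torch.randn(2, 32, 64, 64)
print(CrossModalGate(32)(rgb, depth).shape)  # torch.Size([2, 32, 64, 64])
```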
arXiv Detail & Related papers (2020-07-14T14:22:50Z)
- Segment as Points for Efficient Online Multi-Object Tracking and Segmentation [66.03023110058464]
We propose a highly effective method for learning instance embeddings from segments by converting the compact image representation into an unordered 2D point cloud representation.
This yields a new tracking-by-points paradigm in which discriminative instance embeddings are learned from randomly selected points rather than whole images.
The resulting online MOTS framework, named PointTrack, surpasses all the state-of-the-art methods by large margins.
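A toy sketch of the tracking-by-points idea, assuming per-pixel backbone features and an instance mask (names and sizes are illustrative, not the PointTrack implementation): sample random pixels from a segment, gather their features, and pool them into an order-invariant instance embedding.

```python
import torch

def instance_embedding(feature_map: torch.Tensor,
                       mask: torch.Tensor,
                       num_points: int = 128) -> torch.Tensor:
    """feature_map: (C, H, W) backbone features; mask: (H, W) boolean segment."""
    ys, xs = torch.nonzero(mask, as_tuple=True)   # pixels inside the segment
    idx = torch.randint(len(ys), (num_points,))   # random point selection
    points = feature_map[:, ys[idx], xs[idx]]     # (C, num_points)
    return points.mean(dim=1)                     # order-invariant pooling

feats = torch.randn(64, 96, 96)
mask = torch.zeros(96, 96, dtype=torch.bool)
mask[30:60, 40:70] = True
print(instance_embedding(feats, mask).shape)  # torch.Size([64])
```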
arXiv Detail & Related papers (2020-07-03T08:29:35Z)
- X-ModalNet: A Semi-Supervised Deep Cross-Modal Network for Classification of Remote Sensing Data [69.37597254841052]
We propose a novel cross-modal deep-learning framework called X-ModalNet.
X-ModalNet generalizes well, owing to propagating labels on an updatable graph constructed from high-level features at the top of the network.
We evaluate X-ModalNet on two multi-modal remote sensing datasets (HSI-MSI and HSI-SAR) and achieve a significant improvement in comparison with several state-of-the-art methods.
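A minimal sketch of label propagation over a graph built from high-level features, as summarized above; this is a generic propagation step under assumed shapes and names, not X-ModalNet's architecture:

```python
import torch

def propagate_labels(features: torch.Tensor,
                     labels: torch.Tensor,
                     labeled: torch.Tensor,
                     steps: int = 10) -> torch.Tensor:
    """features: (N, D); labels: (N, K) one-hot, zeros for unlabeled nodes;
    labeled: (N,) boolean mask of nodes whose labels stay fixed."""
    sim = features @ features.t()          # pairwise feature affinities
    adj = torch.softmax(sim, dim=1)        # row-normalized graph weights
    y = labels.clone()
    for _ in range(steps):
        y = adj @ y                        # diffuse labels to neighbours
        y[labeled] = labels[labeled]       # clamp the known labels
    return y.argmax(dim=1)

feats = torch.randn(200, 32)
labels = torch.zeros(200, 5)
labeled = torch.zeros(200, dtype=torch.bool)
labeled[:20] = True
labels[torch.arange(20), torch.randint(5, (20,))] = 1.0
print(propagate_labels(feats, labels, labeled).shape)  # torch.Size([200])
```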
arXiv Detail & Related papers (2020-06-24T15:29:41Z)