SAR2Struct: Extracting 3D Semantic Structural Representation of Aircraft Targets from Single-View SAR Image
- URL: http://arxiv.org/abs/2506.06757v1
- Date: Sat, 07 Jun 2025 10:57:30 GMT
- Title: SAR2Struct: Extracting 3D Semantic Structural Representation of Aircraft Targets from Single-View SAR Image
- Authors: Ziyu Yue, Ruixi You, Feng Xu
- Abstract summary: This paper proposes a novel task: SAR target structure recovery. It aims to infer the components of a target and the structural relationships between its components from a single-view SAR image. Experimental results validate the effectiveness of each step and demonstrate, for the first time, that a 3D semantic structural representation of aircraft targets can be derived directly from a single-view SAR image.
- Score: 5.476386749283056
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Translating synthetic aperture radar (SAR) images into forms interpretable by humans is the ultimate goal of advanced SAR information retrieval. Existing methods mainly focus on 3D surface reconstruction or local geometric feature extraction of targets, neglecting the role of structural modeling in capturing semantic information. This paper proposes a novel task: SAR target structure recovery, which aims to infer the components of a target and the structural relationships between its components, specifically symmetry and adjacency, from a single-view SAR image. By learning the structural consistency and geometric diversity across the same type of target as observed in different SAR images, it aims to derive the semantic representation of a target directly from its 2D SAR image. To solve this challenging task, a two-step algorithmic framework based on structural descriptors is developed. Specifically, in the training phase, it first detects 2D keypoints from real SAR images, and then learns the mapping from these keypoints to 3D hierarchical structures using simulated data. During the testing phase, these two steps are integrated to infer the 3D structure from real SAR images. Experimental results validate the effectiveness of each step and demonstrate, for the first time, that a 3D semantic structural representation of aircraft targets can be derived directly from a single-view SAR image.
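The two-step framework described in the abstract (detect 2D keypoints, then map them to a hierarchical structure with symmetry/adjacency relations) can be sketched as follows. This is a minimal illustrative stand-in, not the authors' implementation: the keypoint "detector" and the keypoint-to-structure mapping here are toy placeholders, and all function names, shapes, and component labels are assumptions.

```python
# Illustrative sketch of a two-step "SAR image -> keypoints -> structure"
# pipeline. All names and logic are hypothetical stand-ins; the paper uses
# learned models (a keypoint detector trained on real SAR images and a
# keypoint-to-structure mapping trained on simulated data).
import numpy as np


def detect_keypoints_2d(sar_image: np.ndarray, num_keypoints: int = 8) -> np.ndarray:
    """Step 1 stand-in: take the brightest pixels as 2D keypoints.

    A real detector would be a learned network; bright-pixel selection is
    only a crude proxy for strong SAR scattering centers.
    """
    flat_indices = np.argsort(sar_image, axis=None)[-num_keypoints:]
    rows, cols = np.unravel_index(flat_indices, sar_image.shape)
    return np.stack([rows, cols], axis=1).astype(float)


def keypoints_to_structure(keypoints: np.ndarray) -> dict:
    """Step 2 stand-in: map 2D keypoints to a toy component hierarchy.

    The paper learns this mapping from simulated data; here we simply split
    keypoints about the centroid into two 'components' and record a
    symmetry relation between them, mimicking the output format of
    components plus structural relations.
    """
    center = keypoints.mean(axis=0)
    left = keypoints[keypoints[:, 1] < center[1]]
    right = keypoints[keypoints[:, 1] >= center[1]]
    return {
        "components": {"left_wing": left.tolist(), "right_wing": right.tolist()},
        "relations": [("left_wing", "right_wing", "symmetry")],
    }


# Integrated test-time use, mirroring the paper's inference phase:
# single image -> 2D keypoints -> structural representation.
image = np.random.default_rng(0).random((64, 64))
structure = keypoints_to_structure(detect_keypoints_2d(image))
```

The point of the sketch is the interface, not the models: each stage could be swapped for a learned module while keeping the same image-to-keypoints-to-structure data flow.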
Related papers
- SAB3R: Semantic-Augmented Backbone in 3D Reconstruction [19.236494823612507]
We introduce a new task, Map and Locate, which unifies the objectives of open-vocabulary segmentation and 3D reconstruction. Specifically, Map and Locate involves generating a point cloud from an unposed video and segmenting object instances based on open-vocabulary queries. This task serves as a critical step toward real-world embodied AI applications and introduces a practical task that bridges reconstruction, recognition, and reorganization.
arXiv Detail & Related papers (2025-06-02T18:00:04Z)
- From Flight to Insight: Semantic 3D Reconstruction for Aerial Inspection via Gaussian Splatting and Language-Guided Segmentation [3.0477617036157136]
High-fidelity 3D reconstruction is critical for aerial inspection tasks such as infrastructure monitoring, structural assessment, and environmental surveying. While traditional photogrammetry techniques enable geometric modeling, they lack semantic interpretability, limiting their effectiveness for automated inspection. Recent advances in neural rendering and 3D Gaussian Splatting (3DGS) offer efficient, photorealistic reconstructions but similarly lack scene-level understanding. We present a UAV-based pipeline that extends Feature-3DGS for language-guided 3D segmentation.
arXiv Detail & Related papers (2025-05-23T02:35:46Z)
- Structure-Aware Correspondence Learning for Relative Pose Estimation [65.44234975976451]
Relative pose estimation provides a promising way to achieve object-agnostic pose estimation. Existing 3D correspondence-based methods suffer from small overlaps in visible regions and unreliable feature estimation for invisible regions. We propose a novel Structure-Aware Correspondence Learning method for Relative Pose Estimation, which consists of two key modules.
arXiv Detail & Related papers (2025-03-24T13:43:44Z)
- Large Spatial Model: End-to-end Unposed Images to Semantic 3D [79.94479633598102]
Large Spatial Model (LSM) processes unposed RGB images directly into semantic radiance fields.
LSM simultaneously estimates geometry, appearance, and semantics in a single feed-forward operation.
It can generate versatile label maps by interacting with language at novel viewpoints.
arXiv Detail & Related papers (2024-10-24T17:54:42Z)
- CMAR-Net: Accurate Cross-Modal 3D SAR Reconstruction of Vehicle Targets with Sparse-Aspect Multi-Baseline Data [6.046251485192556]
Sparse-aspect multi-baseline Synthetic Aperture Radar (SAR) 3D tomography is a crucial remote sensing technique. Deep learning (DL) revolutionizes this field through its powerful data-driven representation capabilities and efficient inference characteristics. We propose a Cross-Modal 3D-SAR Reconstruction Network (CMAR-Net) that enhances 3D SAR imaging by fusing heterogeneous information.
arXiv Detail & Related papers (2024-06-06T15:18:59Z)
- Differentiable SAR Renderer and SAR Target Reconstruction [7.840247953745616]
A differentiable SAR renderer (DSR) is developed which reformulates the mapping and projection of the SAR imaging mechanism.
A 3D inverse target reconstruction algorithm from SAR images is devised.
arXiv Detail & Related papers (2022-05-14T17:24:32Z)
- 3D Shape Reconstruction from 2D Images with Disentangled Attribute Flow [61.62796058294777]
Reconstructing 3D shape from a single 2D image is a challenging task.
Most previous methods still struggle to extract semantic attributes for the 3D reconstruction task.
We propose 3DAttriFlow to disentangle and extract semantic attributes through different semantic levels in the input images.
arXiv Detail & Related papers (2022-03-29T02:03:31Z)
- S2R-DepthNet: Learning a Generalizable Depth-specific Structural Representation [63.58891781246175]
Humans can infer the 3D geometry of a scene from a sketch instead of a realistic image, which indicates that spatial structure plays a fundamental role in understanding the depth of scenes.
We are the first to explore the learning of a depth-specific structural representation, which captures the essential feature for depth estimation and ignores irrelevant style information.
Our S2R-DepthNet generalizes well to unseen real-world data even though it is trained only on synthetic data.
arXiv Detail & Related papers (2021-04-02T03:55:41Z)
- PeaceGAN: A GAN-based Multi-Task Learning Method for SAR Target Image Generation with a Pose Estimator and an Auxiliary Classifier [50.17500790309477]
We propose a novel GAN-based multi-task learning (MTL) method for SAR target image generation, called PeaceGAN.
PeaceGAN uses both pose angle and target class information, which makes it possible to produce SAR target images of desired target classes at intended pose angles.
arXiv Detail & Related papers (2021-03-29T10:03:09Z)
- Enabling Visual Action Planning for Object Manipulation through Latent Space Roadmap [72.01609575400498]
We present a framework for visual action planning of complex manipulation tasks with high-dimensional state spaces.
We propose a Latent Space Roadmap (LSR) for task planning: a graph-based structure that globally captures the system dynamics in a low-dimensional latent space.
We present a thorough investigation of our framework on two simulated box stacking tasks and a folding task executed on a real robot.
arXiv Detail & Related papers (2021-03-03T17:48:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.