CATSNet: a context-aware network for Height Estimation in a Forested Area based on Pol-TomoSAR data
- URL: http://arxiv.org/abs/2403.20273v1
- Date: Fri, 29 Mar 2024 16:27:40 GMT
- Title: CATSNet: a context-aware network for Height Estimation in a Forested Area based on Pol-TomoSAR data
- Authors: Wenyu Yang, Sergio Vitale, Hossein Aghababaei, Giampaolo Ferraioli, Vito Pascazio, Gilda Schirinzi
- Abstract summary: This work defines a context-aware deep learning-based solution named CATSNet.
A convolutional neural network is used to leverage patch-based information and extract features from a neighborhood rather than from a single pixel.
The experimental results show striking advantages in both performance and generalization ability by leveraging context information within Multiple Baselines (MB) TomoSAR data across different polarimetric modalities, surpassing existing techniques.
- Score: 4.9793121278328
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Tropical forests are a key component of the global carbon cycle. With space-borne missions like BIOMASS planned to monitor forests, several airborne campaigns, including TropiSAR and AfriSAR, have been successfully carried out. Typical Synthetic Aperture Radar Tomography (TomoSAR) methods involve complex models with low accuracy and high computational cost. In recent years, deep learning methods have also gained attention in the TomoSAR framework, showing promising performance. Recently, a solution based on a fully connected Tomographic Neural Network (TSNN) demonstrated its effectiveness in accurately estimating forest and ground heights by exploiting the pixel-wise elements of the covariance matrix derived from TomoSAR data. This work instead goes beyond the pixel-wise approach and defines a context-aware deep learning solution named CATSNet. A convolutional neural network is used to leverage patch-based information and extract features from a neighborhood rather than from a single pixel. Training uses TomoSAR data as input and Light Detection and Ranging (LiDAR) values as ground truth. The experimental results show striking advantages in both performance and generalization ability by leveraging context information within Multiple Baselines (MB) TomoSAR data across different polarimetric modalities, surpassing existing techniques.
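To make the patch-based idea concrete, here is a minimal sketch of a context-aware height regressor in PyTorch. The layer widths, patch size, and input-channel layout are illustrative assumptions, not the CATSNet architecture from the paper.

```python
# Minimal sketch of a patch-based height regressor; all sizes are
# illustrative assumptions, not the CATSNet architecture itself.
import torch
import torch.nn as nn

class PatchHeightRegressor(nn.Module):
    def __init__(self, in_channels=6, patch_size=15):
        super().__init__()
        # in_channels: e.g. real/imaginary parts of upper-triangular
        # entries of the MB-TomoSAR covariance matrix (assumed layout).
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # aggregate the neighborhood context
        )
        self.head = nn.Linear(64, 1)  # regress one height per patch

    def forward(self, x):  # x: (B, C, patch, patch)
        return self.head(self.features(x).flatten(1))

model = PatchHeightRegressor()
patch = torch.randn(8, 6, 15, 15)        # covariance-feature patches
lidar_height = torch.rand(8, 1) * 40.0   # LiDAR ground truth in metres
loss = nn.functional.mse_loss(model(patch), lidar_height)
loss.backward()
```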
Related papers
- SAFE: a SAR Feature Extractor based on self-supervised learning and masked Siamese ViTs [5.961207817077044]
We propose a novel self-supervised learning framework based on masked Siamese Vision Transformers to create a General SAR Feature Extractor coined SAFE.
Our method leverages contrastive learning principles to train a model on unlabeled SAR data, extracting robust and generalizable features.
We introduce tailored data augmentation techniques specific to SAR imagery, such as sub-aperture decomposition and despeckling.
Our network competes with or surpasses other state-of-the-art methods in few-shot classification and segmentation tasks, even without being trained on the sensors used for the evaluation.
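As a rough illustration of the contrastive principle SAFE builds on, the sketch below trains an encoder to match two augmented views of the same unlabeled SAR patch with a standard InfoNCE loss. The toy encoder and random tensors stand in for the paper's masked Siamese ViTs and SAR-specific augmentations.

```python
# Hedged sketch of contrastive pre-training on two views of unlabeled
# SAR patches; encoder and "augmentations" are placeholders.
import torch
import torch.nn.functional as F

def info_nce(z1, z2, tau=0.1):
    """Standard InfoNCE loss between two batches of embeddings."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau              # (B, B) similarity matrix
    targets = torch.arange(z1.size(0))      # positives on the diagonal
    return F.cross_entropy(logits, targets)

encoder = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(64 * 64, 128))
view1 = torch.randn(16, 1, 64, 64)  # e.g. one sub-aperture decomposition
view2 = torch.randn(16, 1, 64, 64)  # e.g. a despeckled version
loss = info_nce(encoder(view1), encoder(view2))
```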
arXiv Detail & Related papers (2024-06-30T23:11:20Z)
- Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that combines computer vision and natural language processing.
Traditional Referring Image Segmentation (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery.
We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z)
- SAR Despeckling Using Overcomplete Convolutional Networks [53.99620005035804]
Despeckling is an important problem in remote sensing, as speckle degrades SAR images.
Recent studies show that convolutional neural networks (CNNs) outperform classical despeckling methods.
This study employs an overcomplete CNN architecture to focus on learning low-level features by restricting the receptive field.
We show that the proposed network improves despeckling performance compared to recent despeckling methods on synthetic and real SAR images.
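The "overcomplete" trick can be sketched in a few lines: the encoder upsamples instead of downsampling, so feature maps become larger than the input and each unit's receptive field stays small, biasing the network toward low-level detail. The residual formulation and all layer widths below are illustrative assumptions, not the paper's exact network.

```python
# Sketch of an overcomplete encoder: upsampling keeps receptive fields
# small. Widths and the residual formulation are assumptions.
import torch
import torch.nn as nn

overcomplete_encoder = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False),
)
decoder = nn.Sequential(  # bring features back to the input resolution
    nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(16, 1, 3, padding=1),
    nn.MaxPool2d(2),
)
noisy = torch.rand(1, 1, 64, 64)       # speckled SAR amplitude patch
residual = decoder(overcomplete_encoder(noisy))
despeckled = noisy - residual          # residual despeckling (assumption)
```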
arXiv Detail & Related papers (2022-05-31T15:55:37Z)
- Context-Preserving Instance-Level Augmentation and Deformable Convolution Networks for SAR Ship Detection [50.53262868498824]
Shape deformation of targets in SAR images, due to random orientation and partial information loss, is a key challenge in SAR ship detection.
We propose a data augmentation method to train a deep network that is robust to partial information loss within the targets.
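One simple way to simulate partial information loss during training is to zero out a random sub-region inside each target's bounding box, as sketched below. This is only an illustrative stand-in; the paper's actual augmentation policy may differ.

```python
# Hedged sketch: randomly occlude part of a target so the detector
# learns to tolerate missing target pixels.
import numpy as np

def occlude_target(image, box, max_frac=0.4, rng=np.random.default_rng()):
    """Zero out a random patch inside box = (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    w, h = x1 - x0, y1 - y0
    ow = max(1, int(w * rng.uniform(0.1, max_frac)))
    oh = max(1, int(h * rng.uniform(0.1, max_frac)))
    ox = rng.integers(x0, x1 - ow + 1)
    oy = rng.integers(y0, y1 - oh + 1)
    out = image.copy()
    out[oy:oy + oh, ox:ox + ow] = 0.0   # drop part of the target
    return out

sar_chip = np.random.rand(128, 128).astype(np.float32)
augmented = occlude_target(sar_chip, box=(40, 50, 90, 100))
```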
arXiv Detail & Related papers (2022-02-14T07:01:01Z)
- Aerial Images Meet Crowdsourced Trajectories: A New Approach to Robust Road Extraction [110.61383502442598]
We introduce a novel neural network framework termed the Cross-Modal Message Propagation Network (CMMPNet).
CMMPNet is composed of two deep Auto-Encoders for modality-specific representation learning and a tailor-designed Dual Enhancement Module for cross-modal representation refinement.
Experiments on three real-world benchmarks demonstrate the effectiveness of our CMMPNet for robust road extraction.
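The dual-branch idea can be sketched minimally: one auto-encoder per modality, with a naive feature-exchange step standing in for the paper's Dual Enhancement Module. All sizes and the additive exchange are illustrative assumptions.

```python
# Toy dual-branch sketch; the additive cross-modal exchange is a
# placeholder for the Dual Enhancement Module.
import torch
import torch.nn as nn

class Branch(nn.Module):
    def __init__(self, in_ch):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU())
        self.dec = nn.Conv2d(16, 1, 3, padding=1)  # road-mask logits

    def forward(self, x):
        return self.enc(x)

aerial_branch, traj_branch = Branch(3), Branch(1)
aerial = torch.randn(2, 3, 128, 128)      # RGB aerial image
traj = torch.randn(2, 1, 128, 128)        # rasterized GPS trajectories
fa, ft = aerial_branch(aerial), traj_branch(traj)
fa2, ft2 = fa + ft, ft + fa               # naive cross-modal message passing
road_logits = aerial_branch.dec(fa2) + traj_branch.dec(ft2)
```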
arXiv Detail & Related papers (2021-11-30T04:30:10Z)
- S3Net: 3D LiDAR Sparse Semantic Segmentation Network [1.330528227599978]
S3Net is a novel convolutional neural network for LiDAR point cloud semantic segmentation.
It adopts an encoder-decoder backbone that consists of a Sparse Intra-channel Attention Module (SIntraAM) and a Sparse Inter-channel Attention Module (SInterAM).
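The two attention flavours can be illustrated on dense tensors: inter-channel attention reweights whole feature channels (squeeze-and-excitation style), while intra-channel attention reweights locations within each channel. These dense toys stand in for SInterAM/SIntraAM, whose sparse implementations differ.

```python
# Dense stand-ins for the two attention modules; sizes are assumptions.
import torch
import torch.nn as nn

class InterChannelAttention(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(ch, ch // 4), nn.ReLU(),
                                nn.Linear(ch // 4, ch), nn.Sigmoid())

    def forward(self, x):                     # x: (B, C, N) point features
        w = self.fc(x.mean(dim=2))            # squeeze over points
        return x * w.unsqueeze(2)             # excite per channel

class IntraChannelAttention(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.conv = nn.Conv1d(ch, 1, kernel_size=1)

    def forward(self, x):
        return x * torch.sigmoid(self.conv(x))  # per-location gate

feats = torch.randn(2, 64, 1024)             # 1024 LiDAR points, 64 channels
feats = IntraChannelAttention(64)(InterChannelAttention(64)(feats))
```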
arXiv Detail & Related papers (2021-03-15T22:15:24Z)
- ePointDA: An End-to-End Simulation-to-Real Domain Adaptation Framework for LiDAR Point Cloud Segmentation [111.56730703473411]
Training deep neural networks (DNNs) on LiDAR data requires large-scale point-wise annotations.
Simulation-to-real domain adaptation (SRDA) trains a DNN using unlimited synthetic data with automatically generated labels.
ePointDA consists of three modules: self-supervised dropout noise rendering, statistics-invariant and spatially-adaptive feature alignment, and transferable segmentation learning.
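The first module's intent can be illustrated simply: zero out pixels of a synthetic LiDAR range image so it mimics the dropout pattern of real sensors. The paper learns this rendering self-supervisedly; below the drop mask is just random, purely as a stand-in.

```python
# Toy dropout-noise rendering; the real method learns the drop pattern.
import numpy as np

def render_dropout(range_image, drop_prob=0.1, rng=np.random.default_rng()):
    mask = rng.random(range_image.shape) >= drop_prob
    return range_image * mask   # dropped returns become 0 (no echo)

synthetic = np.random.rand(64, 2048).astype(np.float32) * 80.0  # metres
realistic = render_dropout(synthetic, drop_prob=0.15)
```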
arXiv Detail & Related papers (2020-09-07T23:46:08Z)
- Multi-Subspace Neural Network for Image Recognition [33.61205842747625]
In image classification, feature extraction remains a major challenge: intra-class variability makes extractors difficult to design.
Recently, deep learning has drawn much attention for automatically learning features from data.
In this study, we propose the multi-subspace neural network (MSNN), which integrates a key component of convolutional neural networks (CNNs), the receptive field, with the subspace concept.
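The abstract leaves the exact construction open; one plausible reading of combining a receptive field with a subspace is PCA over local patches, sketched below purely as an illustration rather than the actual MSNN design.

```python
# Illustrative only: local receptive-field patches projected onto a
# PCA subspace; the actual MSNN construction may differ.
import numpy as np

def patch_subspace_features(img, patch=5, k=4):
    H, W = img.shape
    patches = np.array([img[i:i + patch, j:j + patch].ravel()
                        for i in range(H - patch + 1)
                        for j in range(W - patch + 1)])
    patches -= patches.mean(axis=0)
    _, _, vt = np.linalg.svd(patches, full_matrices=False)
    return patches @ vt[:k].T   # project every patch on the top-k subspace

feats = patch_subspace_features(np.random.rand(32, 32))
print(feats.shape)  # (784, 4): k features per receptive-field location
```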
arXiv Detail & Related papers (2020-06-17T02:55:34Z)
- A Nearest Neighbor Network to Extract Digital Terrain Models from 3D Point Clouds [1.6249267147413524]
We present an algorithm that operates on 3D-point clouds and estimates the underlying DTM for the scene using an end-to-end approach.
Our model learns neighborhood information and seamlessly integrates this with point-wise and block-wise global features.
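The neighborhood-gathering step can be sketched with a brute-force k-nearest-neighbor lookup: each point's features are concatenated with the average of its neighbors'. This is only the feature-fusion idea, not the full end-to-end network.

```python
# Sketch of fusing point-wise features with kNN neighborhood features.
import numpy as np

def knn_augment(points, feats, k=8):
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    idx = np.argsort(d, axis=1)[:, 1:k + 1]      # k neighbors, self excluded
    neigh = feats[idx].mean(axis=1)              # aggregate the neighborhood
    return np.concatenate([feats, neigh], axis=1)

pts = np.random.rand(500, 3)                     # x, y, z coordinates
f = np.random.rand(500, 16)                      # per-point features
f_aug = knn_augment(pts, f)                      # (500, 32)
```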
arXiv Detail & Related papers (2020-05-21T15:54:55Z)
- Weakly-supervised land classification for coastal zone based on deep convolutional neural networks by incorporating dual-polarimetric characteristics into training dataset [1.0494061710470493]
We explore the performance of DCNNs on semantic segmentation using spaceborne polarimetric synthetic aperture radar (PolSAR) datasets.
The semantic segmentation task using PolSAR data can be categorized as weakly supervised learning when the characteristics of SAR data and the data annotation procedures are factored in.
Three DCNN models, including SegNet, U-Net, and LinkNet, are then implemented.
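One way to incorporate dual-polarimetric characteristics into the training input is to stack the two polarization intensities and a derived ratio as image channels, as in this small sketch; the exact channel recipe in the paper may differ.

```python
# Illustrative dual-pol input tensor; the paper's channels may differ.
import numpy as np

vv = np.random.rand(256, 256).astype(np.float32)   # |S_VV|^2 intensity
vh = np.random.rand(256, 256).astype(np.float32)   # |S_VH|^2 intensity
ratio = vh / (vv + 1e-6)                           # cross/co-pol ratio
x = np.stack([vv, vh, ratio], axis=0)              # (3, H, W) network input
```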
arXiv Detail & Related papers (2020-03-30T17:32:49Z)
- Real-Time High-Performance Semantic Image Segmentation of Urban Street Scenes [98.65457534223539]
We propose a real-time high-performance DCNN-based method for robust semantic segmentation of urban street scenes.
The proposed method achieves 73.6% and 68.0% mean Intersection over Union (mIoU) at inference speeds of 51.0 fps and 39.3 fps, respectively.
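For reference, the reported mIoU metric is computed from a confusion matrix as the per-class IoU = TP / (TP + FP + FN), averaged over classes:

```python
# Standard mIoU computation from a confusion matrix.
import numpy as np

def mean_iou(conf):                     # conf[i, j]: true class i, predicted j
    tp = np.diag(conf).astype(float)
    fp = conf.sum(axis=0) - tp
    fn = conf.sum(axis=1) - tp
    return np.mean(tp / (tp + fp + fn + 1e-9))

conf = np.array([[50, 2], [3, 45]])
print(f"mIoU = {mean_iou(conf):.3f}")
```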
arXiv Detail & Related papers (2020-03-11T08:45:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.