Towards On-Board Panoptic Segmentation of Multispectral Satellite Images
- URL: http://arxiv.org/abs/2204.01952v1
- Date: Tue, 5 Apr 2022 03:10:39 GMT
- Title: Towards On-Board Panoptic Segmentation of Multispectral Satellite Images
- Authors: Tharindu Fernando, Clinton Fookes, Harshala Gammulle, Simon Denman,
Sridha Sridharan
- Abstract summary: We propose a lightweight pipeline for on-board panoptic segmentation of multi-spectral satellite images.
Panoptic segmentation offers major economic and environmental insights, ranging from yield estimation from agricultural lands to intelligence for complex military applications.
Our evaluations demonstrate a substantial increase in accuracy metrics compared to the existing state-of-the-art models.
- Score: 41.34294145237618
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With tremendous advancements in low-power embedded computing devices and
remote sensing instruments, the traditional satellite image processing pipeline
which includes an expensive data transfer step prior to processing data on the
ground is being replaced by on-board processing of captured data. This paradigm
shift enables critical and time-sensitive analytic intelligence to be acquired
in a timely manner on-board the satellite itself. However, at present, the
on-board processing of multi-spectral satellite images is limited to
classification and segmentation tasks. Extending this processing to its next
logical level, in this paper we propose a lightweight pipeline for on-board
panoptic segmentation of multi-spectral satellite images. Panoptic segmentation
offers major economic and environmental insights, ranging from yield estimation
from agricultural lands to intelligence for complex military applications.
Nevertheless, the on-board intelligence extraction raises several challenges
due to the loss of temporal observations and the need to generate predictions
from a single image sample. To address this challenge, we propose a multimodal
teacher network based on a cross-modality attention-based fusion strategy to
improve the segmentation accuracy by exploiting data from multiple modes. We
also propose an online knowledge distillation framework to transfer the
knowledge learned by this multi-modal teacher network to a uni-modal student
which receives only a single frame input, and is more appropriate for an
on-board environment. We benchmark our approach against existing
state-of-the-art panoptic segmentation models using the PASTIS multi-spectral
panoptic segmentation dataset considering an on-board processing setting. Our
evaluations demonstrate a substantial increase in accuracy metrics compared to
the existing state-of-the-art models.
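The teacher-to-student transfer described in the abstract can be illustrated with a generic distillation objective. The sketch below is not the authors' implementation: it assumes a standard softened-softmax KL term combined with cross-entropy, applied to per-pixel class logits flattened to shape (N, C), and it updates nothing itself. The function names and the `alpha` and `temperature` parameters are hypothetical choices made for illustration only. In an online setting the teacher and student losses would be computed in the same training step, as the last function does.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Numerically stable softmax over the last (class) axis.
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(logits, labels):
    # Mean cross-entropy for logits of shape (N, C) and integer labels of shape (N,).
    p = softmax(logits)
    picked = p[np.arange(len(labels)), labels]
    return float(-np.log(picked + 1e-12).mean())

def distillation_kl(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student) on temperature-softened distributions.
    # The T^2 factor is the usual rescaling for softened targets (an assumption here).
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    kl = (t * (np.log(t + 1e-12) - np.log(s + 1e-12))).sum(axis=-1)
    return float(kl.mean() * temperature ** 2)

def online_distillation_losses(teacher_logits, student_logits, labels,
                               alpha=0.5, temperature=2.0):
    # Teacher (multi-modal input) is supervised by the labels only.
    teacher_loss = cross_entropy(teacher_logits, labels)
    # Student (single-frame input) mixes label supervision with the teacher's soft targets.
    student_loss = ((1.0 - alpha) * cross_entropy(student_logits, labels)
                    + alpha * distillation_kl(teacher_logits, student_logits, temperature))
    return teacher_loss, student_loss

# Toy usage: 6 "pixels", 4 classes, random logits standing in for network outputs.
rng = np.random.default_rng(0)
teacher_logits = rng.normal(size=(6, 4))
student_logits = rng.normal(size=(6, 4))
labels = rng.integers(0, 4, size=6)
teacher_loss, student_loss = online_distillation_losses(teacher_logits, student_logits, labels)
```

With `alpha=0.0` the student is trained on labels alone; raising `alpha` weights the teacher's soft predictions more heavily. How the paper balances these terms is not stated in the abstract.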
Related papers
- Enhancing Maritime Situational Awareness through End-to-End Onboard Raw Data Analysis [4.441792803766689]
This research presents a framework addressing the strict bandwidth, energy, and latency constraints of small satellites.
It investigates the application of deep learning techniques for direct ship detection and classification from raw satellite imagery.
By simplifying the onboard processing chain, our approach facilitates direct analyses without requiring computationally intensive steps such as calibration and ortho-rectification.
arXiv Detail & Related papers (2024-11-05T18:38:42Z)
- Evaluating the Efficacy of Cut-and-Paste Data Augmentation in Semantic Segmentation for Satellite Imagery [4.499833362998487]
This study explores the effectiveness of a Cut-and-Paste augmentation technique for semantic segmentation in satellite images.
We adapt this augmentation, which usually requires labeled instances, to the case of semantic segmentation.
Using the DynamicEarthNet dataset and a U-Net model for evaluation, we found that this augmentation significantly enhances the mIoU score on the test set from 37.9 to 44.1.
arXiv Detail & Related papers (2024-04-08T17:18:30Z)
- SatSynth: Augmenting Image-Mask Pairs through Diffusion Models for Aerial Semantic Segmentation [69.42764583465508]
We explore the potential of generative image diffusion to address the scarcity of annotated data in earth observation tasks.
To the best of our knowledge, we are the first to generate both images and corresponding masks for satellite segmentation.
arXiv Detail & Related papers (2024-03-25T10:30:22Z)
- Rethinking Transformers Pre-training for Multi-Spectral Satellite Imagery [78.43828998065071]
Recent advances in unsupervised learning have demonstrated the ability of large vision models to achieve promising results on downstream tasks.
Such pre-training techniques have also been explored recently in the remote sensing domain due to the availability of large amounts of unlabelled data.
In this paper, we re-visit transformers pre-training and leverage multi-scale information that is effectively utilized with multiple modalities.
arXiv Detail & Related papers (2024-03-08T16:18:04Z)
- Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that combines computer vision and natural language processing.
Traditional Referring Image Segmentation (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery.
We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z)
- Revisiting the Encoding of Satellite Image Time Series [2.5874041837241304]
Satellite Image Time Series (SITS) temporal learning is complex due to high temporal resolutions and irregular acquisition times.
We develop a novel perspective of SITS processing as a direct set prediction problem, inspired by the recent trend in adopting query-based transformer decoders.
We attain new state-of-the-art (SOTA) results on the Satellite PASTIS benchmark dataset.
arXiv Detail & Related papers (2023-05-03T12:44:20Z)
- Unsupervised Discovery of Semantic Concepts in Satellite Imagery with Style-based Wavelet-driven Generative Models [27.62417543307831]
We present the first pre-trained style- and wavelet-based GAN model that can synthesize a wide gamut of realistic satellite images.
We show that by analyzing the intermediate activations of our network, one can discover a multitude of interpretable semantic directions.
arXiv Detail & Related papers (2022-08-03T14:19:24Z)
- SatMAE: Pre-training Transformers for Temporal and Multi-Spectral Satellite Imagery [74.82821342249039]
We present SatMAE, a pre-training framework for temporal or multi-spectral satellite imagery based on Masked Autoencoder (MAE).
To leverage temporal information, we include a temporal embedding along with independently masking image patches across time.
arXiv Detail & Related papers (2022-07-17T01:35:29Z)
- X-ModalNet: A Semi-Supervised Deep Cross-Modal Network for Classification of Remote Sensing Data [69.37597254841052]
We propose a novel cross-modal deep-learning framework called X-ModalNet.
X-ModalNet generalizes well, owing to propagating labels on an updatable graph constructed by high-level features on the top of the network.
We evaluate X-ModalNet on two multi-modal remote sensing datasets (HSI-MSI and HSI-SAR) and achieve a significant improvement in comparison with several state-of-the-art methods.
arXiv Detail & Related papers (2020-06-24T15:29:41Z)