Panoptic Segmentation of Satellite Image Time Series with Convolutional
Temporal Attention Networks
- URL: http://arxiv.org/abs/2107.07933v1
- Date: Fri, 16 Jul 2021 14:49:01 GMT
- Title: Panoptic Segmentation of Satellite Image Time Series with Convolutional
Temporal Attention Networks
- Authors: Vivien Sainte Fare Garnot and Loic Landrieu
- Abstract summary: We present the first end-to-end, single-stage method for panoptic segmentation of Satellite Image Time Series (SITS)
This module can be combined with our novel image sequence encoding network which relies on temporal self-attention to extract rich and adaptive multi-scale-temporal features.
We also introduce PASTIS, the first open-access SITS dataset with panoptic annotations.
- Score: 7.00422423634143
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Unprecedented access to multi-temporal satellite imagery has opened new
perspectives for a variety of Earth observation tasks. Among them,
pixel-precise panoptic segmentation of agricultural parcels has major economic
and environmental implications. While researchers have explored this problem
for single images, we argue that the complex temporal patterns of crop
phenology are better addressed with temporal sequences of images. In this
paper, we present the first end-to-end, single-stage method for panoptic
segmentation of Satellite Image Time Series (SITS). This module can be combined
with our novel image sequence encoding network which relies on temporal
self-attention to extract rich and adaptive multi-scale spatio-temporal
features. We also introduce PASTIS, the first open-access SITS dataset with
panoptic annotations. We demonstrate the superiority of our encoder for
semantic segmentation against multiple competing architectures, and set up the
first state-of-the-art of panoptic segmentation of SITS. Our implementation and
PASTIS are publicly available.
Related papers
- Exact: Exploring Space-Time Perceptive Clues for Weakly Supervised Satellite Image Time Series Semantic Segmentation [11.193770734116981]
This paper embraces the weakly supervised paradigm (i.e., only image-level categories available) to liberate the crop mapping task from the exhaustive annotation burden.
We propose a novel method, termed exploring space-time perceptive clues (Exact)
Our method demonstrates impressive performance on various SITS benchmarks.
arXiv Detail & Related papers (2024-12-05T08:37:56Z) - Image Segmentation in Foundation Model Era: A Survey [95.60054312319939]
Current research in image segmentation lacks a detailed analysis of distinct characteristics, challenges, and solutions.
This survey seeks to fill this gap by providing a thorough review of cutting-edge research centered around FM-driven image segmentation.
An exhaustive overview of over 300 segmentation approaches is provided to encapsulate the breadth of current research efforts.
arXiv Detail & Related papers (2024-08-23T10:07:59Z) - Revisiting the Encoding of Satellite Image Time Series [2.5874041837241304]
Image Time Series (SITS)temporal learning is complex due to hightemporal resolutions and irregular acquisition times.
We develop a novel perspective of SITS processing as a direct set prediction problem, inspired by the recent trend in adopting query-based transformer decoders.
We attain new state-of-the-art (SOTA) results on the Satellite PASTIS benchmark dataset.
arXiv Detail & Related papers (2023-05-03T12:44:20Z) - ViTs for SITS: Vision Transformers for Satellite Image Time Series [52.012084080257544]
We introduce a fully-attentional model for general Satellite Image Time Series (SITS) processing based on the Vision Transformer (ViT)
TSViT splits a SITS record into non-overlapping patches in space and time which are tokenized and subsequently processed by a factorized temporo-spatial encoder.
arXiv Detail & Related papers (2023-01-12T11:33:07Z) - SatMAE: Pre-training Transformers for Temporal and Multi-Spectral
Satellite Imagery [74.82821342249039]
We present SatMAE, a pre-training framework for temporal or multi-spectral satellite imagery based on Masked Autoencoder (MAE)
To leverage temporal information, we include a temporal embedding along with independently masking image patches across time.
arXiv Detail & Related papers (2022-07-17T01:35:29Z) - Multi-Modal Temporal Attention Models for Crop Mapping from Satellite
Time Series [7.379078963413671]
Motivated by the recent success of temporal attention-based methods across multiple crop mapping tasks, we propose to investigate how these models can be adapted to operate on several modalities.
We implement and evaluate multiple fusion schemes, including a novel approach and simple adjustments to the training procedure.
We show that most fusion schemes have advantages and drawbacks, making them relevant for specific settings.
We then evaluate the benefit of multimodality across several tasks: parcel classification, pixel-based segmentation, and panoptic parcel segmentation.
arXiv Detail & Related papers (2021-12-14T17:05:55Z) - Spatiotemporal Inconsistency Learning for DeepFake Video Detection [51.747219106855624]
We present a novel temporal modeling paradigm in TIM by exploiting the temporal difference over adjacent frames along with both horizontal and vertical directions.
And the ISM simultaneously utilizes the spatial information from SIM and temporal information from TIM to establish a more comprehensive spatial-temporal representation.
arXiv Detail & Related papers (2021-09-04T13:05:37Z) - Street-view Panoramic Video Synthesis from a Single Satellite Image [92.26826861266784]
We present a novel method for synthesizing both temporally and geometrically consistent street-view panoramic video.
Existing cross-view synthesis approaches focus more on images, while video synthesis in such a case has not yet received enough attention.
arXiv Detail & Related papers (2020-12-11T20:22:38Z) - Crowdsampling the Plenoptic Function [56.10020793913216]
We present a new approach to novel view synthesis under time-varying illumination from such data.
We introduce a new DeepMPI representation, motivated by observations on the sparsity structure of the plenoptic function.
Our method can synthesize the same compelling parallax and view-dependent effects as previous MPI methods, while simultaneously interpolating along changes in reflectance and illumination with time.
arXiv Detail & Related papers (2020-07-30T02:52:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.