Multi-Modal Temporal Attention Models for Crop Mapping from Satellite
Time Series
- URL: http://arxiv.org/abs/2112.07558v1
- Date: Tue, 14 Dec 2021 17:05:55 GMT
- Title: Multi-Modal Temporal Attention Models for Crop Mapping from Satellite
Time Series
- Authors: Vivien Sainte Fare Garnot and Loic Landrieu and Nesrine Chehata
- Abstract summary: Motivated by the recent success of temporal attention-based methods across multiple crop mapping tasks, we propose to investigate how these models can be adapted to operate on several modalities.
We implement and evaluate multiple fusion schemes, including a novel approach and simple adjustments to the training procedure.
We show that most fusion schemes have advantages and drawbacks, making them relevant for specific settings.
We then evaluate the benefit of multimodality across several tasks: parcel classification, pixel-based segmentation, and panoptic parcel segmentation.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Optical and radar satellite time series are synergetic: optical images
contain rich spectral information, while C-band radar captures useful
geometrical information and is immune to cloud cover. Motivated by the recent
success of temporal attention-based methods across multiple crop mapping tasks,
we propose to investigate how these models can be adapted to operate on several
modalities. We implement and evaluate multiple fusion schemes, including a
novel approach and simple adjustments to the training procedure, significantly
improving performance and efficiency with little added complexity. We show that
most fusion schemes have advantages and drawbacks, making them relevant for
specific settings. We then evaluate the benefit of multimodality across several
tasks: parcel classification, pixel-based segmentation, and panoptic parcel
segmentation. We show that by leveraging both optical and radar time series,
multimodal temporal attention-based models can outmatch single-modality models
in terms of performance and resilience to cloud cover. To conduct these
experiments, we augment the PASTIS dataset with spatially aligned radar image
time series. The resulting dataset, PASTIS-R, constitutes the first
large-scale, multimodal, and open-access satellite time series dataset with
semantic and instance annotations.
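The abstract describes fusing optical and radar time series with temporal attention. Below is a minimal, illustrative numpy sketch of one such scheme (late fusion: attention-pool each modality over time, then concatenate). It is not the paper's actual U-TAE/L-TAE architecture, and the random vectors stand in for learned parameters.

```python
import numpy as np

def temporal_attention_pool(x, w):
    """Attention-pool a (T, D) time series into a single (D,) vector.

    x: (T, D) per-date features for one modality
    w: (D,) query vector (random here; learned in a real model)
    """
    scores = x @ w                                  # (T,) unnormalized attention
    scores = scores - scores.max()                  # numerical stability
    alpha = np.exp(scores) / np.exp(scores).sum()   # softmax over time
    return alpha @ x                                # (D,) attention-weighted average

rng = np.random.default_rng(0)
T_opt, T_sar, D = 30, 60, 16             # radar revisits more often than optical
optical = rng.normal(size=(T_opt, D))    # e.g. per-parcel Sentinel-2 features
radar = rng.normal(size=(T_sar, D))      # e.g. per-parcel Sentinel-1 features

# Late fusion: encode each modality separately, then concatenate the
# pooled descriptors before a downstream classifier.
w_opt, w_sar = rng.normal(size=D), rng.normal(size=D)
fused = np.concatenate([
    temporal_attention_pool(optical, w_opt),
    temporal_attention_pool(radar, w_sar),
])
print(fused.shape)  # (32,)
```

Because each modality keeps its own temporal encoder, the two sensors need not share acquisition dates, which matters when radar revisits far more often than cloud-free optical imagery.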
Related papers
- TASeg: Temporal Aggregation Network for LiDAR Semantic Segmentation [80.13343299606146]
We propose a Temporal LiDAR Aggregation and Distillation (TLAD) algorithm, which leverages historical priors to assign different aggregation steps for different classes.
To make full use of temporal images, we design a Temporal Image Aggregation and Fusion (TIAF) module, which can greatly expand the camera FOV.
We also develop a Static-Moving Switch Augmentation (SMSA) algorithm, which utilizes sufficient temporal information to enable objects to switch their motion states freely.
arXiv Detail & Related papers (2024-07-13T03:00:16Z)
- TSCMamba: Mamba Meets Multi-View Learning for Time Series Classification [13.110156202816112]
We propose a novel multi-view approach integrating frequency-domain and time-domain features to provide complementary contexts for time series classification.
Our method fuses continuous wavelet transform spectral features with temporal convolutional or multilayer perceptron features.
Experiments on 10 standard benchmark datasets demonstrate our approach achieves an average 6.45% accuracy improvement over state-of-the-art TSC models.
arXiv Detail & Related papers (2024-06-06T18:05:10Z)
- Temporal Embeddings: Scalable Self-Supervised Temporal Representation Learning from Spatiotemporal Data for Multimodal Computer Vision [1.4127889233510498]
A novel approach is proposed to stratify landscapes based on mobility activity time series.
The pixel-wise embeddings are converted to image-like channels that can be used for task-based, multimodal modeling.
arXiv Detail & Related papers (2023-10-16T02:53:29Z)
- Diffusion Models for Interferometric Satellite Aperture Radar [73.01013149014865]
Probabilistic Diffusion Models (PDMs) have recently emerged as a very promising class of generative models.
Here, we leverage PDMs to generate several radar-based satellite image datasets.
We show that PDMs succeed in generating images with complex and realistic structures, but that sampling time remains an issue.
arXiv Detail & Related papers (2023-08-31T16:26:17Z)
- StableLLaVA: Enhanced Visual Instruction Tuning with Synthesized Image-Dialogue Data [129.92449761766025]
We propose a novel data collection methodology that synchronously synthesizes images and dialogues for visual instruction tuning.
This approach harnesses the power of generative models, marrying the abilities of ChatGPT and text-to-image generative models.
Our research includes comprehensive experiments conducted on various datasets.
arXiv Detail & Related papers (2023-08-20T12:43:52Z)
- Modeling Continuous Motion for 3D Point Cloud Object Tracking [54.48716096286417]
This paper presents a novel approach that views each tracklet as a continuous stream.
At each timestamp, only the current frame is fed into the network to interact with multi-frame historical features stored in a memory bank.
To enhance the utilization of multi-frame features for robust tracking, a contrastive sequence enhancement strategy is proposed.
arXiv Detail & Related papers (2023-03-14T02:58:27Z)
- ViTs for SITS: Vision Transformers for Satellite Image Time Series [52.012084080257544]
We introduce a fully-attentional model for general Satellite Image Time Series (SITS) processing based on the Vision Transformer (ViT).
TSViT splits a SITS record into non-overlapping patches in space and time which are tokenized and subsequently processed by a factorized temporo-spatial encoder.
arXiv Detail & Related papers (2023-01-12T11:33:07Z)
- RapidAI4EO: Mono- and Multi-temporal Deep Learning models for Updating the CORINE Land Cover Product [0.36265845593635804]
We evaluate the performance of multi-temporal (monthly time series) compared to mono-temporal (single time step) satellite images for multi-label classification.
We incorporated time-series images using an LSTM model to assess whether multi-temporal signals from satellites improve CLC classification.
arXiv Detail & Related papers (2022-10-26T11:08:13Z)
- SatMAE: Pre-training Transformers for Temporal and Multi-Spectral Satellite Imagery [74.82821342249039]
We present SatMAE, a pre-training framework for temporal or multi-spectral satellite imagery based on Masked Autoencoder (MAE).
To leverage temporal information, we include a temporal embedding along with independently masking image patches across time.
arXiv Detail & Related papers (2022-07-17T01:35:29Z)
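The SatMAE summary above mentions two key ingredients: a temporal embedding added to the patch tokens, and masking image patches independently at each timestep. A minimal numpy sketch of just those two steps (with random values standing in for learned embeddings, and shapes chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
T, P, D = 3, 49, 32                    # timesteps, patches per image, embed dim
tokens = rng.normal(size=(T, P, D))    # patch embeddings per acquisition date

# Temporal embedding: add a per-timestep vector (learned in the real model),
# so the encoder can distinguish the same patch location across dates.
temporal_embed = rng.normal(size=(T, 1, D))
tokens = tokens + temporal_embed

# Independent masking across time: draw a fresh random subset of visible
# patches at each timestep (75% masking ratio), rather than hiding the
# same locations at every date.
mask_ratio = 0.75
keep = int(P * (1 - mask_ratio))
visible = np.stack([
    tokens[t, rng.permutation(P)[:keep]] for t in range(T)
])
print(visible.shape)  # (3, 12, 32)
```

Only the visible tokens would then be fed to the encoder; the decoder reconstructs the masked patches, which is what makes the pre-training self-supervised.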
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.