Related papers: A Joint Learning Framework with Feature Reconstruction and Prediction for Incomplete Satellite Image Time Series in Agricultural Semantic Segmentation

URL: http://arxiv.org/abs/2505.19159v1
Date: Sun, 25 May 2025 14:15:47 GMT
Title: A Joint Learning Framework with Feature Reconstruction and Prediction for Incomplete Satellite Image Time Series in Agricultural Semantic Segmentation
Authors: Yuze Wang, Mariana Belgiu, Haiyang Wu, Dandan Zhong, Yangyang Cao, Chao Tao,
Abstract summary: We propose a joint learning framework with feature reconstruction and prediction to address incomplete SITS. We show that our method improves mean F1-scores by 6.93% in cropland extraction and 7.09% in crop classification over baselines.
Score: 3.808725321596432
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Satellite Image Time Series (SITS) is crucial for agricultural semantic segmentation. However, cloud contamination introduces time gaps in SITS, disrupting temporal dependencies and causing feature shifts, leading to degraded performance of models trained on complete SITS. Existing methods typically address this by reconstructing the entire SITS before prediction or using data augmentation to simulate missing data. Yet, full reconstruction may introduce noise and redundancy, while the data-augmented model can only handle limited missing patterns, leading to poor generalization. We propose a joint learning framework with feature reconstruction and prediction to address incomplete SITS more effectively. During training, we simulate data-missing scenarios using temporal masks. The two tasks are guided by both ground-truth labels and the teacher model trained on complete SITS. The prediction task constrains the model to selectively reconstruct critical features from masked inputs that align with the teacher's temporal feature representations. This reduces unnecessary reconstruction and limits noise propagation. By integrating reconstructed features into the prediction task, the model avoids learning shortcuts and maintains its ability to handle varied missing patterns and complete SITS. Experiments on SITS from Hunan Province, Western France, and Catalonia show that our method improves mean F1-scores by 6.93% in cropland extraction and 7.09% in crop classification over baselines. It also generalizes well across satellite sensors, including Sentinel-2 and PlanetScope, under varying temporal missing rates and model backbones.
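The training setup described in the abstract (temporal masks to simulate missing acquisitions, plus guidance from a teacher trained on complete SITS) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `apply_temporal_mask` and `joint_loss`, the zero-filling of masked time steps, the mean-squared feature distance, and the `weight` parameter are all assumptions made for illustration only.

```python
import numpy as np


def apply_temporal_mask(sits, missing_rate, rng):
    """Zero out a random subset of time steps in a (T, C, H, W) series.

    Hypothetical helper: zero-filling is an assumption, not the paper's scheme.
    Returns the masked series and a boolean vector where True marks a kept step.
    """
    num_steps = sits.shape[0]
    num_missing = int(round(missing_rate * num_steps))
    # Keep at least one observation so the series is never empty.
    num_missing = min(num_missing, num_steps - 1)
    missing_idx = rng.choice(num_steps, size=num_missing, replace=False)
    keep = np.ones(num_steps, dtype=bool)
    keep[missing_idx] = False
    masked = sits * keep[:, None, None, None]
    return masked, keep


def joint_loss(pred_loss, student_feats, teacher_feats, weight=1.0):
    """Prediction loss plus a feature-alignment term against the teacher.

    Hypothetical helper: mean squared distance is an assumed stand-in for
    whatever reconstruction objective the paper actually uses.
    """
    recon_loss = float(np.mean((student_feats - teacher_feats) ** 2))
    return pred_loss + weight * recon_loss


rng = np.random.default_rng(0)
sits = rng.normal(size=(10, 4, 8, 8))  # 10 dates, 4 bands, 8x8 pixels
masked, keep = apply_temporal_mask(sits, missing_rate=0.3, rng=rng)
```

Here `masked` stands in for the incomplete input given to the student model, while the unmasked `sits` would be what the teacher sees; sampling a fresh mask for each training batch is one way to expose the model to varied missing patterns.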

Related papers

Context-Informed Grounding Supervision [102.11698329887226]
Context-INformed Grounding Supervision (CINGS) is a post-training supervision in which the model is trained with relevant context prepended to the response. Our experiments demonstrate that models trained with CINGS exhibit stronger grounding in both textual and visual domains.
arXiv Detail & Related papers (2025-06-18T14:13:56Z)
TSPulse: Dual Space Tiny Pre-Trained Models for Rapid Time-Series Analysis [12.034816114258803]
TSPulse is an ultra-compact time-series pre-trained model with only 1M parameters. It performs strongly across classification, anomaly detection, imputation, and retrieval tasks. Results are achieved with just 1M parameters, making TSPulse 10-100X smaller than existing pre-trained models.
arXiv Detail & Related papers (2025-05-19T12:18:53Z)
Self-supervised Spatial-Temporal Learner for Precipitation Nowcasting [5.365086662531667]
Short-term prediction of weather is essential for making timely and weather-dependent decisions. In this work, we leverage the benefits of self-supervised learning and integrate it with spatial-temporal learning to develop a novel model, SpaT-SparK.
arXiv Detail & Related papers (2024-12-20T14:09:36Z)
Exact: Exploring Space-Time Perceptive Clues for Weakly Supervised Satellite Image Time Series Semantic Segmentation [11.193770734116981]
This paper embraces the weakly supervised paradigm (i.e., only image-level categories available) to liberate the crop mapping task from the exhaustive annotation burden. We propose a novel method, termed exploring space-time perceptive clues (Exact). Our method demonstrates impressive performance on various SITS benchmarks.
arXiv Detail & Related papers (2024-12-05T08:37:56Z)
OPUS: Occupancy Prediction Using a Sparse Set [64.60854562502523]
We present a framework to simultaneously predict occupied locations and classes using a set of learnable queries. OPUS incorporates a suite of non-trivial strategies to enhance model performance. Our lightest model achieves superior RayIoU on the Occ3D-nuScenes dataset at near 2x FPS, while our heaviest model surpasses previous best results by 6.1 RayIoU.
arXiv Detail & Related papers (2024-09-14T07:44:22Z)
Time Series Representation Models [2.724184832774005]
Time series analysis remains a major challenge due to its sparse characteristics, high dimensionality, and inconsistent data quality. Recent advancements in transformer-based techniques have enhanced capabilities in forecasting and imputation. We propose a new architectural concept for time series analysis based on introspection.
arXiv Detail & Related papers (2024-05-28T13:25:31Z)
Learning Robust Precipitation Forecaster by Temporal Frame Interpolation [65.5045412005064]
We develop a robust precipitation forecasting model that demonstrates resilience against spatial-temporal discrepancies. Our approach has led to significant improvements in forecasting precision, culminating in our model securing 1st place in the transfer learning leaderboard of the Weather4cast'23 competition.
arXiv Detail & Related papers (2023-11-30T08:22:08Z)
Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning [59.26623999209235]
We present DiST, which disentangles the learning of spatial and temporal aspects of videos. The disentangled learning in DiST is highly efficient because it avoids the back-propagation of massive pre-trained parameters. Extensive experiments on five benchmarks show that DiST delivers better performance than existing state-of-the-art methods by convincing margins.
arXiv Detail & Related papers (2023-09-14T17:58:33Z)
OpenSTL: A Comprehensive Benchmark of Spatio-Temporal Predictive Learning [67.07363529640784]
We propose OpenSTL to categorize prevalent approaches into recurrent-based and recurrent-free models. We conduct standard evaluations on datasets across various domains, including synthetic moving object trajectory, human motion, driving scenes, traffic flow, and weather forecasting. We find that recurrent-free models achieve a better balance between efficiency and performance than recurrent models.
arXiv Detail & Related papers (2023-06-20T03:02:14Z)
Revisiting the Encoding of Satellite Image Time Series [2.5874041837241304]
Satellite Image Time Series (SITS) temporal learning is complex due to high temporal resolutions and irregular acquisition times. We develop a novel perspective of SITS processing as a direct set prediction problem, inspired by the recent trend in adopting query-based transformer decoders. We attain new state-of-the-art (SOTA) results on the PASTIS benchmark dataset.
arXiv Detail & Related papers (2023-05-03T12:44:20Z)
DiffSTG: Probabilistic Spatio-Temporal Graph Forecasting with Denoising Diffusion Models [53.67562579184457]
This paper focuses on probabilistic STG forecasting, which is challenging due to the difficulty in modeling uncertainties and complex dependencies. We present the first attempt to generalize the popular denoising diffusion models to STGs, leading to a novel non-autoregressive framework called DiffSTG. Our approach combines the intrinsic-temporal learning capabilities of STNNs with the uncertainty measurements of diffusion models.
arXiv Detail & Related papers (2023-01-31T13:42:36Z)
Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based Action Recognition [88.34182299496074]
Action labels are only available on a source dataset, but unavailable on a target dataset in the training stage. We utilize a self-supervision scheme to reduce the domain shift between two skeleton-based action datasets. By segmenting and permuting temporal segments or human body parts, we design two self-supervised learning classification tasks.
arXiv Detail & Related papers (2022-07-17T07:05:39Z)

This list is automatically generated from the titles and abstracts of the papers on this site.