Lightweight, Pre-trained Transformers for Remote Sensing Timeseries
- URL: http://arxiv.org/abs/2304.14065v4
- Date: Mon, 5 Feb 2024 01:29:35 GMT
- Title: Lightweight, Pre-trained Transformers for Remote Sensing Timeseries
- Authors: Gabriel Tseng, Ruben Cartuyvels, Ivan Zvonkov, Mirali Purohit, David
Rolnick, Hannah Kerner
- Abstract summary: Presto is a model pre-trained on remote sensing pixel-timeseries data.
It excels at a wide variety of globally distributed remote sensing tasks and performs competitively with much larger models.
- Score: 33.44703824007848
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning methods for satellite data have a range of societally
relevant applications, but labels used to train models can be difficult or
impossible to acquire. Self-supervision is a natural solution in settings with
limited labeled data, but current self-supervised models for satellite data
fail to take advantage of the characteristics of that data, including the
temporal dimension (which is critical for many applications, such as monitoring
crop growth) and availability of data from many complementary sensors (which
can significantly improve a model's predictive performance). We present Presto
(the Pretrained Remote Sensing Transformer), a model pre-trained on remote
sensing pixel-timeseries data. By designing Presto specifically for remote
sensing data, we can create a significantly smaller but performant model.
Presto excels at a wide variety of globally distributed remote sensing tasks
and performs competitively with much larger models while requiring far less
compute. Presto can be used for transfer learning or as a feature extractor for
simple models, enabling efficient deployment at scale.
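The feature-extractor use case described above can be illustrated with a minimal sketch: a frozen pre-trained pixel-timeseries encoder produces embeddings, and a simple scikit-learn classifier is fit on top of them. The `load_pretrained_encoder` helper and the checkpoint name below are hypothetical placeholders, not the official Presto API; only the general pattern is intended.

```python
# Minimal sketch of the feature-extractor workflow, assuming a generic
# pre-trained pixel-timeseries encoder. `load_pretrained_encoder` and the
# checkpoint name are hypothetical placeholders, not the official Presto API.
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression

def embed(encoder: torch.nn.Module, x: np.ndarray, batch_size: int = 256) -> np.ndarray:
    """Map pixel-timeseries of shape (n_samples, timesteps, bands) to embeddings."""
    encoder.eval()
    chunks = []
    with torch.no_grad():
        for i in range(0, len(x), batch_size):
            batch = torch.as_tensor(x[i:i + batch_size], dtype=torch.float32)
            chunks.append(encoder(batch).cpu().numpy())  # (batch, embedding_dim)
    return np.concatenate(chunks, axis=0)

# encoder = load_pretrained_encoder("presto_encoder.pt")   # hypothetical loader
# train_emb = embed(encoder, train_timeseries)             # frozen features
# clf = LogisticRegression(max_iter=1000).fit(train_emb, train_labels)
# test_preds = clf.predict(embed(encoder, test_timeseries))
```

Keeping the encoder frozen and fitting only a lightweight head is what makes large-scale deployment cheap: the expensive forward pass is run once per pixel-timeseries, and the downstream model is trivial to retrain.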
Related papers
- Galileo: Learning Global and Local Features in Pretrained Remote Sensing Models [34.71460539414284]
We introduce a novel and highly effective self-supervised learning approach to learn both large- and small-scale features.
Our Galileo models obtain state-of-the-art results across diverse remote sensing tasks.
arXiv Detail & Related papers (2025-02-13T14:21:03Z)
- Data-driven tool wear prediction in milling, based on a process-integrated single-sensor approach [1.6574413179773764]
This study explores data-driven methods, in particular deep learning, for tool wear prediction.
The study evaluates several machine learning models, including convolutional neural networks (CNN), long short-term memory networks (LSTM), support vector machines (SVM) and decision trees.
The ConvNeXt model shows exceptional performance, achieving 99.1% accuracy in identifying tool wear using data from only four milling tools operated until they are worn.
arXiv Detail & Related papers (2024-12-27T23:10:32Z)
- A Predictive Model Based on Transformer with Statistical Feature Embedding in Manufacturing Sensor Dataset [2.07180164747172]
This study proposes a novel predictive model based on the Transformer, utilizing statistical feature embedding and window positional encoding (a generic sketch of this windowed-statistics idea appears after this list).
The model's performance is evaluated in two problems: fault detection and virtual metrology, showing superior results compared to baseline models.
The results support the model's applicability across various manufacturing industries, demonstrating its potential for enhancing process management and yield.
arXiv Detail & Related papers (2024-07-09T08:59:27Z)
- Rethinking Transformers Pre-training for Multi-Spectral Satellite Imagery [78.43828998065071]
Recent advances in unsupervised learning have demonstrated the ability of large vision models to achieve promising results on downstream tasks.
Such pre-training techniques have also been explored recently in the remote sensing domain due to the availability of large amounts of unlabelled data.
In this paper, we revisit transformer pre-training and leverage multi-scale information that is effectively utilized with multiple modalities.
arXiv Detail & Related papers (2024-03-08T16:18:04Z)
- Timer: Generative Pre-trained Transformers Are Large Time Series Models [83.03091523806668]
This paper aims at the early development of large time series models (LTSM).
During pre-training, we curate large-scale datasets with up to 1 billion time points.
To meet diverse application needs, we convert forecasting, imputation, and anomaly detection of time series into a unified generative task.
arXiv Detail & Related papers (2024-02-04T06:55:55Z)
- Learning Defect Prediction from Unrealistic Data [57.53586547895278]
Pretrained models of code have become popular choices for code understanding and generation tasks.
Such models tend to be large and require commensurate volumes of training data.
It has become popular to train models with far larger but less realistic datasets, such as functions with artificially injected bugs.
Models trained on such data tend to only perform well on similar data, while underperforming on real world programs.
arXiv Detail & Related papers (2023-11-02T01:51:43Z)
- DUET: A Tuning-Free Device-Cloud Collaborative Parameters Generation Framework for Efficient Device Model Generalization [66.27399823422665]
Device Model Generalization (DMG) is a practical yet under-investigated research topic for on-device machine learning applications.
We propose an efficient Device-cloUd collaborative parametErs generaTion framework, DUET.
arXiv Detail & Related papers (2022-09-12T13:26:26Z)
- CHALLENGER: Training with Attribution Maps [63.736435657236505]
We show that utilizing attribution maps for training neural networks can improve regularization of models and thus increase performance.
In particular, we show that our generic domain-independent approach yields state-of-the-art results in vision, natural language processing and on time series tasks.
arXiv Detail & Related papers (2022-05-30T13:34:46Z)
- InPars: Data Augmentation for Information Retrieval using Large Language Models [5.851846467503597]
In this work, we harness the few-shot capabilities of large pretrained language models as synthetic data generators for information retrieval tasks.
We show that models finetuned solely on our unsupervised dataset outperform strong baselines such as BM25.
Retrievers finetuned on both supervised and synthetic data achieve better zero-shot transfer than models finetuned only on supervised data.
arXiv Detail & Related papers (2022-02-10T16:52:45Z)
- STAR: Sparse Transformer-based Action Recognition [61.490243467748314]
This work proposes a novel skeleton-based human action recognition model with sparse attention on the spatial dimension and segmented linear attention on the temporal dimension of data.
Experiments show that our model achieves comparable performance while using far fewer trainable parameters, with high speed in training and inference.
arXiv Detail & Related papers (2021-07-15T02:53:11Z)
- Transformer-Based Behavioral Representation Learning Enables Transfer Learning for Mobile Sensing in Small Datasets [4.276883061502341]
We provide a neural architecture framework for mobile sensing data that can learn generalizable feature representations from time series.
This architecture combines benefits from CNN and Transformer architectures to enable better prediction performance.
arXiv Detail & Related papers (2021-07-09T22:26:50Z)
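As referenced above, the following is a generic sketch of the windowed statistical-feature-embedding idea from the Transformer entry on manufacturing sensor data: split each series into fixed-size windows, summarise each window with simple statistics, project those statistics to the model dimension, and add a learned per-window positional embedding before a Transformer encoder. All layer sizes and statistic choices here are illustrative assumptions, not the cited paper's exact architecture.

```python
# Generic sketch of statistical feature embedding with window positional
# encoding for sensor timeseries (illustrative assumptions throughout;
# not the cited paper's exact architecture).
import torch
import torch.nn as nn

class StatFeatureTransformer(nn.Module):
    def __init__(self, window: int = 16, max_windows: int = 64, d_model: int = 64, n_classes: int = 2):
        super().__init__()
        self.window = window
        self.embed = nn.Linear(4, d_model)             # 4 stats per window: mean, std, min, max
        self.pos = nn.Embedding(max_windows, d_model)  # learned window positional encoding
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time) univariate sensor signal; truncate to whole windows
        b, t = x.shape
        n = t // self.window
        w = x[:, : n * self.window].reshape(b, n, self.window)
        stats = torch.stack([w.mean(-1), w.std(-1), w.amin(-1), w.amax(-1)], dim=-1)
        h = self.embed(stats) + self.pos(torch.arange(n, device=x.device))
        h = self.encoder(h).mean(dim=1)                # pool over windows
        return self.head(h)                            # e.g. fault / no-fault logits

# logits = StatFeatureTransformer()(torch.randn(8, 512))  # 8 series, 512 samples each
```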
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.