DeepSatData: Building large scale datasets of satellite images for
training machine learning models
- URL: http://arxiv.org/abs/2104.13824v1
- Date: Wed, 28 Apr 2021 15:13:12 GMT
- Title: DeepSatData: Building large scale datasets of satellite images for
training machine learning models
- Authors: Michail Tarasiou, Stefanos Zafeiriou
- Abstract summary: This report presents design considerations for automatically generating satellite imagery datasets for training machine learning models.
We discuss issues faced from the point of view of deep neural network training and evaluation.
- Score: 77.17638664503215
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This report presents design considerations for automatically generating
satellite imagery datasets for training machine learning models with emphasis
placed on dense classification tasks, e.g. semantic segmentation. The
implementation presented makes use of freely available Sentinel-2 data which
allows generation of large scale datasets required for training deep neural
networks. We discuss issues faced from the point of view of deep neural network
training and evaluation such as checking the quality of ground truth data and
comment on the scalability of the approach. Accompanying code is provided in
https://github.com/michaeltrs/DeepSatData.
Related papers
- AutoSynth: Learning to Generate 3D Training Data for Object Point Cloud
Registration [69.21282992341007]
AutoSynth automatically generates 3D training data for point cloud registration.
We replace the point cloud registration network with a much smaller surrogate network, leading to a $4056.43\times$ speedup.
Our results on TUD-L, LINEMOD and Occluded-LINEMOD evidence that a neural network trained on our searched dataset yields consistently better performance than the same one trained on the widely used ModelNet40 dataset.
arXiv Detail & Related papers (2023-09-20T09:29:44Z)
- Dataset Quantization [72.61936019738076]
We present dataset quantization (DQ), a new framework to compress large-scale datasets into small subsets.
DQ is the first method that can successfully distill large-scale datasets such as ImageNet-1k with a state-of-the-art compression ratio.
arXiv Detail & Related papers (2023-08-21T07:24:29Z)
- Defect Classification in Additive Manufacturing Using CNN-Based Vision
Processing [76.72662577101988]
This paper examines two scenarios: first, using convolutional neural networks (CNNs) to accurately classify defects in an image dataset from AM and second, applying active learning techniques to the developed classification model.
This allows the construction of a human-in-the-loop mechanism to reduce the size of the data required to train and generate training data.
arXiv Detail & Related papers (2023-07-14T14:36:58Z)
- Self-supervised Audiovisual Representation Learning for Remote Sensing Data [96.23611272637943]
We propose a self-supervised approach for pre-training deep neural networks in remote sensing.
By exploiting the correspondence between geo-tagged audio recordings and remote sensing, this is done in a completely label-free manner.
We show that our approach outperforms existing pre-training strategies for remote sensing imagery.
arXiv Detail & Related papers (2021-08-02T07:50:50Z)
- Generating synthetic photogrammetric data for training deep learning
based 3D point cloud segmentation models [0.0]
At I/ITSEC 2019, the authors presented a fully-automated workflow to segment 3D photogrammetric point-clouds/meshes and extract object information.
The ultimate goal is to create realistic virtual environments and provide the necessary information for simulation.
arXiv Detail & Related papers (2020-08-21T18:50:42Z)
- SatImNet: Structured and Harmonised Training Data for Enhanced Satellite
Imagery Classification [0.32228025627337864]
We describe procedures of open-source training data management, integration, and data retrieval.
We propose SatImNet, a collection of open training data, structured and harmonized according to specific rules.
Two modelling approaches based on convolutional neural networks have been designed and configured to deal with satellite image classification and segmentation.
arXiv Detail & Related papers (2020-06-18T15:46:24Z)
- Dataset Condensation with Gradient Matching [36.14340188365505]
We propose a training set synthesis technique for data-efficient learning, called Dataset Condensation, that learns to condense a large dataset into a small set of informative synthetic samples for training deep neural networks from scratch.
We rigorously evaluate its performance in several computer vision benchmarks and demonstrate that it significantly outperforms the state-of-the-art methods.
arXiv Detail & Related papers (2020-06-10T16:30:52Z)
- Neural Data Server: A Large-Scale Search Engine for Transfer Learning
Data [78.74367441804183]
We introduce Neural Data Server (NDS), a large-scale search engine for finding the most useful transfer learning data to the target domain.
NDS consists of a dataserver which indexes several large popular image datasets, and aims to recommend data to a client.
We show the effectiveness of NDS in various transfer learning scenarios, demonstrating state-of-the-art performance on several target datasets.
arXiv Detail & Related papers (2020-01-09T01:21:30Z)