AutoGeoLabel: Automated Label Generation for Geospatial Machine Learning
- URL: http://arxiv.org/abs/2202.00067v1
- Date: Mon, 31 Jan 2022 20:02:22 GMT
- Title: AutoGeoLabel: Automated Label Generation for Geospatial Machine Learning
- Authors: Conrad M Albrecht, Fernando Marianno, Levente J Klein
- Abstract summary: We evaluate a big data processing pipeline to auto-generate labels for remote sensing data.
We utilize the big geo-data platform IBM PAIRS to dynamically generate such labels in dense urban areas.
- Score: 69.47585818994959
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A key challenge of supervised learning is the availability of human-labeled
data. We evaluate a big data processing pipeline to auto-generate labels for
remote sensing data. It is based on rasterized statistical features extracted
from surveys such as LiDAR measurements. Using simple combinations of the
rasterized statistical layers, it is demonstrated that multiple classes can be
generated at accuracies of ~0.9. As proof of concept, we utilize the big
geo-data platform IBM PAIRS to dynamically generate such labels in dense urban
areas with multiple land cover classes. The general method proposed here is
platform independent, and it can be adapted to generate labels for other
satellite modalities in order to enable machine learning on overhead imagery
for land use classification and object detection.
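The "simple combinations of the rasterized statistical layers" mentioned in the abstract could look roughly like the sketch below. It is only an illustration: the layer names (`height`, `returns`), the three classes, and both thresholds are hypothetical and are not taken from the paper or from IBM PAIRS. The rule relies on the general observation that vegetation tends to yield several LiDAR returns per pulse, whereas solid roofs mostly yield one.

```python
# Hypothetical sketch: rule-based labels from rasterized LiDAR statistics.
# Layer names, classes, and thresholds are illustrative assumptions,
# not values from the paper.

GROUND, VEGETATION, BUILDING = 0, 1, 2


def label_pixel(height_m, return_count_mean, min_height_m=2.0, multi_return=1.5):
    """Combine two statistical layers into one class label for a single pixel."""
    if height_m < min_height_m:
        return GROUND
    # Elevated pixels with many returns per pulse are treated as canopy.
    if return_count_mean >= multi_return:
        return VEGETATION
    return BUILDING


def label_raster(height, returns):
    """Apply the pixel rule to two aligned rasters given as lists of rows."""
    return [
        [label_pixel(h, r) for h, r in zip(h_row, r_row)]
        for h_row, r_row in zip(height, returns)
    ]


def accuracy(predicted, reference):
    """Fraction of pixels where the generated label matches a reference label."""
    pairs = [
        (p, t)
        for p_row, t_row in zip(predicted, reference)
        for p, t in zip(p_row, t_row)
    ]
    return sum(p == t for p, t in pairs) / len(pairs)


# Tiny 2x2 example rasters (made-up numbers).
height = [[0.2, 8.0], [12.5, 0.1]]   # mean height above ground in metres
returns = [[1.0, 2.4], [1.0, 1.1]]   # mean number of returns per pulse
labels = label_raster(height, returns)
```

Comparing `labels` against a small hand-labeled reference raster with `accuracy` is one way to check whether such rules reach the level of agreement the abstract reports; in practice the thresholds would need tuning per survey.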
Related papers
- Self-Supervised Learning for User Localization [8.529237718266042]
Machine learning techniques have shown remarkable accuracy in localization tasks.
Their dependency on vast amounts of labeled data, particularly Channel State Information (CSI) and corresponding coordinates, remains a bottleneck.
We propose a pioneering approach that leverages self-supervised pretraining on unlabeled data to boost the performance of supervised learning for user localization based on CSI.
arXiv Detail & Related papers (2024-04-19T21:49:10Z)
- Scalable Label-efficient Footpath Network Generation Using Remote Sensing Data and Self-supervised Learning [7.796025683842462]
This work implements an automatic pipeline for generating footpath networks based on remote sensing images using machine learning models.
Considering supervised methods require large amounts of training data, we use a self-supervised method for feature representation learning to reduce annotation requirements.
Footpath polygons are extracted and converted to footpath networks that geographic information systems can conveniently load and visualize.
arXiv Detail & Related papers (2023-09-18T02:56:40Z)
- A Benchmark Generative Probabilistic Model for Weak Supervised Learning [2.0257616108612373]
Weak Supervised Learning approaches have been developed to alleviate the annotation burden.
We show that probabilistic latent variable models (PLVMs) achieve state-of-the-art performance across four datasets.
arXiv Detail & Related papers (2023-03-31T07:06:24Z)
- Navya3DSeg -- Navya 3D Semantic Segmentation Dataset & split generation for autonomous vehicles [63.20765930558542]
3D semantic data are useful for core perception tasks such as obstacle detection and ego-vehicle localization.
We propose a new dataset, Navya 3D (Navya3DSeg), with a diverse label space corresponding to a large scale production grade operational domain.
It contains 23 labeled sequences and 25 supplementary sequences without labels, designed to explore self-supervised and semi-supervised semantic segmentation benchmarks on point clouds.
arXiv Detail & Related papers (2023-02-16T13:41:19Z)
- The Word is Mightier than the Label: Learning without Pointillistic Labels using Data Programming [11.536162323162099]
Most advanced supervised Machine Learning (ML) models rely on vast amounts of point-by-point labelled training examples.
Hand-labelling vast amounts of data may be tedious, expensive, and error-prone.
arXiv Detail & Related papers (2021-08-24T19:11:28Z)
- Simple multi-dataset detection [83.9604523643406]
We present a simple method for training a unified detector on multiple large-scale datasets.
We show how to automatically integrate dataset-specific outputs into a common semantic taxonomy.
Our approach does not require manual taxonomy reconciliation.
arXiv Detail & Related papers (2021-02-25T18:55:58Z)
- PAIRS AutoGeo: an Automated Machine Learning Framework for Massive Geospatial Data [7.742399489996169]
An automated machine learning framework for geospatial data named PAIRS AutoGeo is introduced on the IBM PAIRS Geoscope big data and analytics platform.
The framework gathers required data at the location coordinates, assembles the training data, performs quality checks, and trains multiple machine learning models for subsequent deployment.
This use case exemplifies how PAIRS AutoGeo enables users to leverage machine learning without extensive geospatial expertise.
arXiv Detail & Related papers (2020-12-12T21:12:41Z)
- SLADE: A Self-Training Framework For Distance Metric Learning [75.54078592084217]
We present a self-training framework, SLADE, to improve retrieval performance by leveraging additional unlabeled data.
We first train a teacher model on the labeled data and use it to generate pseudo labels for the unlabeled data.
We then train a student model on both labels and pseudo labels to generate final feature embeddings.
arXiv Detail & Related papers (2020-11-20T08:26:10Z)
- Adversarial Knowledge Transfer from Unlabeled Data [62.97253639100014]
We present a novel Adversarial Knowledge Transfer framework for transferring knowledge from internet-scale unlabeled data to improve the performance of a classifier.
An important novel aspect of our method is that the unlabeled source data can be of different classes from those of the labeled target data, and there is no need to define a separate pretext task.
arXiv Detail & Related papers (2020-08-13T08:04:27Z)
- Weakly-Supervised Salient Object Detection via Scribble Annotations [54.40518383782725]
We propose a weakly-supervised salient object detection model to learn saliency from scribble labels.
We present a new metric, termed saliency structure measure, to measure the structure alignment of the predicted saliency maps.
Our method not only outperforms existing weakly-supervised/unsupervised methods, but is also on par with several fully-supervised state-of-the-art models.
arXiv Detail & Related papers (2020-03-17T12:59:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.