PAIRS AutoGeo: an Automated Machine Learning Framework for Massive
Geospatial Data
- URL: http://arxiv.org/abs/2012.06907v1
- Date: Sat, 12 Dec 2020 21:12:41 GMT
- Title: PAIRS AutoGeo: an Automated Machine Learning Framework for Massive
Geospatial Data
- Authors: Wang Zhou, Levente J. Klein, Siyuan Lu
- Abstract summary: An automated machine learning framework for geospatial data named PAIRS AutoGeo is introduced on the IBM PAIRS Geoscope big data and analytics platform.
The framework gathers the required data at the location coordinates, assembles the training data, performs quality checks, and trains multiple machine learning models for subsequent deployment.
This use case exemplifies how PAIRS AutoGeo enables users to leverage machine learning without extensive geospatial expertise.
- Score: 7.742399489996169
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: An automated machine learning framework for geospatial data named PAIRS
AutoGeo is introduced on IBM PAIRS Geoscope big data and analytics platform.
The framework simplifies the development of industrial machine learning
solutions leveraging geospatial data to the extent that the user inputs are
minimized to merely a text file containing labeled GPS coordinates. PAIRS
AutoGeo automatically gathers required data at the location coordinates,
assembles the training data, performs quality check, and trains multiple
machine learning models for subsequent deployment. The framework is validated
using a realistic industrial use case of tree species classification.
Open-source tree species data are used as the input to train a random forest
classifier and a modified ResNet model for 10-way tree species classification
based on aerial imagery, which leads to an accuracy of $59.8\%$ and $81.4\%$,
respectively. This use case exemplifies how PAIRS AutoGeo enables users to
leverage machine learning without extensive geospatial expertise.
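The abstract describes the user input as a text file of labeled GPS coordinates, followed by an automated quality check before training. The sketch below illustrates only that input-handling step, in plain Python. The `lat,lon,label` row layout, the `LabeledPoint` type, the function names, and the `min_per_class` threshold are all assumptions made for illustration; the paper's actual file format and checks are not given here, and no PAIRS API is called.

```python
import csv
import io
from collections import Counter
from typing import List, NamedTuple


class LabeledPoint(NamedTuple):
    """One labeled location (hypothetical layout: lat, lon, label)."""
    lat: float
    lon: float
    label: str


def parse_labeled_points(text: str) -> List[LabeledPoint]:
    """Parse 'lat,lon,label' rows, skipping blank, header, and malformed rows."""
    points = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) != 3:
            continue  # blank line or wrong number of fields
        try:
            lat, lon = float(row[0]), float(row[1])
        except ValueError:
            continue  # header row or non-numeric coordinates
        label = row[2].strip()
        if label:
            points.append(LabeledPoint(lat, lon, label))
    return points


def quality_check(points: List[LabeledPoint], min_per_class: int = 2) -> List[LabeledPoint]:
    """Drop out-of-range coordinates, exact duplicates, and under-sampled classes."""
    seen = set()
    valid = []
    for p in points:
        if not (-90.0 <= p.lat <= 90.0 and -180.0 <= p.lon <= 180.0):
            continue  # not a valid latitude/longitude pair
        if p in seen:
            continue  # exact duplicate of an earlier row
        seen.add(p)
        valid.append(p)
    counts = Counter(p.label for p in valid)
    return [p for p in valid if counts[p.label] >= min_per_class]


sample = (
    "lat,lon,label\n"
    "40.71,-74.00,oak\n"
    "40.72,-74.01,oak\n"
    "95.0,10.0,oak\n"
    "40.71,-74.00,oak\n"
    "41.0,-73.9,maple\n"
)
clean = quality_check(parse_labeled_points(sample))
```

In this sample, the out-of-range row, the duplicate row, and the single `maple` point are filtered out, leaving the two distinct `oak` points. A real pipeline would then fetch imagery at each remaining coordinate and train the classifiers, which is outside the scope of this sketch.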
Related papers
- Geo-FuB: A Method for Constructing an Operator-Function Knowledge Base for Geospatial Code Generation Tasks Using Large Language Models [0.5242869847419834]
This study introduces a framework to construct such a knowledge base, leveraging geospatial script semantics.
An example knowledge base, Geo-FuB, built from 154,075 Google Earth Engine scripts, is available on GitHub.
arXiv Detail & Related papers (2024-10-28T12:50:27Z)
- An Autonomous GIS Agent Framework for Geospatial Data Retrieval [0.0]
This study proposes an autonomous GIS agent framework capable of retrieving required geospatial data.
We developed a prototype agent based on the framework, released as a QGIS plugin (GeoData Retrieve Agent) and a Python program.
Experiment results demonstrate its capability of retrieving data from various sources including OpenStreetMap, administrative boundaries and demographic data from the US Census Bureau.
arXiv Detail & Related papers (2024-07-13T14:23:57Z)
- GeoLLM: Extracting Geospatial Knowledge from Large Language Models [49.20315582673223]
We present GeoLLM, a novel method that can effectively extract geospatial knowledge from large language models.
We demonstrate the utility of our approach across multiple tasks of central interest to the international community, including the measurement of population density and economic livelihoods.
Our experiments reveal that LLMs are remarkably sample-efficient, rich in geospatial information, and robust across the globe.
arXiv Detail & Related papers (2023-10-10T00:03:23Z)
- Scalable Label-efficient Footpath Network Generation Using Remote Sensing Data and Self-supervised Learning [7.796025683842462]
This work implements an automatic pipeline for generating footpath networks based on remote sensing images using machine learning models.
Considering supervised methods require large amounts of training data, we use a self-supervised method for feature representation learning to reduce annotation requirements.
Footpath polygons are extracted and converted to footpath networks which can be loaded and visualized by geographic information systems conveniently.
arXiv Detail & Related papers (2023-09-18T02:56:40Z)
- GeoGLUE: A GeoGraphic Language Understanding Evaluation Benchmark [56.08664336835741]
We propose a GeoGraphic Language Understanding Evaluation benchmark, named GeoGLUE.
We collect data from open-released geographic resources and introduce six natural language understanding tasks.
We provide evaluation experiments and analysis of general baselines, indicating the effectiveness and significance of the GeoGLUE benchmark.
arXiv Detail & Related papers (2023-05-11T03:21:56Z)
- Satellite Image Time Series Analysis for Big Earth Observation Data [50.591267188664666]
This paper describes sits, an open-source R package for satellite image time series analysis using machine learning.
We show that this approach produces high accuracy for land use and land cover maps through a case study in the Cerrado biome.
arXiv Detail & Related papers (2022-04-24T15:23:25Z)
- AutoGeoLabel: Automated Label Generation for Geospatial Machine Learning [69.47585818994959]
We evaluate a big data processing pipeline to auto-generate labels for remote sensing data.
We utilize the big geo-data platform IBM PAIRS to dynamically generate such labels in dense urban areas.
arXiv Detail & Related papers (2022-01-31T20:02:22Z)
- TorchGeo: deep learning with geospatial data [24.789143032205736]
We introduce TorchGeo, a Python library for integrating geospatial data into the PyTorch deep learning ecosystem.
TorchGeo provides benchmark datasets, composable datasets for generic geospatial data sources, samplers for geospatial data, and transforms that work with multispectral imagery.
TorchGeo is also the first library to provide pre-trained models for multispectral satellite imagery.
arXiv Detail & Related papers (2021-11-17T02:47:33Z)
- DeepSatData: Building large scale datasets of satellite images for training machine learning models [77.17638664503215]
This report presents design considerations for automatically generating satellite imagery datasets for training machine learning models.
We discuss issues faced from the point of view of deep neural network training and evaluation.
arXiv Detail & Related papers (2021-04-28T15:13:12Z)
- PyODDS: An End-to-end Outlier Detection System with Automated Machine Learning [55.32009000204512]
We present PyODDS, an automated end-to-end Python system for Outlier Detection with Database Support.
Specifically, we define the search space in the outlier detection pipeline, and produce a search strategy within the given search space.
It also provides unified interfaces and visualizations for users with or without data science or machine learning background.
arXiv Detail & Related papers (2020-03-12T03:30:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.