An Open Benchmark Dataset for GeoAI Foundation Models for Oil Palm Mapping in Indonesia
- URL: http://arxiv.org/abs/2509.08303v1
- Date: Wed, 10 Sep 2025 05:59:00 GMT
- Title: An Open Benchmark Dataset for GeoAI Foundation Models for Oil Palm Mapping in Indonesia
- Authors: M. Warizmi Wafiq, Peter Cutter, Ate Poortinga, Daniel Marc G. dela Torre, Karis Tenneson, Vanna Teck, Enikoe Bihari, Chanarun Saisaward, Weraphong Suaruang, Andrea McMahon, Andi Vika Faradiba Muin, Karno B. Batiran, Chairil A, Nurul Qomar, Arya Arismaya Metananda, David Ganz, David Saah
- Abstract summary: Oil palm cultivation remains one of the leading causes of deforestation in Indonesia. We present an open-access dataset of oil palm plantations and related land cover types in Indonesia, produced through expert labeling of high-resolution satellite imagery from 2020 to 2024. The dataset provides polygon-based, wall-to-wall annotations across a range of agro-ecological zones and includes a hierarchical typology that distinguishes oil palm planting stages as well as similar perennial crops.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Oil palm cultivation remains one of the leading causes of deforestation in Indonesia. To better track and address this challenge, detailed and reliable mapping is needed to support sustainability efforts and emerging regulatory frameworks. We present an open-access geospatial dataset of oil palm plantations and related land cover types in Indonesia, produced through expert labeling of high-resolution satellite imagery from 2020 to 2024. The dataset provides polygon-based, wall-to-wall annotations across a range of agro-ecological zones and includes a hierarchical typology that distinguishes oil palm planting stages as well as similar perennial crops. Quality was ensured through multi-interpreter consensus and field validation. The dataset was created using wall-to-wall digitization over large grids, making it suitable for training and benchmarking both conventional convolutional neural networks and newer geospatial foundation models. Released under a CC-BY license, it fills a key gap in training data for remote sensing and aims to improve the accuracy of land cover type mapping. By supporting transparent monitoring of oil palm expansion, the resource contributes to global deforestation reduction goals and follows FAIR data principles.
Related papers
- Geodiffussr: Generative Terrain Texturing with Elevation Fidelity [48.82552523546255]
We introduce Geodiffussr, a flow-matching pipeline that synthesizes text-guided texture maps. The core mechanism is multi-scale content aggregation (MCA): DEM features are injected into UNet blocks at multiple resolutions to enforce global-to-local elevation consistency. To train and evaluate Geodiffussr, we assemble a globally distributed, biome- and climate-stratified corpus of triplets pairing SRTM-derived DEMs with Sentinel-2 imagery and vision-grounded natural-appearance captions.
arXiv Detail & Related papers (2025-11-28T09:52:44Z) - Fine-Scale Soil Mapping in Alaska with Multimodal Machine Learning [1.4786253394033289]
High-resolution soil maps are essential for characterizing permafrost distribution, identifying vulnerable areas, and informing adaptation strategies. We present MISO, a vision-based machine learning (ML) model to produce statewide fine-scale soil maps for near-surface permafrost and soil taxonomy. We compare MISO with Random Forest (RF), a traditional ML model that has been widely used in soil mapping applications.
arXiv Detail & Related papers (2025-06-17T22:09:48Z) - Data Augmentation and Resolution Enhancement using GANs and Diffusion Models for Tree Segmentation [49.13393683126712]
Urban forests play a key role in enhancing environmental quality and supporting biodiversity in cities. Accurately detecting trees is challenging due to complex landscapes and the variability in image resolution caused by different satellite sensors or UAV flight altitudes. We propose a novel pipeline that integrates domain adaptation with GANs and Diffusion models to enhance the quality of low-resolution aerial images.
arXiv Detail & Related papers (2025-05-21T03:57:10Z) - EarthMapper: Visual Autoregressive Models for Controllable Bidirectional Satellite-Map Translation [50.433911327489554]
We introduce EarthMapper, a novel framework for controllable satellite-map translation. We also contribute CNSatMap, a large-scale dataset comprising 302,132 precisely aligned satellite-map pairs across 38 Chinese cities. Experiments on CNSatMap and the New York dataset demonstrate EarthMapper's superior performance.
arXiv Detail & Related papers (2025-04-28T02:41:12Z) - Detection and Geographic Localization of Natural Objects in the Wild: A Case Study on Palms [10.342516627774438]
We develop PRISM (Processing, Inference, and Mapping), a flexible pipeline for detecting and localizing palms in dense tropical forests using large orthomosaic images. Our contributions are threefold. First, we construct a large UAV-derived orthomosaic dataset collected across 21 ecologically diverse sites in western Ecuador, annotated with 8,830 bounding boxes and 5,026 palm center points. Second, we evaluate multiple state-of-the-art object detectors based on efficiency and performance, integrating zero-shot SAM 2 as the segmentation backbone. Third, we apply calibration methods to align confidence scores with IoU and explore s
arXiv Detail & Related papers (2025-02-18T16:43:11Z) - Real-Time Localization and Bimodal Point Pattern Analysis of Palms Using UAV Imagery [13.085752393960886]
We introduce PalmDSNet, a deep learning framework for real-time detection, segmentation, and counting of canopy palms.
We use UAV-captured imagery to create orthomosaics from 21 sites across western Ecuadorian tropical forests.
Expert annotations were used to create a comprehensive dataset, including 7,356 bounding boxes on image patches and 7,603 palm centers across five orthomosaics.
arXiv Detail & Related papers (2024-10-14T22:23:10Z) - Global High Categorical Resolution Land Cover Mapping via Weak Supervision [19.52604717907002]
We propose to combine a fully labeled source domain and a weakly labeled target domain for weakly supervised domain adaptation (WSDA).
This is beneficial as the utilization of sparse and coarse weak labels can considerably alleviate the labor required for precise and detailed land cover annotation.
We carry out high categorical resolution land cover mapping for 10 cities in different regions around the world, severally using PlanetScope, Gaofen-1, and Sentinel-2 satellite images.
arXiv Detail & Related papers (2024-06-02T23:18:12Z) - A community palm model [0.28391011428068197]
Palm oil production has been identified as one of the major drivers of deforestation for tropical countries.
To meet supply chain objectives, commodity producers and other stakeholders need timely information of land cover dynamics in their supply shed.
Here we present a "community model," a machine learning model trained on pooled data sourced from many different stakeholders.
arXiv Detail & Related papers (2024-05-01T15:18:01Z) - Deep Efficient Private Neighbor Generation for Subgraph Federated Learning [57.39918843245229]
We propose FedDEP to tackle the challenge of incomplete information propagation on local subgraphs due to missing cross-subgraph neighbors.
FedDEP consists of a series of novel technical designs: (1) Deep neighbor generation through leveraging the GNN embeddings of potential missing neighbors; (2) Efficient pseudo-FL for neighbor generation through embedding prototyping; and (3) Privacy protection through noise-less edge-local-differential-privacy.
arXiv Detail & Related papers (2024-01-09T03:29:40Z) - GeoNet: Benchmarking Unsupervised Adaptation across Geographies [71.23141626803287]
We study the problem of geographic robustness and make three main contributions.
First, we introduce a large-scale dataset GeoNet for geographic adaptation.
Second, we hypothesize that the major source of domain shift arises from significant variations in scene context.
Third, we conduct an extensive evaluation of several state-of-the-art unsupervised domain adaptation algorithms and architectures.
arXiv Detail & Related papers (2023-03-27T17:59:34Z) - Jalisco's multiclass land cover analysis and classification using a novel lightweight convnet with real-world multispectral and relief data [51.715517570634994]
We present our novel lightweight (only 89k parameters) Convolutional Neural Network (ConvNet) for land cover (LC) classification and analysis.
In this work, we combine three real-world open data sources to obtain 13 channels.
Our embedded analysis anticipates the limited performance in some classes and gives us the opportunity to group the most similar.
arXiv Detail & Related papers (2022-01-26T14:58:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.