Taking it further: leveraging pseudo labels for field delineation across
label-scarce smallholder regions
- URL: http://arxiv.org/abs/2312.08384v1
- Date: Tue, 12 Dec 2023 08:39:07 GMT
- Authors: Philippe Rufin, Sherrie Wang, Sá Nogueira Lisboa, Jan Hemmerling,
Mirela G. Tulbure, Patrick Meyfroidt
- Abstract summary: This study explores opportunities of using sparse field delineation pseudo labels for fine-tuning models across geographies and sensor characteristics.
We build on a FracTAL ResUNet trained for crop field delineation in India (median field size of 0.24 ha) and use this pre-trained model to generate pseudo labels in Mozambique.
We then used the human-annotated labels and the pseudo labels for model fine-tuning and compared predictions against human field annotations.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Transfer learning allows for resource-efficient geographic transfer of
pre-trained field delineation models. However, the scarcity of labeled data for
complex and dynamic smallholder landscapes, particularly in Sub-Saharan Africa,
remains a major bottleneck for large-area field delineation. This study
explores opportunities of using sparse field delineation pseudo labels for
fine-tuning models across geographies and sensor characteristics. We build on a
FracTAL ResUNet trained for crop field delineation in India (median field size
of 0.24 ha) and use this pre-trained model to generate pseudo labels in
Mozambique (median field size of 0.06 ha). We designed multiple pseudo label
selection strategies and compared the quantities, area properties, seasonal
distribution, and spatial agreement of the pseudo labels against
human-annotated training labels (n = 1,512). We then used the human-annotated
labels and the pseudo labels for model fine-tuning and compared predictions
against human field annotations (n = 2,199). Our results indicate i) a good
baseline performance of the pre-trained model in both field delineation and
field size estimation, and ii) the added value of regional fine-tuning with
performance improvements in nearly all experiments. Moreover, we found iii)
substantial performance increases when using only pseudo labels (up to 77% of
the IoU increases and 68% of the RMSE decreases obtained by human labels), and
iv) additional performance increases when complementing human annotations with
pseudo labels. Pseudo labels can be efficiently generated at scale and thus
facilitate domain adaptation in label-scarce settings. The workflow presented
here is a stepping stone for overcoming the persisting data gaps in
heterogeneous smallholder agriculture of Sub-Saharan Africa, where labels are
commonly scarce.
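The abstract reports results as IoU increases, RMSE decreases, and the share of the human-label improvement that pseudo labels recover. The following is a minimal sketch of how such quantities could be computed, not the authors' evaluation code. The function names (`iou`, `rmse`, `share_of_gain`), the flat 0/1 mask representation, the reading of "77% of the IoU increases" as a ratio of gains over the pre-trained baseline, and all numeric values are assumptions made for illustration.

```python
import math


def iou(pred, ref):
    """Intersection over union of two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(ref):
        raise ValueError("masks must have the same length")
    inter = sum(1 for p, r in zip(pred, ref) if p and r)
    union = sum(1 for p, r in zip(pred, ref) if p or r)
    # Convention chosen here (an assumption): two empty masks agree perfectly.
    return inter / union if union else 1.0


def rmse(pred_sizes, ref_sizes):
    """Root mean squared error between predicted and reference field sizes (ha)."""
    if len(pred_sizes) != len(ref_sizes) or not ref_sizes:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - r) ** 2 for p, r in zip(pred_sizes, ref_sizes))
    return math.sqrt(squared / len(ref_sizes))


def share_of_gain(baseline, with_pseudo, with_human):
    """Fraction of the human-label improvement recovered by pseudo labels.

    Works for metrics that should rise (IoU) and for metrics that should
    fall (RMSE), because both differences then carry the same sign.
    """
    human_gain = with_human - baseline
    if human_gain == 0:
        raise ValueError("human labels produced no change over the baseline")
    return (with_pseudo - baseline) / human_gain


if __name__ == "__main__":
    # Hypothetical toy values, not results from the paper.
    print(iou([1, 1, 0, 0], [1, 0, 1, 0]))
    print(rmse([0.10, 0.04], [0.06, 0.06]))
    print(share_of_gain(0.50, 0.56, 0.60))  # IoU-style metric (higher is better)
    print(share_of_gain(0.10, 0.07, 0.05))  # RMSE-style metric (lower is better)
```

Expressing the pseudo-label result relative to the human-label result, rather than as an absolute score, makes the comparison independent of how well the pre-trained model already performs in the target region.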
Related papers
- A region-wide, multi-year set of crop field boundary labels for Africa [0.0]
We delineated field boundaries in 33,746 Planet images captured between 2017 and 2023 across the continent.
Quality metrics showed that label quality was moderately high (0.75) for measures of total field extent, but low regarding the number of individual fields delineated.
This sample provides valuable insight into regional agricultural characteristics, highlighting variations in the median field size and density.
arXiv Detail & Related papers (2024-12-24T15:14:58Z)
- Reducing Labeling Costs in Sentiment Analysis via Semi-Supervised Learning [0.0]
This study explores label propagation in semi-supervised learning.
We employ a transductive label propagation method based on the manifold assumption for text classification.
By extending labels based on cosine proximity within a nearest neighbor graph from network embeddings, we combine unlabeled data into supervised learning.
arXiv Detail & Related papers (2024-10-15T07:25:33Z)
- Downstream-Pretext Domain Knowledge Traceback for Active Learning [138.02530777915362]
We propose a downstream-pretext domain knowledge traceback (DOKT) method that traces the data interactions of downstream knowledge and pre-training guidance.
DOKT consists of a traceback diversity indicator and a domain-based uncertainty estimator.
Experiments conducted on ten datasets show that our model outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2024-07-20T01:34:13Z)
- Weak Labeling for Cropland Mapping in Africa [3.5759681393339697]
Cropland mapping can play a vital role in addressing environmental, agricultural, and food security challenges.
In Africa, practical applications are often hindered by the limited availability of high-resolution cropland maps.
We propose an approach that utilizes unsupervised object clustering to refine existing weak labels.
arXiv Detail & Related papers (2024-01-13T08:45:41Z)
- All Points Matter: Entropy-Regularized Distribution Alignment for Weakly-supervised 3D Segmentation [67.30502812804271]
Pseudo-labels are widely employed in weakly supervised 3D segmentation tasks where only sparse ground-truth labels are available for learning.
We propose a novel learning strategy to regularize the generated pseudo-labels and effectively narrow the gaps between pseudo-labels and model predictions.
arXiv Detail & Related papers (2023-05-25T08:19:31Z)
- Debiased Pseudo Labeling in Self-Training [77.83549261035277]
Deep neural networks achieve remarkable performances on a wide range of tasks with the aid of large-scale labeled datasets.
To mitigate the requirement for labeled data, self-training is widely used in both academia and industry by pseudo labeling on readily-available unlabeled data.
We propose Debiased, in which the generation and utilization of pseudo labels are decoupled by two independent heads.
arXiv Detail & Related papers (2022-02-15T02:14:33Z)
- Semi-Supervised Learning with Taxonomic Labels [42.02670649470055]
We propose techniques to incorporate coarse taxonomic labels to train image classifiers in fine-grained domains.
On the Semi-iNat dataset consisting of 810 species across three Kingdoms, incorporating Phylum labels improves the Species level classification accuracy by 6%.
We propose a technique to select relevant data from a large collection of unlabeled images guided by the hierarchy which improves the robustness.
arXiv Detail & Related papers (2021-11-23T00:50:25Z)
- An Empirical Investigation of Learning from Biased Toxicity Labels [15.822714574671412]
We study how different training strategies can leverage a small dataset of human-annotated labels and a large but noisy dataset of synthetically generated labels.
We evaluate the accuracy and fairness properties of these approaches, and trade-offs between the two.
arXiv Detail & Related papers (2021-10-04T17:19:57Z)
- Semi-Supervised Domain Adaptation with Prototypical Alignment and Consistency Learning [86.6929930921905]
This paper studies how much it can help address domain shifts if we further have a few target samples labeled.
To explore the full potential of landmarks, we incorporate a prototypical alignment (PA) module which calculates a target prototype for each class from the landmarks.
Specifically, we severely perturb the labeled images, making PA non-trivial to achieve and thus promoting model generalizability.
arXiv Detail & Related papers (2021-04-19T08:46:08Z)
- Domain Adaptive Semantic Segmentation Using Weak Labels [115.16029641181669]
We propose a novel framework for domain adaptation in semantic segmentation with image-level weak labels in the target domain.
We develop a weak-label classification module to enforce the network to attend to certain categories.
In experiments, we show considerable improvements with respect to the existing state-of-the-arts in UDA and present a new benchmark in the WDA setting.
arXiv Detail & Related papers (2020-07-30T01:33:57Z)
- Semi-Automatic Data Annotation guided by Feature Space Projection [117.9296191012968]
We present a semi-automatic data annotation approach based on suitable feature space projection and semi-supervised label estimation.
We validate our method on the popular MNIST dataset and on images of human intestinal parasites with and without fecal impurities.
Our results demonstrate the added-value of visual analytics tools that combine complementary abilities of humans and machines for more effective machine learning.
arXiv Detail & Related papers (2020-07-27T17:03:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.