Enabling Plant Phenotyping in Weedy Environments using Multi-Modal Imagery via Synthetic and Generated Training Data
- URL: http://arxiv.org/abs/2509.19208v1
- Date: Tue, 23 Sep 2025 16:29:13 GMT
- Title: Enabling Plant Phenotyping in Weedy Environments using Multi-Modal Imagery via Synthetic and Generated Training Data
- Authors: Earl Ranario, Ismael Mayanja, Heesup Yun, Brian N. Bailey, J. Mason Earles
- Abstract summary: We trained models on 1,128 synthetic images containing complex mixtures of crop and weed plants. When combining all the synthetic images with a few labeled real images, we observed a maximum relative improvement of 22% for the weed class and 17% for the plant class.
- Score: 5.6545322206246516
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurate plant segmentation in thermal imagery remains a significant challenge for high-throughput field phenotyping, particularly in outdoor environments where low contrast between plants and weeds and frequent occlusions hinder performance. To address this, we present a framework that leverages synthetic RGB imagery, a limited set of real annotations, and GAN-based cross-modality alignment to enhance semantic segmentation in thermal images. We trained models on 1,128 synthetic images containing complex mixtures of crop and weed plants in order to generate image segmentation masks for crop and weed plants. We additionally evaluated the benefit of integrating as few as five real, manually segmented field images within the training process using various sampling strategies. When combining all the synthetic images with a few labeled real images, we observed a maximum relative improvement of 22% for the weed class and 17% for the plant class compared to the full real-data baseline. Cross-modal alignment was enabled by translating RGB to thermal using CycleGAN-turbo, allowing robust template matching without calibration. Results demonstrated that combining synthetic data with limited manual annotations and cross-domain translation via generative models can significantly boost segmentation performance in complex field environments for multi-modal imagery.
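Two pieces of the abstract lend themselves to short sketches. First, integrating a handful of real annotated images into a large synthetic training pool is commonly done by oversampling the real pool. The sketch below shows one such strategy in PyTorch; it is an illustration under assumptions, not the authors' released code, and the `ImageMaskFolder` class and directory layout are hypothetical stand-ins.

```python
# Minimal sketch (not the authors' released code) of oversampling a handful
# of real annotated images when mixing them with a large synthetic pool.
# The ImageMaskFolder class and directory layout are hypothetical.
from pathlib import Path

import torch
from torch.utils.data import (ConcatDataset, DataLoader, Dataset,
                              WeightedRandomSampler)


class ImageMaskFolder(Dataset):
    """Hypothetical dataset pairing each image with its segmentation mask."""

    def __init__(self, root: str):
        self.images = sorted(Path(root, "images").glob("*.png"))
        self.masks = sorted(Path(root, "masks").glob("*.png"))

    def __len__(self) -> int:
        return len(self.images)

    def __getitem__(self, i):
        # Real code would decode the files and apply transforms here.
        return str(self.images[i]), str(self.masks[i])


synthetic = ImageMaskFolder("data/synthetic")      # ~1,128 synthetic images
real = ImageMaskFolder("data/real_annotated")      # as few as 5 real images
combined = ConcatDataset([synthetic, real])        # synthetic indices first

# One plausible strategy: give each pool equal total weight, so the few
# real images are drawn far more often than any single synthetic image.
weights = torch.cat([
    torch.full((len(synthetic),), 1.0 / len(synthetic)),
    torch.full((len(real),), 1.0 / len(real)),
])
sampler = WeightedRandomSampler(weights, num_samples=len(combined),
                                replacement=True)
loader = DataLoader(combined, batch_size=8, sampler=sampler)
```

With these weights, a draw is equally likely to come from either pool; other strategies, such as a fixed real-to-synthetic ratio per batch or epoch-level duplication, are equally plausible readings of "various sampling strategies". Second, the cross-modal alignment step translates RGB into a thermal-looking image and then template-matches it against the real thermal frame. A minimal sketch of that idea using OpenCV's normalized cross-correlation follows; `translate_rgb_to_thermal` is a hypothetical placeholder for a CycleGAN-turbo-style generator, and the paper's exact matching procedure may differ.

```python
# Minimal sketch of the cross-modal alignment idea: translate the RGB frame
# into a thermal-looking image, then locate it in the real thermal frame via
# normalized cross-correlation. translate_rgb_to_thermal is a hypothetical
# placeholder for a CycleGAN-turbo-style generator; both inputs are assumed
# to be 2-D arrays, with the translated patch no larger than the thermal frame.
import cv2
import numpy as np


def align_by_template(rgb, thermal, translate_rgb_to_thermal):
    """Return (top_left_offset, score) of the best match in the thermal frame."""
    fake = translate_rgb_to_thermal(rgb)  # hypothetical model call -> 2-D array
    fake = cv2.normalize(fake, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    therm = cv2.normalize(thermal, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    scores = cv2.matchTemplate(therm, fake, cv2.TM_CCOEFF_NORMED)
    _, best_score, _, top_left = cv2.minMaxLoc(scores)
    return top_left, best_score
```

If the RGB and thermal fields of view differ in scale, a multi-scale search over resized templates would be needed; the abstract does not detail the exact matching procedure.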
Related papers
- Generative diffusion models for agricultural AI: plant image generation, indoor-to-outdoor translation, and expert preference alignment [0.683514883811771]
The success of agricultural artificial intelligence depends heavily on large, diverse, and high-quality plant image datasets. This paper investigates diffusion-based generative modeling to address these challenges through plant image synthesis, indoor-to-outdoor translation, and expert-preference-aligned fine-tuning.
arXiv Detail & Related papers (2025-12-22T18:07:08Z)
- Synthetic Crop-Weed Image Generation and its Impact on Model Generalization [0.8849672280563691]
We present a pipeline for procedural generation of synthetic crop-weed images using Blender. We benchmark several state-of-the-art segmentation models on synthetic and real datasets. Our results show that training on synthetic images leads to a sim-to-real gap of 10%, surpassing previous state-of-the-art methods.
arXiv Detail & Related papers (2025-11-04T09:47:09Z)
- Improving Synthetic Image Detection Towards Generalization: An Image Transformation Perspective [45.210030086193775]
Current synthetic image detection (SID) pipelines are primarily dedicated to crafting universal artifact features. We propose SAFE, a lightweight and effective detector built on three simple image transformations. Our pipeline achieves new state-of-the-art performance, with improvements of 4.5% in accuracy and 2.9% in average precision over existing methods.
arXiv Detail & Related papers (2024-08-13T09:01:12Z)
- Is Synthetic Image Useful for Transfer Learning? An Investigation into Data Generation, Volume, and Utilization [62.157627519792946]
We introduce a novel framework called bridged transfer, which initially employs synthetic images for fine-tuning a pre-trained model to improve its transferability.
We propose a dataset style inversion strategy to improve the stylistic alignment between synthetic and real images.
Our proposed methods are evaluated across 10 different datasets and 5 distinct models, demonstrating consistent improvements.
arXiv Detail & Related papers (2024-03-28T22:25:05Z)
- Generating Diverse Agricultural Data for Vision-Based Farming Applications [74.79409721178489]
This model is capable of simulating distinct growth stages of plants, diverse soil conditions, and randomized field arrangements under varying lighting conditions.
Our dataset includes 12,000 images with semantic labels, offering a comprehensive resource for computer vision tasks in precision agriculture.
arXiv Detail & Related papers (2024-03-27T08:42:47Z)
- Exposure Bracketing Is All You Need For A High-Quality Image [50.822601495422916]
Multi-exposure images are complementary in denoising, deblurring, high dynamic range imaging, and super-resolution. In this work, we propose to utilize exposure bracketing photography to obtain a high-quality image by combining these tasks. In particular, a temporally modulated recurrent network (TMRNet) and a self-supervised adaptation method are proposed.
arXiv Detail & Related papers (2024-01-01T14:14:35Z)
- Perceptual Artifacts Localization for Image Synthesis Tasks [59.638307505334076]
We introduce a novel dataset comprising 10,168 generated images, each annotated with per-pixel perceptual artifact labels.
A segmentation model, trained on our proposed dataset, effectively localizes artifacts across a range of tasks.
We propose an innovative zoom-in inpainting pipeline that seamlessly rectifies perceptual artifacts in the generated images.
arXiv Detail & Related papers (2023-10-09T10:22:08Z)
- SAMPLING: Scene-adaptive Hierarchical Multiplane Images Representation for Novel View Synthesis from a Single Image [60.52991173059486]
We introduce SAMPLING, a Scene-adaptive Hierarchical Multiplane Images Representation for Novel View Synthesis from a Single Image.
Our method demonstrates considerable performance gains in large-scale unbounded outdoor scenes using a single image on the KITTI dataset.
arXiv Detail & Related papers (2023-09-12T15:33:09Z)
- Less is More: Unsupervised Mask-guided Annotated CT Image Synthesis with Minimum Manual Segmentations [2.1785903900600316]
We propose a novel strategy for medical image synthesis, namely Unsupervised Mask (UM)-guided synthesis.
UM-guided synthesis provided high-quality synthetic images with significantly higher fidelity, variety, and utility.
arXiv Detail & Related papers (2023-03-19T20:30:35Z)
- SIAN: Style-Guided Instance-Adaptive Normalization for Multi-Organ Histopathology Image Synthesis [63.845552349914186]
We propose a style-guided instance-adaptive normalization (SIAN) to synthesize realistic color distributions and textures for different organs.
The four phases work together and are integrated into a generative network to embed image semantics, style, and instance-level boundaries.
arXiv Detail & Related papers (2022-09-02T16:45:46Z)
- Multi-Spectral Image Synthesis for Crop/Weed Segmentation in Precision Farming [3.4788711710826083]
We propose an alternative to common data augmentation methods, applied to the problem of crop/weed segmentation in precision farming.
We create semi-artificial samples by replacing the most relevant object classes (i.e., crop and weeds) with their synthesized counterparts.
In addition to RGB data, we also take into account near-infrared (NIR) information, generating four-channel multi-spectral synthetic images.
arXiv Detail & Related papers (2020-09-12T08:49:36Z)
- CNN Detection of GAN-Generated Face Images based on Cross-Band Co-occurrences Analysis [34.41021278275805]
Latest-generation GAN models can generate synthetic images that are visually indistinguishable from natural ones.
We propose a method for distinguishing GAN-generated from natural images by exploiting inconsistencies among spectral bands.
arXiv Detail & Related papers (2020-07-25T10:55:04Z)
- Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, in addition to the frequently used VGG feature matching loss, we design a novel self-guided regression loss.
We also employ a discriminator with local and global branches to ensure local-global contents consistency.
arXiv Detail & Related papers (2020-02-07T03:45:25Z)