From Field to Drone: Domain Drift Tolerant Automated Multi-Species and Damage Plant Semantic Segmentation for Herbicide Trials
- URL: http://arxiv.org/abs/2508.07514v1
- Date: Mon, 11 Aug 2025 00:08:42 GMT
- Title: From Field to Drone: Domain Drift Tolerant Automated Multi-Species and Damage Plant Semantic Segmentation for Herbicide Trials
- Authors: Artzai Picon, Itziar Eguskiza, Daniel Mugica, Javier Romero, Carlos Javier Jimenez, Eric White, Gabriel Do-Lago-Junqueira, Christian Klukas, Ramon Navarra-Mestre
- Abstract summary: We present a general-purpose self-supervised visual model with hierarchical inference based on botanical taxonomy. The model significantly improved species identification (F1-score: 0.52 to 0.85, R-squared: 0.75 to 0.98) and damage classification (F1-score: 0.28 to 0.44, R-squared: 0.71 to 0.87) over prior models. It is now deployed in BASF's phenotyping pipeline, enabling large-scale, automated crop and weed monitoring across diverse geographies.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Field trials are vital in herbicide research and development for assessing effects on crops and weeds under varied conditions. Traditionally, evaluations rely on manual visual assessments, which are time-consuming, labor-intensive, and subjective. Automating species and damage identification is challenging due to subtle visual differences, but it can greatly enhance efficiency and consistency. We present an improved segmentation model combining a general-purpose self-supervised visual model with hierarchical inference based on botanical taxonomy. Trained on a multi-year dataset (2018-2020) from Germany and Spain captured with digital and mobile cameras, the model was tested on digital camera data (year 2023) and drone imagery from the United States, Germany, and Spain (year 2024) to evaluate robustness under domain shift. This cross-device evaluation marks a key step in assessing the model's generalization across platforms. Our model significantly improved species identification (F1-score: 0.52 to 0.85, R-squared: 0.75 to 0.98) and damage classification (F1-score: 0.28 to 0.44, R-squared: 0.71 to 0.87) over prior methods. Under domain shift (drone images), it maintained strong performance with only moderate degradation (species: F1-score 0.60, R-squared 0.80; damage: F1-score 0.41, R-squared 0.62), where earlier models failed. These results confirm the model's robustness and real-world applicability. It is now deployed in BASF's phenotyping pipeline, enabling large-scale, automated crop and weed monitoring across diverse geographies.
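The abstract reports performance as per-species F1-scores and R-squared agreement with reference assessments. As a minimal sketch of how such metrics could be computed from segmentation outputs, the snippet below implements binary F1 (one species against all others) and the coefficient of determination in plain Python; the function names, species labels, and damage values are illustrative and not taken from the paper.

```python
def f1_score(y_true, y_pred, positive):
    """Binary F1 for one species label against all others."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def r_squared(observed, predicted):
    """Coefficient of determination, e.g. between visually assessed
    and model-estimated damage percentages per plot."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1 - ss_res / ss_tot

# Toy example: species labels per sampled pixel, damage % per plot.
true_species = ["maize", "weed", "maize", "weed", "maize"]
pred_species = ["maize", "weed", "weed", "weed", "maize"]
print(round(f1_score(true_species, pred_species, "maize"), 2))  # → 0.8

observed_damage = [10.0, 40.0, 70.0, 90.0]
predicted_damage = [12.0, 35.0, 72.0, 88.0]
print(round(r_squared(observed_damage, predicted_damage), 3))  # → 0.99
```

In practice, a multi-species evaluation would average such per-class F1 values (macro-averaging) across all crop and weed species in the trial.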
Related papers
- Vision Foundation Models in Agriculture: Toward Domain-Specific Adaptation for Weed Herbicide Trials Assessment [1.8430060563461854]
Herbicide field trials require accurate identification of plant species and assessment of herbicide-induced damage. In this work, we adapt a general-purpose vision foundation model to herbicide trial characterization. Trained using a self-supervised learning approach on a large, curated agricultural dataset, the model learns rich and transferable representations optimized for herbicide trial images.
arXiv Detail & Related papers (2025-11-06T11:30:32Z) - MSRANetV2: An Explainable Deep Learning Architecture for Multi-class Classification of Colorectal Histopathological Images [3.4859776888706233]
Colorectal cancer (CRC) is a leading worldwide cause of cancer-related mortality. Deep learning algorithms have become a powerful approach in enhancing diagnostic precision and efficiency. We propose a convolutional neural network architecture named MSRANetV2, specially optimized for the classification of colorectal tissue images.
arXiv Detail & Related papers (2025-10-28T07:22:34Z) - Domain-Robust Marine Plastic Detection Using Vision Models [0.0]
This study benchmarks models for cross-domain robustness, training convolutional neural networks and vision transformers. Two zero-shot models, CLIP ViT-L14 and Google's Gemini 2.0 Flash, were also assessed; these leverage pretraining to classify images without fine-tuning. Results show the lightweight MobileNetV2 delivers the strongest cross-domain performance (F1 0.97), surpassing larger models.
arXiv Detail & Related papers (2025-09-29T17:15:07Z) - Weed Detection in Challenging Field Conditions: A Semi-Supervised Framework for Overcoming Shadow Bias and Data Scarcity [7.019137213828947]
This study tackles both issues through a diagnostic-driven, semi-supervised framework. We use a unique dataset of approximately 975 labeled and 10,000 unlabeled images of Guinea Grass in sugarcane. Our work provides a clear and field-tested framework for developing, diagnosing, and improving robust computer vision systems.
arXiv Detail & Related papers (2025-08-27T01:55:47Z) - RoHOI: Robustness Benchmark for Human-Object Interaction Detection [84.78366452133514]
Human-Object Interaction (HOI) detection is crucial for robot-human assistance, enabling context-aware support. We introduce the first robustness benchmark for HOI detection, evaluating model resilience under diverse challenges. Our benchmark, RoHOI, includes 20 corruption types based on the HICO-DET and V-COCO datasets and a new robustness-focused metric.
arXiv Detail & Related papers (2025-07-12T01:58:04Z) - WorldPM: Scaling Human Preference Modeling [130.23230492612214]
We propose World Preference Modeling (WorldPM) to emphasize this scaling potential. We collect preference data from public forums covering diverse user communities. We conduct extensive training using 15M-scale data across models ranging from 1.5B to 72B parameters.
arXiv Detail & Related papers (2025-05-15T17:38:37Z) - scDrugMap: Benchmarking Large Foundation Models for Drug Response Prediction [5.058966163642362]
scDrugMap is an integrated framework featuring both a Python command-line interface and a web server for drug response prediction. scDrugMap evaluates a wide range of foundation models, including eight single-cell models and two large language models. scDrugMap provides the first large-scale benchmark of foundation models for drug response prediction in single-cell data.
arXiv Detail & Related papers (2025-05-08T19:46:19Z) - Enhancing Leaf Disease Classification Using GAT-GCN Hybrid Model [0.23301643766310373]
This research presents a hybrid model combining Graph Attention Networks (GATs) and Graph Convolution Networks (GCNs) for leaf disease classification. GCNs have been widely used for learning from graph-structured data, and GATs enhance this by incorporating attention mechanisms to focus on the most important neighbors. The edge augmentation technique has introduced a significant degree of generalization in the detection capabilities of the model.
arXiv Detail & Related papers (2025-04-07T06:31:38Z) - Segment-and-Classify: ROI-Guided Generalizable Contrast Phase Classification in CT Using XGBoost [7.689389068258514]
This study utilized three public CT datasets from separate institutions. The phase prediction model was trained on the WAW-TACE dataset and validated on the VinDr-Multiphase and C4KC-KiTS datasets.
arXiv Detail & Related papers (2025-01-23T20:01:33Z) - Patch-Based and Non-Patch-Based inputs Comparison into Deep Neural Models: Application for the Segmentation of Retinal Diseases on Optical Coherence Tomography Volumes [0.3749861135832073]
Approximately 170 million people worldwide have been diagnosed with AMD, a figure anticipated to rise to 288 million by 2040. Deep learning networks have shown promising results in both image- and pixel-level 2D scan classification. The highest DSC score for a patch-based model was 0.88, compared to 0.71 for the same model with non-patch-based inputs for SRF fluid segmentation.
arXiv Detail & Related papers (2025-01-22T10:22:08Z) - Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens [53.99177152562075]
Scaling up autoregressive models in vision has not proven as beneficial as in large language models.
We focus on two critical factors: whether models use discrete or continuous tokens, and whether tokens are generated in a random or fixed order using BERT- or GPT-like transformer architectures.
Our results show that while all models scale effectively in terms of validation loss, their evaluation performance -- measured by FID, GenEval score, and visual quality -- follows different trends.
arXiv Detail & Related papers (2024-10-17T17:59:59Z) - GREAT Score: Global Robustness Evaluation of Adversarial Perturbation using Generative Models [60.48306899271866]
We present a new framework, called GREAT Score, for global robustness evaluation of adversarial perturbation using generative models.
We show high correlation and significantly reduced cost of GREAT Score when compared to the attack-based model ranking on RobustBench.
GREAT Score can be used for remote auditing of privacy-sensitive black-box models.
arXiv Detail & Related papers (2023-04-19T14:58:27Z) - Vision Transformers for femur fracture classification [59.99241204074268]
The Vision Transformer (ViT) was able to correctly predict 83% of the test images.
Good results were also obtained on sub-fracture classification, using the largest and richest dataset of its kind to date.
arXiv Detail & Related papers (2021-08-07T10:12:42Z) - A CNN Approach to Simultaneously Count Plants and Detect Plantation-Rows from UAV Imagery [56.10033255997329]
We propose a novel deep learning method based on a Convolutional Neural Network (CNN)
It simultaneously detects and geolocates plantation-rows while counting their plants, considering highly dense plantation configurations.
The proposed method achieved state-of-the-art performance for counting and geolocating plants and plant-rows in UAV images from different types of crops.
arXiv Detail & Related papers (2020-12-31T18:51:17Z) - Deep Learning Frameworks for Pavement Distress Classification: A
Comparative Analysis [2.752817022620644]
This study deploys state-of-the-art deep learning algorithms to detect and characterize pavement distresses.
The models were trained using 21,041 images captured across urban and rural streets of Japan, the Czech Republic, and India.
The best-performing model achieved F1 scores of 0.58 and 0.57 on the two test datasets released by the IEEE Global Road Damage Detection Challenge.
arXiv Detail & Related papers (2020-10-21T00:26:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.