Vision Foundation Models in Agriculture: Toward Domain-Specific Adaptation for Weed Herbicide Trials Assessment
- URL: http://arxiv.org/abs/2511.04288v1
- Date: Thu, 06 Nov 2025 11:30:32 GMT
- Title: Vision Foundation Models in Agriculture: Toward Domain-Specific Adaptation for Weed Herbicide Trials Assessment
- Authors: Leire Benito-Del-Valle, Artzai Picón, Daniel Mugica, Manuel Ramos, Eva Portillo, Javier Romero, Carlos Javier Jimenez, Ramón Navarra-Mestre
- Abstract summary: Herbicide field trials require accurate identification of plant species and assessment of herbicide-induced damage. In this work, we adapt a general-purpose vision foundation model to herbicide trial characterization. Trained using a self-supervised learning approach on a large, curated agricultural dataset, the model learns rich and transferable representations optimized for herbicide trial images.
- Score: 1.8430060563461854
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Herbicide field trials require accurate identification of plant species and assessment of herbicide-induced damage across diverse environments. While general-purpose vision foundation models have shown promising results in complex visual domains, their performance can be limited in agriculture, where fine-grained distinctions between species and damage types are critical. In this work, we adapt a general-purpose vision foundation model to herbicide trial characterization. Trained using a self-supervised learning approach on a large, curated agricultural dataset, the model learns rich and transferable representations optimized for herbicide trial images. Our domain-specific model significantly outperforms the best general-purpose foundation model in both species identification (F1 score improvement from 0.91 to 0.94) and damage classification (from 0.26 to 0.33). Under unseen conditions (new locations and different time periods), it achieves even greater gains (species identification from 0.56 to 0.66; damage classification from 0.17 to 0.27). In domain-shift scenarios, such as drone imagery, it maintains strong performance (species classification from 0.49 to 0.60). Additionally, we show that domain-specific pretraining enhances segmentation accuracy, particularly in low-annotation regimes. An annotation-efficiency analysis reveals that, under unseen conditions, the domain-specific model achieves a 5.4% higher F1 score than the general-purpose model while using 80% fewer labeled samples. These results demonstrate the generalization capabilities of domain-specific foundation models and their potential to significantly reduce manual annotation efforts, offering a scalable and automated solution for herbicide trial analysis.
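The F1 improvements reported in the abstract are class-averaged classification scores. As a minimal, self-contained sketch of how such a macro-averaged F1 is computed (the toy species labels below are illustrative, not from the paper's dataset):

```python
from collections import defaultdict

def macro_f1(y_true, y_pred):
    """Macro-averaged F1: compute per-class precision/recall/F1,
    then average the F1 scores with equal weight per class."""
    classes = set(y_true) | set(y_pred)
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted p, but it was wrong
            fn[t] += 1  # true class t was missed
    f1s = []
    for c in classes:
        prec = tp[c] / (tp[c] + fp[c]) if (tp[c] + fp[c]) else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if (tp[c] + fn[c]) else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if (prec + rec) else 0.0)
    return sum(f1s) / len(f1s)

# hypothetical weed-species predictions
truth = ["amaranthus", "amaranthus", "chenopodium", "setaria"]
pred  = ["amaranthus", "chenopodium", "chenopodium", "setaria"]
print(round(macro_f1(truth, pred), 3))
```

Macro averaging weights rare and common species equally, which matters in long-tailed field-trial data.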
Related papers
- Handling imbalance and few-sample size in ML based Onion disease classification [1.3177681589844814]
We propose a robust deep learning based model for multi-class classification of onion crop diseases and pests. Our model achieves 96.90% overall accuracy and a 0.96 F1 score on a real-world field image dataset.
arXiv Detail & Related papers (2025-09-01T19:05:39Z) - Weed Detection in Challenging Field Conditions: A Semi-Supervised Framework for Overcoming Shadow Bias and Data Scarcity [7.019137213828947]
This study tackles both issues through a diagnostic-driven, semi-supervised framework. We use a unique dataset of approximately 975 labeled and 10,000 unlabeled images of Guinea Grass in sugarcane. Our work provides a clear and field-tested framework for developing, diagnosing, and improving robust computer vision systems.
arXiv Detail & Related papers (2025-08-27T01:55:47Z) - From Field to Drone: Domain Drift Tolerant Automated Multi-Species and Damage Plant Semantic Segmentation for Herbicide Trials [1.0483690290582848]
We present a general-purpose self-supervised visual model with hierarchical inference based on botanical taxonomy. The model significantly improved species identification (F1-score: 0.52 to 0.85, R-squared: 0.75 to 0.98) and damage classification (F1-score: 0.28 to 0.44, R-squared: 0.71 to 0.87) over prior models. It is now deployed in BASF's phenotyping pipeline, enabling large-scale, automated crop and weed monitoring across diverse geographies.
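The hierarchical inference this entry describes constrains the species prediction to the predicted higher taxon. A minimal sketch of that idea, with a hypothetical two-level taxonomy and made-up probabilities (the real model's taxonomy and heads are not shown in the abstract):

```python
# Hypothetical family -> species taxonomy; names are illustrative only.
TAXONOMY = {
    "Poaceae": ["setaria_viridis", "echinochloa_crus-galli"],
    "Amaranthaceae": ["amaranthus_retroflexus", "chenopodium_album"],
}

def hierarchical_predict(family_probs, species_probs):
    """Pick the most probable family first, then the most probable
    species *within* that family, so predictions always respect the
    botanical taxonomy."""
    family = max(family_probs, key=family_probs.get)
    species = max(TAXONOMY[family], key=lambda s: species_probs.get(s, 0.0))
    return family, species

fam_p = {"Poaceae": 0.7, "Amaranthaceae": 0.3}
sp_p = {"setaria_viridis": 0.4, "amaranthus_retroflexus": 0.5,
        "echinochloa_crus-galli": 0.1}
print(hierarchical_predict(fam_p, sp_p))
```

Note that the higher-scoring amaranth species is excluded because its family lost the first stage; this is the consistency guarantee hierarchical inference buys.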
arXiv Detail & Related papers (2025-08-11T00:08:42Z) - Intrinsic Explainability of Multimodal Learning for Crop Yield Prediction [36.766406330345525]
We leverage the intrinsic explainability of Transformer-based models to explain multimodal learning networks. This study focuses on the task of crop yield prediction at the subfield level.
arXiv Detail & Related papers (2025-08-09T11:09:10Z) - PlantSAM: An Object Detection-Driven Segmentation Pipeline for Herbarium Specimens [0.5339846068056558]
We introduce PlantSAM, an automated segmentation pipeline that integrates YOLOv10 for plant region detection and the Segment Anything Model (SAM2) for segmentation. YOLOv10 generates bounding box prompts to guide SAM2, enhancing segmentation accuracy. PlantSAM achieved state-of-the-art segmentation performance, with an IoU of 0.94 and a Dice coefficient of 0.97.
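IoU and Dice, the two metrics PlantSAM reports, are closely related overlap measures on binary masks (Dice = 2·IoU / (1 + IoU)). A minimal sketch on flattened toy masks, not the paper's evaluation code:

```python
def iou_and_dice(pred, truth):
    """IoU (intersection over union) and Dice coefficient on binary
    masks given as flat 0/1 lists."""
    inter = sum(p & t for p, t in zip(pred, truth))
    p_sum, t_sum = sum(pred), sum(truth)
    union = p_sum + t_sum - inter
    iou = inter / union if union else 1.0
    dice = 2 * inter / (p_sum + t_sum) if (p_sum + t_sum) else 1.0
    return iou, dice

# toy 6-pixel masks (hypothetical)
pred  = [1, 1, 1, 0, 0, 1]
truth = [1, 1, 0, 0, 1, 1]
print(iou_and_dice(pred, truth))  # (0.6, 0.75)
```

Since Dice is always at least as large as IoU, the paper's 0.94 IoU / 0.97 Dice pair is internally consistent with this relationship.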
arXiv Detail & Related papers (2025-07-22T12:02:39Z) - BeetleVerse: A Study on Taxonomic Classification of Ground Beetles [0.310688583550805]
Ground beetles are a highly sensitive and speciose biological indicator, making them vital for monitoring biodiversity. In this paper, we evaluate 12 vision models on taxonomic classification across four diverse, long-tailed datasets. Our results show that the Vision and Language Transformer combined with a classification head is the best performing model, with 97% accuracy at genus and species level.
arXiv Detail & Related papers (2025-04-18T01:06:37Z) - Phikon-v2, A large and public feature extractor for biomarker prediction [42.52549987351643]
We train a vision transformer using DINOv2 and publicly release one iteration of this model for further experimentation, coined Phikon-v2.
While trained on publicly available histology slides, Phikon-v2 surpasses our previously released model (Phikon) and performs on par with other histopathology foundation models (FM) trained on proprietary data.
arXiv Detail & Related papers (2024-09-13T20:12:29Z) - BonnBeetClouds3D: A Dataset Towards Point Cloud-based Organ-level Phenotyping of Sugar Beet Plants under Field Conditions [28.79416825695514]
Agricultural production is facing severe challenges in the coming decades, induced by climate change and the need for sustainability. Advancements in field management through non-chemical weeding by robots, combined with monitoring of crops by autonomous unmanned aerial vehicles (UAVs), help address these challenges. The analysis of plant traits, called phenotyping, is an essential activity in plant breeding; however, it involves a great amount of manual labor.
arXiv Detail & Related papers (2023-12-22T14:06:44Z) - On the Out of Distribution Robustness of Foundation Models in Medical Image Segmentation [47.95611203419802]
Foundation models for vision and language, pre-trained on extensive sets of natural image and text data, have emerged as a promising approach.
We compare the generalization performance to unseen domains of various pre-trained models after being fine-tuned on the same in-distribution dataset.
We further developed a new Bayesian uncertainty estimation for frozen models and used it as an indicator to characterize the model's performance on out-of-distribution data.
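A common scalar uncertainty indicator of the kind this entry describes is the entropy of the mean predictive distribution, averaged over several stochastic forward passes or ensemble members. A minimal sketch with hypothetical softmax outputs (not the paper's specific Bayesian estimator):

```python
import math

def predictive_entropy(prob_samples):
    """Average the class-probability vectors from several stochastic
    passes, then take the Shannon entropy of the mean distribution.
    Higher entropy suggests the input is out-of-distribution."""
    n = len(prob_samples)
    k = len(prob_samples[0])
    mean = [sum(s[i] for s in prob_samples) / n for i in range(k)]
    return -sum(p * math.log(p) for p in mean if p > 0)

# a confident (in-distribution-looking) case vs. a diffuse (OOD-looking) one
in_dist = [[0.97, 0.02, 0.01], [0.95, 0.03, 0.02]]
ood     = [[0.40, 0.35, 0.25], [0.20, 0.45, 0.35]]
print(predictive_entropy(in_dist) < predictive_entropy(ood))  # True
```

Thresholding such a score is one simple way to flag inputs where a frozen segmentation model should not be trusted.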
arXiv Detail & Related papers (2023-11-18T14:52:10Z) - Species196: A One-Million Semi-supervised Dataset for Fine-grained Species Recognition [30.327642724046903]
Species196 is a large-scale semi-supervised dataset of 196-category invasive species.
It collects over 19K images with expert-level annotations (Species196-L), and 1.2M unlabeled images of invasive species (Species196-U).
arXiv Detail & Related papers (2023-09-25T14:46:01Z) - End-to-end deep learning for directly estimating grape yield from ground-based imagery [53.086864957064876]
This study demonstrates the application of proximal imaging combined with deep learning for yield estimation in vineyards.
Three model architectures were tested: object detection, CNN regression, and transformer models.
The study showed the applicability of proximal imaging and deep learning for prediction of grapevine yield on a large scale.
arXiv Detail & Related papers (2022-08-04T01:34:46Z) - Two-View Fine-grained Classification of Plant Species [66.75915278733197]
We propose a novel method based on a two-view leaf image representation and a hierarchical classification strategy for fine-grained recognition of plant species.
A deep metric based on Siamese convolutional neural networks is used to reduce the dependence on a large number of training samples and make the method scalable to new plant species.
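The scalability claim rests on metric learning: once a Siamese network maps leaves into an embedding space, a new species needs only a stored prototype (e.g. the mean embedding of a few reference images), not retraining. A minimal nearest-prototype sketch with made-up 2-D embeddings (real embeddings would be much higher-dimensional):

```python
def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def nearest_prototype(embedding, prototypes):
    """Classify by returning the species whose stored prototype
    embedding is closest to the query embedding."""
    return min(prototypes, key=lambda name: euclidean(embedding, prototypes[name]))

# Hypothetical prototypes; adding a species is just adding a dict entry.
prototypes = {"quercus_robur": [0.9, 0.1], "acer_campestre": [0.1, 0.8]}
query = [0.8, 0.2]
print(nearest_prototype(query, prototypes))  # quercus_robur
```

This is why the method reduces dependence on large per-species training sets: the metric is learned once, and class support comes from a handful of references.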
arXiv Detail & Related papers (2020-05-18T21:57:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.