AgroBench: Vision-Language Model Benchmark in Agriculture
- URL: http://arxiv.org/abs/2507.20519v1
- Date: Mon, 28 Jul 2025 04:58:29 GMT
- Title: AgroBench: Vision-Language Model Benchmark in Agriculture
- Authors: Risa Shinoda, Nakamasa Inoue, Hirokatsu Kataoka, Masaki Onishi, Yoshitaka Ushiku
- Abstract summary: We introduce AgroBench, a benchmark for evaluating vision-language models (VLMs) across seven agricultural topics. Our AgroBench covers a state-of-the-art range of categories, including 203 crop categories and 682 disease categories, to thoroughly evaluate VLM capabilities.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Precise automated understanding of agricultural tasks such as disease identification is essential for sustainable crop production. Recent advances in vision-language models (VLMs) are expected to further expand the range of agricultural tasks by facilitating human-model interaction through easy, text-based communication. Here, we introduce AgroBench (Agronomist AI Benchmark), a benchmark for evaluating VLMs across seven agricultural topics, covering key areas of agricultural engineering relevant to real-world farming. Unlike recent agricultural VLM benchmarks, AgroBench is annotated by expert agronomists. Our AgroBench covers a state-of-the-art range of categories, including 203 crop categories and 682 disease categories, to thoroughly evaluate VLM capabilities. In our evaluation on AgroBench, we reveal that VLMs have room for improvement in fine-grained identification tasks. Notably, in weed identification, most open-source VLMs perform close to random. With our wide range of topics and expert-annotated categories, we analyze the types of errors made by VLMs and suggest potential pathways for future VLM development. Our dataset and code are available at https://dahlian00.github.io/AgroBenchPage/.
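The abstract's claim that some VLMs "perform close to random" on weed identification comes down to comparing multiple-choice accuracy against the uniform-guessing baseline. A minimal, self-contained sketch of that comparison (the labels and predictions below are hypothetical, not from AgroBench):

```python
def accuracy(preds, golds):
    """Fraction of predictions that match the gold answers."""
    return sum(p == g for p, g in zip(preds, golds)) / len(golds)

def random_baseline(num_choices):
    """Expected accuracy of uniform random guessing on multiple-choice items."""
    return 1.0 / num_choices

# Hypothetical 4-way multiple-choice identification items; in practice the
# letter predictions would be parsed from a VLM's free-text output.
golds = ["A", "C", "B", "D", "A", "B"]
preds = ["A", "C", "D", "D", "B", "B"]

acc = accuracy(preds, golds)
chance = random_baseline(4)
print(f"accuracy={acc:.3f} vs. chance={chance:.3f}")
```

A model whose accuracy is statistically indistinguishable from `random_baseline(num_choices)` is what the abstract means by "close to random".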
Related papers
- AI in Agriculture: A Survey of Deep Learning Techniques for Crops, Fisheries and Livestock [77.95897723270453]
Crops, fisheries and livestock form the backbone of global food production, essential to feed the ever-growing global population. Addressing these issues requires efficient, accurate, and scalable technological solutions, highlighting the importance of artificial intelligence (AI). This survey presents a systematic and thorough review of more than 200 research works covering conventional machine learning approaches, advanced deep learning techniques, and recent vision-language foundation models.
arXiv Detail & Related papers (2025-07-29T17:59:48Z) - AgriEval: A Comprehensive Chinese Agricultural Benchmark for Large Language Models [19.265932725554833]
We propose AgriEval, the first comprehensive Chinese agricultural benchmark with three main characteristics. AgriEval covers six major agriculture categories and 29 subcategories within agriculture, addressing four core cognitive scenarios. AgriEval comprises 14,697 multiple-choice questions and 2,167 open-ended questions, establishing it as the most extensive agricultural benchmark available to date.
arXiv Detail & Related papers (2025-07-29T12:58:27Z) - Can Large Multimodal Models Understand Agricultural Scenes? Benchmarking with AgroMind [16.96145027280737]
We introduce AgroMind, a benchmark for agricultural remote sensing (RS). AgroMind covers four task dimensions: spatial perception, object understanding, scene understanding, and scene reasoning. We evaluate 18 open-source LMMs and 3 closed-source models on AgroMind.
arXiv Detail & Related papers (2025-05-18T02:45:19Z) - Agri-LLaVA: Knowledge-Infused Large Multimodal Assistant on Agricultural Pests and Diseases [49.782064512495495]
We construct the first multimodal instruction-following dataset in the agricultural domain. This dataset covers over 221 types of pests and diseases with approximately 400,000 data entries. We propose a knowledge-infused training method to develop Agri-LLaVA, an agricultural multimodal conversation system.
arXiv Detail & Related papers (2024-12-03T04:34:23Z) - AgroGPT: Efficient Agricultural Vision-Language Model with Expert Tuning [30.034193330398292]
We propose an approach to construct instruction-tuning data that harnesses vision-only data for the agriculture domain. We utilize diverse agricultural datasets spanning multiple domains, curate class-specific information, and employ large language models (LLMs) to construct an expert-tuning set. Through expert tuning, we created AgroGPT, an efficient LMM that can hold complex agriculture-related conversations and provide useful insights.
arXiv Detail & Related papers (2024-10-10T22:38:26Z) - Leveraging Vision Language Models for Specialized Agricultural Tasks [19.7240633020344]
We present AgEval, a benchmark for assessing Vision Language Models' capabilities in plant stress phenotyping. Our study explores how general-purpose VLMs can be leveraged for domain-specific tasks with only a few annotated examples. Our results demonstrate VLMs' rapid adaptability to specialized tasks, with the best-performing model showing an increase in F1 scores from 46.24% to 73.37% in 8-shot identification.
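The F1 figures reported above are class-averaged scores for multi-class identification. A minimal macro-F1 implementation, shown here only to make the metric concrete (the label lists are hypothetical, not AgEval data):

```python
def macro_f1(preds, golds):
    """Macro-averaged F1 over the classes present in the gold labels."""
    f1s = []
    for c in set(golds):
        # Per-class counts: true positives, false positives, false negatives.
        tp = sum(p == c and g == c for p, g in zip(preds, golds))
        fp = sum(p == c and g != c for p, g in zip(preds, golds))
        fn = sum(p != c and g == c for p, g in zip(preds, golds))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

# Hypothetical 3-item example with two stress classes.
print(macro_f1(["rust", "rust", "blight"], ["rust", "blight", "blight"]))
```

Macro averaging weights every class equally, so rare stress categories count as much as common ones; this is the usual choice for imbalanced fine-grained identification benchmarks.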
arXiv Detail & Related papers (2024-07-29T00:39:51Z) - Generating Diverse Agricultural Data for Vision-Based Farming Applications [74.79409721178489]
This model is capable of simulating distinct growth stages of plants, diverse soil conditions, and randomized field arrangements under varying lighting conditions.
Our dataset includes 12,000 images with semantic labels, offering a comprehensive resource for computer vision tasks in precision agriculture.
arXiv Detail & Related papers (2024-03-27T08:42:47Z) - AutoML in the Age of Large Language Models: Current Challenges, Future
Opportunities and Risks [62.05741061393927]
We envision that the two fields can radically push the boundaries of each other through tight integration.
By highlighting conceivable synergies, but also risks, we aim to foster further exploration at the intersection of AutoML and LLMs.
arXiv Detail & Related papers (2023-06-13T19:51:22Z) - PhenoBench -- A Large Dataset and Benchmarks for Semantic Image Interpretation in the Agricultural Domain [29.395926321984565]
We present an annotated dataset and benchmarks for the semantic interpretation of real agricultural fields.
Our dataset recorded with a UAV provides high-quality, pixel-wise annotations of crops and weeds, but also crop leaf instances at the same time.
We provide benchmarks for various tasks on a hidden test set comprised of different fields.
arXiv Detail & Related papers (2023-06-07T16:04:08Z) - Empowering Agrifood System with Artificial Intelligence: A Survey of the Progress, Challenges and Opportunities [86.89427012495457]
We review how AI techniques can transform agrifood systems and contribute to the modern agrifood industry.
We present a progress review of AI methods in agrifood systems, specifically in agriculture, animal husbandry, and fishery.
We highlight potential challenges and promising research opportunities for transforming modern agrifood systems with AI.
arXiv Detail & Related papers (2023-05-03T05:16:54Z) - Domain Generalization for Crop Segmentation with Standardized Ensemble Knowledge Distillation [42.39035033967183]
Service robots need a real-time perception system that understands their surroundings and identifies their targets in the wild.
Existing methods, however, often fall short in generalizing to new crops and environmental conditions.
We propose a novel approach to enhance domain generalization using knowledge distillation.
arXiv Detail & Related papers (2023-04-03T14:28:29Z) - Agriculture-Vision: A Large Aerial Image Database for Agricultural Pattern Analysis [110.30849704592592]
We present Agriculture-Vision: a large-scale aerial farmland image dataset for semantic segmentation of agricultural patterns.
Each image consists of RGB and Near-infrared (NIR) channels with resolution as high as 10 cm per pixel.
We annotate nine types of field anomaly patterns that are most important to farmers.
arXiv Detail & Related papers (2020-01-05T20:19:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.