Related papers: Few-Shot Adaptation of Grounding DINO for Agricultural Domain

URL: http://arxiv.org/abs/2504.07252v1
Date: Wed, 09 Apr 2025 19:57:25 GMT
Title: Few-Shot Adaptation of Grounding DINO for Agricultural Domain
Authors: Rajhans Singh, Rafael Bidese Puhl, Kshitiz Dhakal, Sudhir Sornapudi,
Abstract summary: Open-set object detection models like Grounding-DINO offer a potential solution to detect regions of interest based on text prompt input. We propose an efficient few-shot adaptation method that simplifies the Grounding-DINO architecture by removing the text encoder module. This method achieves superior performance across multiple agricultural datasets, including plant-weed detection, plant counting, insect identification, fruit counting, and remote sensing tasks.
Score: 0.29998889086656577
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Deep learning models are transforming agricultural applications by enabling automated phenotyping, monitoring, and yield estimation. However, their effectiveness heavily depends on large amounts of annotated training data, which can be labor- and time-intensive. Recent advances in open-set object detection, particularly with models like Grounding-DINO, offer a potential solution to detect regions of interest based on text prompt input. Initial zero-shot experiments revealed challenges in crafting effective text prompts, especially for complex objects like individual leaves and visually similar classes. To address these limitations, we propose an efficient few-shot adaptation method that simplifies the Grounding-DINO architecture by removing the text encoder module (BERT) and introducing a randomly initialized trainable text embedding. This method achieves superior performance across multiple agricultural datasets, including plant-weed detection, plant counting, insect identification, fruit counting, and remote sensing tasks. Specifically, it demonstrates up to a $\sim24\%$ higher mAP than fully fine-tuned YOLO models on agricultural datasets and outperforms previous state-of-the-art methods by $\sim10\%$ in remote sensing, under few-shot learning conditions. Our method offers a promising solution for automating annotation and accelerating the development of specialized agricultural AI solutions.

Related papers

ReGUIDE: Data Efficient GUI Grounding via Spatial Reasoning and Search [53.40810298627443]
ReGUIDE is a framework for web grounding that enables MLLMs to learn data efficiently through self-generated reasoning and spatial-aware criticism. Our experiments demonstrate that ReGUIDE significantly advances web grounding performance across multiple benchmarks.
arXiv Detail & Related papers (2025-05-21T08:36:18Z)
WeedsGalore: A Multispectral and Multitemporal UAV-based Dataset for Crop and Weed Segmentation in Agricultural Maize Fields [0.7421845364041001]
Weeds are one of the major reasons for crop yield loss but current weeding practices fail to manage weeds in an efficient and targeted manner. We present a novel dataset for semantic and instance segmentation of crops and weeds in agricultural maize fields.
arXiv Detail & Related papers (2025-02-18T18:13:19Z)
Edge-AI for Agriculture: Lightweight Vision Models for Disease Detection in Resource-Limited Settings [0.0]
The proposed system integrates advanced object detection, classification, and segmentation models, optimized for deployment on edge devices. The study evaluates the performance of various state-of-the-art models, focusing on their accuracy, computational efficiency, and generalization capabilities.
arXiv Detail & Related papers (2024-12-23T06:48:50Z)
Explainable AI in Grassland Monitoring: Enhancing Model Performance and Domain Adaptability [0.6131022957085438]
Grasslands are known for their high biodiversity and ability to provide multiple ecosystem services. Challenges in automating the identification of indicator plants are key obstacles to large-scale grassland monitoring. This paper delves into the latter two challenges, with a specific focus on transfer learning and XAI approaches to grassland monitoring.
arXiv Detail & Related papers (2023-12-13T10:17:48Z)
Embedding-based Retrieval with LLM for Effective Agriculture Information Extracting from Unstructured Data [5.573704309892796]
We explore using a domain-agnostic, general pre-trained large language model (LLM) to extract structured data from agricultural documents with minimal or no human intervention. In comparison to existing methods, our approach achieves consistently better accuracy in the benchmark while maintaining efficiency.
arXiv Detail & Related papers (2023-08-06T13:18:38Z)
Domain Generalization for Crop Segmentation with Standardized Ensemble Knowledge Distillation [42.39035033967183]
Service robots need a real-time perception system that understands their surroundings and identifies their targets in the wild. Existing methods, however, often fall short in generalizing to new crops and environmental conditions. We propose a novel approach to enhance domain generalization using knowledge distillation.
arXiv Detail & Related papers (2023-04-03T14:28:29Z)
A Novel Dataset for Evaluating and Alleviating Domain Shift for Human Detection in Agricultural Fields [59.035813796601055]
We evaluate the impact of domain shift on human detection models trained on well known object detection datasets when deployed on data outside the distribution of the training set. We introduce the OpenDR Humans in Field dataset, collected in the context of agricultural robotics applications, using the Robotti platform.
arXiv Detail & Related papers (2022-09-27T07:04:28Z)
Generative models-based data labeling for deep networks regression: application to seed maturity estimation from UAV multispectral images [3.6868861317674524]
Monitoring seed maturity is an increasing challenge in agriculture due to climate change and more restrictive practices. Traditional methods are based on limited sampling in the field and analysis in the laboratory. We propose a method for estimating parsley seed maturity using multispectral UAV imagery, with a new approach for automatic data labeling.
arXiv Detail & Related papers (2022-08-09T09:06:51Z)
Learning Feature Decomposition for Domain Adaptive Monocular Depth Estimation [51.15061013818216]
Supervised approaches have led to great success with the advance of deep learning, but they rely on large quantities of ground-truth depth annotations. Unsupervised domain adaptation (UDA) transfers knowledge from labeled source data to unlabeled target data, so as to relax the constraint of supervised learning. We propose a novel UDA method for MDE, referred to as Learning Feature Decomposition for Adaptation (LFDA), which learns to decompose the feature space into content and style components.
arXiv Detail & Related papers (2022-07-30T08:05:35Z)
Unsupervised Domain Adaptive Learning via Synthetic Data for Person Re-identification [101.1886788396803]
Person re-identification (re-ID) has gained more and more attention due to its widespread applications in video surveillance. Unfortunately, the mainstream deep learning methods still need a large quantity of labeled data to train models. In this paper, we develop a data collector to automatically generate synthetic re-ID samples in a computer game, and construct a data labeler to simultaneously annotate them.
arXiv Detail & Related papers (2021-09-12T15:51:41Z)
Few-Cost Salient Object Detection with Adversarial-Paced Learning [95.0220555274653]
This paper proposes to learn an effective salient object detection model from manual annotations on only a few training images. We name this task few-cost salient object detection and propose an adversarial-paced learning (APL)-based framework to facilitate the few-cost learning scenario.
arXiv Detail & Related papers (2021-04-05T14:15:49Z)
Learning to Augment for Data-Scarce Domain BERT Knowledge Distillation [55.34995029082051]
We propose a method to learn to augment for data-scarce domain BERT knowledge distillation. We show that the proposed method significantly outperforms state-of-the-art baselines on four different tasks.
arXiv Detail & Related papers (2021-01-20T13:07:39Z)
DAGA: Data Augmentation with a Generation Approach for Low-resource Tagging Tasks [88.62288327934499]
We propose a novel augmentation method with language models trained on the linearized labeled sentences. Our method is applicable to both supervised and semi-supervised settings.
arXiv Detail & Related papers (2020-11-03T07:49:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.