Related papers: Adapting SAM with Dynamic Similarity Graphs for Few-Shot Parameter-Efficient Small Dense Object Detection: A Case Study of Chickpea Pods in Field Conditions

URL: http://arxiv.org/abs/2509.25805v1
Date: Tue, 30 Sep 2025 05:26:06 GMT
Title: Adapting SAM with Dynamic Similarity Graphs for Few-Shot Parameter-Efficient Small Dense Object Detection: A Case Study of Chickpea Pods in Field Conditions
Authors: Xintong Jiang, Yixue Liu, Mohamed Debbagh, Yu Tian, Valerio Hoyos-Villegas, Viacheslav Adamchuk, Shangpeng Sun
Abstract summary: This study introduces a Dynamic Similarity-based Graph Adaptation (DSGA) module to adapt the Segment Anything Model (SAM). DSGA establishes robust spatial and dynamic similarity representation with only 4.00M trainable parameters, which is 4.26% of the original SAM. The proposed adaptation demonstrated practical utility for automated agricultural monitoring applications, achieving accurate pod-counting with an adjusted R-squared of 0.8987.
Score: 7.500556611536649
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Parameter-Efficient Fine-Tuning (PEFT) of foundation models for agricultural computer vision tasks remains challenging due to limited training data and complex field conditions. This study introduces a Dynamic Similarity-based Graph Adaptation (DSGA) module to adapt the Segment Anything Model (SAM) under extreme data constraints for precise foreground and instance segmentation of small dense objects in complex agricultural environments. Through dynamic similarity graph construction with a learnable polynomial decay-initialized weight ranking mechanism and adaptive local feature aggregation, DSGA establishes robust spatial and dynamic similarity representation with only 4.00M trainable parameters, which is 4.26% of the original SAM. Integrating this graph-based feature adaptation with Low-Rank Adaptation (LoRA) creates a complementary optimization framework that effectively captures both local and global dependencies in image embeddings while preserving model stability and parameter efficiency. Experimental results on a challenging chickpea pod dataset demonstrated that DSGA with LoRA achieved superior performance across multiple metrics evaluated under 2, 4, 8 and 10 shots, with progressive performance gains as shot count increased. Quantitative metrics showed a 17.31% improvement in Structure-measure and a 62.36% gain in adaptive F-measure compared to the baseline SAM fine-tuning. Comprehensive ablation studies and visualization analyses through Grad-CAM and t-SNE validated the framework's effectiveness in feature discrimination. The proposed adaptation demonstrated practical utility for automated agricultural monitoring applications, achieving accurate pod-counting with an adjusted R-squared of 0.8987 for images with 10 to 120 pods under challenging field conditions.
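The pod-counting result above is reported as an adjusted R-squared. As a rough illustration of how such a score can be computed from predicted and manually counted pods, here is a minimal Python sketch. It is not the authors' evaluation code: the function name, the example counts, and the choice of a single predictor (num_predictors=1) are assumptions made for illustration, and R-squared is computed directly from the predictions rather than from a fitted regression line, which the paper may have used instead.

```python
def adjusted_r_squared(y_true, y_pred, num_predictors=1):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1).

    y_true: ground-truth counts, y_pred: predicted counts.
    num_predictors (p) is an assumption; 1 corresponds to a single
    explanatory variable (the predicted count).
    """
    n = len(y_true)
    if n != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if n - num_predictors - 1 <= 0:
        raise ValueError("need more samples than predictors + 1")

    mean_true = sum(y_true) / n
    # Residual sum of squares: error left after prediction.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: variance of the ground truth around its mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("y_true has zero variance; R^2 is undefined")

    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - num_predictors - 1)


# Hypothetical pod counts for four images (not data from the paper).
manual_counts = [10, 20, 30, 40]
predicted_counts = [12, 18, 33, 39]
score = adjusted_r_squared(manual_counts, predicted_counts)
```

The adjustment penalizes R-squared for small sample sizes relative to the number of predictors, so with few images the adjusted value sits noticeably below the plain R-squared.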

Related papers

Graph Laplacian Transformer with Progressive Sampling for Prostate Cancer Grading [2.9485900021889146]
We propose a Graph Laplacian Attention-Based Transformer (GLAT) integrated with an Iterative Refinement Module (IRM) to enhance both feature learning and spatial consistency. IRM iteratively refines patch selection by leveraging a pretrained ResNet50 for local feature extraction and a foundation model in no-gradient mode for importance scoring. The GLAT models tissue-level connectivity by constructing a graph where patches serve as nodes, ensuring spatial consistency through graph Laplacian constraints.
arXiv Detail & Related papers (2025-12-11T16:55:57Z)
Optimizing Distributional Geometry Alignment with Optimal Transport for Generative Dataset Distillation [109.13471554184554]
We reformulate dataset distillation as an Optimal Transport (OT) distance minimization problem. OT offers a geometrically faithful framework for distribution matching. Our method consistently outperforms state-of-the-art approaches in an efficient manner.
arXiv Detail & Related papers (2025-11-29T04:04:05Z)
Equivariant-Aware Structured Pruning for Efficient Edge Deployment: A Comprehensive Framework with Adaptive Fine-Tuning [0.0]
We present a framework combining group equivariant convolutional neural networks (G-CNNs) with equivariant-aware structured pruning. Our approach preserves equivariant properties by analyzing e2cnn layer structure and applying neuron-level pruning to fully connected components. We evaluate our method on satellite imagery (EuroSAT) and standard benchmarks (CIFAR-10, Rotated MNIST), demonstrating effectiveness across diverse domains.
arXiv Detail & Related papers (2025-11-21T13:41:47Z)
A Method for Identifying Farmland System Habitat Types Based on the Dynamic-Weighted Feature Fusion Network Model [0.0]
This study developed an annotated ultra-high-resolution remote sensing image dataset encompassing 15 categories of cultivated land system habitats. We propose a Dynamic-Weighted Feature Fusion Network (DWFF-Net) to extract foundational features. The proposed model achieves a mean Intersection over Union (mIoU) of 0.6979 and an F1-score of 0.8049, outperforming the baseline network by 0.021 and 0.0161, respectively.
arXiv Detail & Related papers (2025-11-11T02:44:38Z)
Dual Atrous Separable Convolution for Improving Agricultural Semantic Segmentation [2.3636539018632616]
This study proposes an efficient image segmentation method for precision agriculture. A novel Dual Atrous Separable Convolution (DAS Conv) module is integrated within the DeepLabV3-based segmentation framework. It achieves more than 66% improvement in efficiency when considering the trade-off between model complexity and performance.
arXiv Detail & Related papers (2025-06-27T18:37:43Z)
High-Fidelity Scientific Simulation Surrogates via Adaptive Implicit Neural Representations [51.90920900332569]
Implicit neural representations (INRs) offer a compact and continuous framework for modeling spatially structured data. Recent approaches address this by introducing additional features along rigid geometric structures. We propose a simple yet effective alternative: Feature-Adaptive INR (FA-INR).
arXiv Detail & Related papers (2025-06-07T16:45:17Z)
FedAWA: Adaptive Optimization of Aggregation Weights in Federated Learning Using Client Vectors [50.131271229165165]
Federated Learning (FL) has emerged as a promising framework for distributed machine learning. Data heterogeneity resulting from differences across user behaviors, preferences, and device characteristics poses a significant challenge for federated learning. We propose Adaptive Weight Aggregation (FedAWA), a novel method that adaptively adjusts aggregation weights based on client vectors during the learning process.
arXiv Detail & Related papers (2025-03-20T04:49:40Z)
HeteroTune: Efficient Federated Learning for Large Heterogeneous Models [35.53420882449293]
We propose HeteroTune, a novel federated fine-tuning paradigm for large, heterogeneous models operating under limited communication and budgets. The core of our method lies in a novel architecture, DeMA, which enables flexible and efficient aggregation of heterogeneous models. We provide both theoretical analysis and empirical evidence showing that HeteroTune achieves state-of-the-art performance and efficiency across diverse tasks and model architectures.
arXiv Detail & Related papers (2024-11-25T09:58:51Z)
Structuring a Training Strategy to Robustify Perception Models with Realistic Image Augmentations [1.5723316845301678]
This report introduces a novel methodology for training with augmentations to enhance model robustness and performance in such conditions. We present a comprehensive framework that includes identifying weak spots in Machine Learning models, selecting suitable augmentations, and devising effective training strategies. Experimental results demonstrate improvements in model performance, as measured by commonly used metrics such as mean Average Precision (mAP) and mean Intersection over Union (mIoU) on open-source object detection and semantic segmentation models and datasets.
arXiv Detail & Related papers (2024-08-30T14:15:48Z)
Multi-scale Contrastive Adaptor Learning for Segmenting Anything in Underperformed Scenes [12.36950265154199]
We introduce a novel Multi-scale Contrastive Adaptor learning method named MCA-SAM. MCA-SAM enhances adaptor performance through a meticulously designed contrastive learning framework at both token and sample levels. Empirical results demonstrate that MCA-SAM sets new benchmarks, outperforming existing methods in three challenging domains.
arXiv Detail & Related papers (2024-08-12T06:23:10Z)
GroupMamba: Efficient Group-Based Visual State Space Model [66.35608254724566]
State-space models (SSMs) have recently shown promise in capturing long-range dependencies with subquadratic computational complexity. However, purely SSM-based models face critical challenges related to stability and achieving state-of-the-art performance in computer vision tasks. Our paper addresses the challenges of scaling SSM-based models for computer vision, particularly the instability and inefficiency of large model sizes.
arXiv Detail & Related papers (2024-07-18T17:59:58Z)
CAFE: Learning to Condense Dataset by Aligning Features [72.99394941348757]
We propose a novel scheme to Condense dataset by Aligning FEatures (CAFE). At the heart of our approach is an effective strategy to align features from the real and synthetic data across various scales. We validate the proposed CAFE across various datasets, and demonstrate that it generally outperforms the state of the art.
arXiv Detail & Related papers (2022-03-03T05:58:49Z)
Self-Guided Adaptation: Progressive Representation Alignment for Domain Adaptive Object Detection [86.69077525494106]
Unsupervised domain adaptation (UDA) has achieved unprecedented success in improving the cross-domain robustness of object detection models. Existing UDA methods largely ignore the instantaneous data distribution during model learning, which could deteriorate the feature representation given large domain shift. We propose a Self-Guided Adaptation (SGA) model, which targets aligning feature representation and transferring object detection models across domains.
arXiv Detail & Related papers (2020-03-19T13:30:45Z)

This list is automatically generated from the titles and abstracts of the papers in this site.