DAS-SK: An Adaptive Model Integrating Dual Atrous Separable and Selective Kernel CNN for Agriculture Semantic Segmentation
- URL: http://arxiv.org/abs/2602.08168v1
- Date: Mon, 09 Feb 2026 00:14:39 GMT
- Title: DAS-SK: An Adaptive Model Integrating Dual Atrous Separable and Selective Kernel CNN for Agriculture Semantic Segmentation
- Authors: Mei Ling Chee, Thangarajah Akilan, Aparna Ravindra Phalke, Kanchan Keisham,
- Abstract summary: DAS-SK is a novel lightweight architecture that retrofits selective kernel convolution (SK-Conv) into the dual atrous separable convolution (DAS-Conv) module.<n>DAS-SK consistently achieves state-of-the-art performance, while being more efficient than CNN-, transformer-, and hybrid-based competitors.
- Score: 0.9204659134755793
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Semantic segmentation in high-resolution agricultural imagery demands models that strike a careful balance between accuracy and computational efficiency to enable deployment in practical systems. In this work, we propose DAS-SK, a novel lightweight architecture that retrofits selective kernel convolution (SK-Conv) into the dual atrous separable convolution (DAS-Conv) module to strengthen multi-scale feature learning. The model further enhances the atrous spatial pyramid pooling (ASPP) module, enabling the capture of fine-grained local structures alongside global contextual information. Built upon a modified DeepLabV3 framework with two complementary backbones - MobileNetV3-Large and EfficientNet-B3, the DAS-SK model mitigates limitations associated with large dataset requirements, limited spectral generalization, and the high computational cost that typically restricts deployment on UAVs and other edge devices. Comprehensive experiments across three benchmarks: LandCover.ai, VDD, and PhenoBench, demonstrate that DAS-SK consistently achieves state-of-the-art performance, while being more efficient than CNN-, transformer-, and hybrid-based competitors. Notably, DAS-SK requires up to 21x fewer parameters and 19x fewer GFLOPs than top-performing transformer models. These findings establish DAS-SK as a robust, efficient, and scalable solution for real-time agricultural robotics and high-resolution remote sensing, with strong potential for broader deployment in other vision domains.
Related papers
- Transolver-3: Scaling Up Transformer Solvers to Industrial-Scale Geometries [51.028432812178266]
Transolver-3 is a new member of the Transolver family designed for high-fidelity physics simulations.<n>We show that Transolver-3 is capable of handling meshes with over 160 million cells, achieving impressive performance across three challenging simulation benchmarks.
arXiv Detail & Related papers (2026-02-04T16:52:44Z) - SC-Net: Robust Correspondence Learning via Spatial and Cross-Channel Context [19.20797236825297]
Recent research has focused on using convolutional neural networks (CNNs) as the backbones in correspondence learning.<n>We propose a novel network named SC-Net, which effectively integrates bilateral context from both spatial and channel perspectives.<n>In experiments, SC-Net outperforms state-of-the-art methods in relative pose estimation and outlier removal tasks.
arXiv Detail & Related papers (2025-12-29T13:56:10Z) - Adaptive Mesh-Quantization for Neural PDE Solvers [51.26961483962011]
Graph Neural Networks can handle the irregular meshes required for complex geometries and boundary conditions, but still apply uniform computational effort across all nodes.<n>We propose Adaptive Mesh Quantization: spatially adaptive quantization across mesh node, edge, and cluster features, dynamically adjusting the bit-width used by a quantized model.<n>We demonstrate our framework's effectiveness by integrating it with two state-of-the-art models, MP-PDE and GraphViT, to evaluate performance across multiple tasks.
arXiv Detail & Related papers (2025-11-23T14:47:24Z) - BasicAVSR: Arbitrary-Scale Video Super-Resolution via Image Priors and Enhanced Motion Compensation [70.27358326228399]
We propose a BasicAVSR for Arbitrary-scale video super-resolution (AVSR)<n>AVSR aims to enhance the resolution of video frames, potentially various scaling factors.<n>We show that BasicAVSR significantly outperforms existing methods in terms of super-resolution quality, generalization ability, and inference speed.
arXiv Detail & Related papers (2025-10-30T05:08:45Z) - MOBIUS: Big-to-Mobile Universal Instance Segmentation via Multi-modal Bottleneck Fusion and Calibrated Decoder Pruning [91.90342432541138]
Scaling up model size and training data has advanced foundation models for instance-level perception.<n>High computational cost limits adoption on resource-constrained platforms.<n>We introduce a new benchmark for efficient segmentation on both high-performance computing platforms and mobile devices.
arXiv Detail & Related papers (2025-10-16T18:00:00Z) - DFYP: A Dynamic Fusion Framework with Spectral Channel Attention and Adaptive Operator learning for Crop Yield Prediction [18.24061967822792]
DFYP is a novel Dynamic Fusion framework for crop Yield Prediction.<n>It combines spectral channel attention, edge-adaptive spatial modeling and a learnable fusion mechanism.<n> DFYP consistently outperforms current state-of-the-art baselines in RMSE, MAE, and R2.
arXiv Detail & Related papers (2025-07-08T10:24:04Z) - Dual Atrous Separable Convolution for Improving Agricultural Semantic Segmentation [2.3636539018632616]
This study proposes an efficient image segmentation method for precision agriculture.<n>A novel Dual Atrous Separable Convolution (DAS Conv) module is integrated within the DeepLabV3-based segmentation framework.<n>It achieves more than 66% improvement in efficiency when considering the trade-off between model complexity and performance.
arXiv Detail & Related papers (2025-06-27T18:37:43Z) - COSMO: Combination of Selective Memorization for Low-cost Vision-and-Language Navigation [21.35438761539206]
Vision-and-Language Navigation (VLN) tasks have gained prominence within artificial intelligence research due to their potential application in fields like home assistants.<n>We propose a novel architecture with the COmbination of Selective MemOrization (COSMO) to achieve both high performance and low computational costs.
arXiv Detail & Related papers (2025-03-31T13:24:10Z) - ContextFormer: Redefining Efficiency in Semantic Segmentation [48.81126061219231]
Convolutional methods, although capturing local dependencies well, struggle with long-range relationships.<n>Vision Transformers (ViTs) excel in global context capture but are hindered by high computational demands.<n>We propose ContextFormer, a hybrid framework leveraging the strengths of CNNs and ViTs in the bottleneck to balance efficiency, accuracy, and robustness for real-time semantic segmentation.
arXiv Detail & Related papers (2025-01-31T16:11:04Z) - HKNAS: Classification of Hyperspectral Imagery Based on Hyper Kernel
Neural Architecture Search [104.45426861115972]
We propose to directly generate structural parameters by utilizing the specifically designed hyper kernels.
We obtain three kinds of networks to separately conduct pixel-level or image-level classifications with 1-D or 3-D convolutions.
A series of experiments on six public datasets demonstrate that the proposed methods achieve state-of-the-art results.
arXiv Detail & Related papers (2023-04-23T17:27:40Z) - AA-RMVSNet: Adaptive Aggregation Recurrent Multi-view Stereo Network [8.127449025802436]
We present a novel recurrent multi-view stereo network based on long short-term memory (LSTM) with adaptive aggregation, namely AA-RMVSNet.
We firstly introduce an intra-view aggregation module to adaptively extract image features by using context-aware convolution and multi-scale aggregation.
We propose an inter-view cost volume aggregation module for adaptive pixel-wise view aggregation, which is able to preserve better-matched pairs among all views.
arXiv Detail & Related papers (2021-08-09T06:10:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.