Improving Apple Object Detection with Occlusion-Enhanced Distillation
- URL: http://arxiv.org/abs/2409.01573v2
- Date: Wed, 30 Oct 2024 02:36:18 GMT
- Title: Improving Apple Object Detection with Occlusion-Enhanced Distillation
- Authors: Liang Geng,
- Abstract summary: Apples growing in natural environments often face severe visual obstructions from leaves and branches.
We introduce a technique called "Occlusion-Enhanced Distillation" (OED) to regularize the learning of semantically aligned features on occluded datasets.
Our method significantly outperforms current state-of-the-art techniques through extensive comparative experiments.
- Score: 1.0049237739132246
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Apples growing in natural environments often face severe visual obstructions from leaves and branches. This significantly increases the risk of false detections in object detection tasks, thereby escalating the challenge. Addressing this issue, we introduce a technique called "Occlusion-Enhanced Distillation" (OED). This approach utilizes occlusion information to regularize the learning of semantically aligned features on occluded datasets and employs Exponential Moving Average (EMA) to enhance training stability. Specifically, we first design an occlusion-enhanced dataset that integrates Grounding DINO and SAM methods to extract occluding elements such as leaves and branches from each sample, creating occlusion examples that reflect the natural growth state of fruits. Additionally, we propose a multi-scale knowledge distillation strategy, where the student network uses images with increased occlusions as inputs, while the teacher network employs images without natural occlusions. Through this setup, the strategy guides the student network to learn from the teacher across scales of semantic and local features alignment, effectively narrowing the feature distance between occluded and non-occluded targets and enhancing the robustness of object detection. Lastly, to improve the stability of the student network, we introduce the EMA strategy, which aids the student network in learning more generalized feature expressions that are less affected by the noise of individual image occlusions. Our method significantly outperforms current state-of-the-art techniques through extensive comparative experiments.
Related papers
- MRIFE: A Mask-Recovering and Interactive-Feature-Enhancing Semantic Segmentation Network For Relic Landslide Detection [7.6822321138894765]
Relic landslide, formed over a long period, possess the potential for reactivation, making them a hazardous geological phenomenon.
High-resolution remote sensing images for relic landslides face many challenges, including the object visual blur problem.
A semantic segmentation model, termed mask-recovering and interactive-feature-enhancing (MRIFE), is proposed for more efficient feature extraction and separation.
The proposed MRIFE is evaluated on a real relic landslide dataset, and experimental results show that it greatly improves the performance of relic landslide detection.
arXiv Detail & Related papers (2024-11-26T07:15:50Z) - Point Cloud Understanding via Attention-Driven Contrastive Learning [64.65145700121442]
Transformer-based models have advanced point cloud understanding by leveraging self-attention mechanisms.
PointACL is an attention-driven contrastive learning framework designed to address these limitations.
Our method employs an attention-driven dynamic masking strategy that guides the model to focus on under-attended regions.
arXiv Detail & Related papers (2024-11-22T05:41:00Z) - SurANet: Surrounding-Aware Network for Concealed Object Detection via Highly-Efficient Interactive Contrastive Learning Strategy [55.570183323356964]
We propose a novel Surrounding-Aware Network, namely SurANet, for concealed object detection.
We enhance the semantics of feature maps using differential fusion of surrounding features to highlight concealed objects.
Next, a Surrounding-Aware Contrastive Loss is applied to identify the concealed object via learning surrounding feature maps contrastively.
arXiv Detail & Related papers (2024-10-09T13:02:50Z) - Deep Generative Adversarial Network for Occlusion Removal from a Single Image [3.5639148953570845]
We propose a fully automatic, two-stage convolutional neural network for fence segmentation and occlusion completion.
We leverage generative adversarial networks (GANs) to synthesize realistic content, including both structure and texture, in a single shot for inpainting.
arXiv Detail & Related papers (2024-09-20T06:00:45Z) - Semantic Deep Hiding for Robust Unlearnable Examples [33.68037533119807]
Unlearnable examples are proposed to mislead the deep learning models and prevent data from unauthorized exploration.
We propose a Deep Hiding scheme that adaptively hides semantic images enriched with high-level features.
Our proposed method exhibits outstanding robustness for unlearnable examples, demonstrating its efficacy in preventing unauthorized data exploitation.
arXiv Detail & Related papers (2024-06-25T08:05:42Z) - Multilevel Saliency-Guided Self-Supervised Learning for Image Anomaly
Detection [15.212031255539022]
Anomaly detection (AD) is a fundamental task in computer vision.
We propose CutSwap, which leverages saliency guidance to incorporate semantic cues for augmentation.
CutSwap achieves state-of-the-art AD performance on two mainstream AD benchmark datasets.
arXiv Detail & Related papers (2023-11-30T08:03:53Z) - Free-ATM: Exploring Unsupervised Learning on Diffusion-Generated Images
with Free Attention Masks [64.67735676127208]
Text-to-image diffusion models have shown great potential for benefiting image recognition.
Although promising, there has been inadequate exploration dedicated to unsupervised learning on diffusion-generated images.
We introduce customized solutions by fully exploiting the aforementioned free attention masks.
arXiv Detail & Related papers (2023-08-13T10:07:46Z) - PAIF: Perception-Aware Infrared-Visible Image Fusion for Attack-Tolerant
Semantic Segmentation [50.556961575275345]
We propose a perception-aware fusion framework to promote segmentation robustness in adversarial scenes.
We show that our scheme substantially enhances the robustness, with gains of 15.3% mIOU, compared with advanced competitors.
arXiv Detail & Related papers (2023-08-08T01:55:44Z) - Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner.
We design a semantic-guided self-supervised learning model to extract high-level semantic features from images.
We introduce Principal Component Analysis (PCA) to localize object regions.
arXiv Detail & Related papers (2023-07-07T04:03:48Z) - De-coupling and De-positioning Dense Self-supervised Learning [65.56679416475943]
Dense Self-Supervised Learning (SSL) methods address the limitations of using image-level feature representations when handling images with multiple objects.
We show that they suffer from coupling and positional bias, which arise from the receptive field increasing with layer depth and zero-padding.
We demonstrate the benefits of our method on COCO and on a new challenging benchmark, OpenImage-MINI, for object classification, semantic segmentation, and object detection.
arXiv Detail & Related papers (2023-03-29T18:07:25Z) - Learning Efficient Representations for Enhanced Object Detection on
Large-scene SAR Images [16.602738933183865]
It is a challenging problem to detect and recognize targets on complex large-scene Synthetic Aperture Radar (SAR) images.
Recently developed deep learning algorithms can automatically learn the intrinsic features of SAR images.
We propose an efficient and robust deep learning based target detection method.
arXiv Detail & Related papers (2022-01-22T03:25:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.