Knowledge Distillation in YOLOX-ViT for Side-Scan Sonar Object Detection
- URL: http://arxiv.org/abs/2403.09313v1
- Date: Thu, 14 Mar 2024 12:03:28 GMT
- Title: Knowledge Distillation in YOLOX-ViT for Side-Scan Sonar Object Detection
- Authors: Martin Aubard, László Antal, Ana Madureira, Erika Ábrahám,
- Abstract summary: We present YOLOX-ViT, a novel object detection model, and investigate the efficacy of knowledge distillation for model size reduction.
We introduce a new side-scan sonar image dataset, and use it to evaluate our object detector's performance.
- Score: 0.40498500266986387
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper we present YOLOX-ViT, a novel object detection model, and investigate the efficacy of knowledge distillation for model size reduction without sacrificing performance. Focused on underwater robotics, our research addresses key questions about the viability of smaller models and the impact of the visual transformer layer in YOLOX. Furthermore, we introduce a new side-scan sonar image dataset, and use it to evaluate our object detector's performance. Results show that knowledge distillation effectively reduces false positives in wall detection. Additionally, the introduced visual transformer layer significantly improves object detection accuracy in the underwater environment. The source code of the knowledge distillation in the YOLOX-ViT is at https://github.com/remaro-network/KD-YOLOX-ViT.
Related papers
- SOD-YOLO: Enhancing YOLO-Based Detection of Small Objects in UAV Imagery [5.639904484784127]
Experimental results demonstrate that SOD-YOLO significantly improves detection performance.<n>SOD-YOLO is a practical and efficient solution for small object detection in UAV imagery.
arXiv Detail & Related papers (2025-07-17T02:04:54Z) - MASF-YOLO: An Improved YOLOv11 Network for Small Object Detection on Drone View [0.0]
We propose a novel object detection network Multi-scale Context Aggregation and Scale-adaptive Fusion YOLO (MASF-YOLO)
To tackle the difficulty of detecting small objects in UAV images, we design a Multi-scale Feature Aggregation Module (MFAM), which significantly improves the detection accuracy of small objects.
Thirdly, we introduce a Dimension-Aware Selective Integration Module (DASI), which further enhances multi-scale feature fusion capabilities.
arXiv Detail & Related papers (2025-04-25T07:43:33Z) - RS-YOLOX: A High Precision Detector for Object Detection in Satellite Remote Sensing Images [20.582343125606403]
This article proposes an improved YOLOX model for satellite remote sensing image automatic detection.
To strengthen the feature learning ability of the network, we used Efficient Channel Attention (ECA) in the backbone network of YOLOX.
To balance the numbers of positive and negative samples in training, we used the Varifocal Loss function.
To obtain a high-performance remote sensing object detector, we combined the trained model with an open-source framework called Slicing Aided Hyper Inference.
arXiv Detail & Related papers (2025-02-05T03:05:33Z) - Understanding and Improving Training-Free AI-Generated Image Detections with Vision Foundation Models [68.90917438865078]
Deepfake techniques for facial synthesis and editing pose serious risks for generative models.
In this paper, we investigate how detection performance varies across model backbones, types, and datasets.
We introduce Contrastive Blur, which enhances performance on facial images, and MINDER, which addresses noise type bias, balancing performance across domains.
arXiv Detail & Related papers (2024-11-28T13:04:45Z) - YOLO-ELA: Efficient Local Attention Modeling for High-Performance Real-Time Insulator Defect Detection [0.0]
Existing detection methods for insulator defect identification from unmanned aerial vehicles struggle with complex background scenes and small objects.
This paper proposes a new attention-based foundation architecture, YOLO-ELA, to address this issue.
Experimental results on high-resolution UAV images show that our method achieved a state-of-the-art performance of 96.9% mAP0.5 and a real-time detection speed of 74.63 frames per second.
arXiv Detail & Related papers (2024-10-15T16:00:01Z) - Spatial Transformer Network YOLO Model for Agricultural Object Detection [0.3124884279860061]
We propose a new method that integrates spatial transformer networks (STNs) into YOLO to improve performance.
The proposed STN-YOLO aims to enhance the model's effectiveness by focusing on important areas of the image.
We apply the STN-YOLO on benchmark datasets for Agricultural object detection as well as a new dataset from a state-of-the-art plant phenotyping greenhouse facility.
arXiv Detail & Related papers (2024-07-31T14:53:41Z) - Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for
Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head.
The proposed model achieves a mean average precision (MAP) of approximately 45.7%, which is a significant improvement.
This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z) - Object-centric Cross-modal Feature Distillation for Event-based Object
Detection [87.50272918262361]
RGB detectors still outperform event-based detectors due to sparsity of the event data and missing visual details.
We develop a novel knowledge distillation approach to shrink the performance gap between these two modalities.
We show that object-centric distillation allows to significantly improve the performance of the event-based student object detector.
arXiv Detail & Related papers (2023-11-09T16:33:08Z) - A Study on Tiny YOLO for Resource Constrained Xray Threat Detection [0.0]
This paper implements and analyzes multiple networks with the goal of understanding their suitability for edge device applications such as X-ray threat detection.
We use the state-of-the-art YOLO object detection model to solve this task of detecting threats in security baggage screening images.
arXiv Detail & Related papers (2023-09-27T12:02:33Z) - Learning Heavily-Degraded Prior for Underwater Object Detection [59.5084433933765]
This paper seeks transferable prior knowledge from detector-friendly images.
It is based on statistical observations that, the heavily degraded regions of detector-friendly (DFUI) and underwater images have evident feature distribution gaps.
Our method with higher speeds and less parameters still performs better than transformer-based detectors.
arXiv Detail & Related papers (2023-08-24T12:32:46Z) - YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time
Object Detection [80.11152626362109]
We provide an efficient and performant object detector, termed YOLO-MS.
We train our YOLO-MS on the MS COCO dataset from scratch without relying on any other large-scale datasets.
Our work can also be used as a plug-and-play module for other YOLO models.
arXiv Detail & Related papers (2023-08-10T10:12:27Z) - DeepSeaNet: Improving Underwater Object Detection using EfficientDet [0.0]
This project involves implementing and evaluating various object detection models on an annotated underwater dataset.
The dataset comprises annotated image sequences of fish, crabs, starfish, and other aquatic animals captured in Limfjorden water with limited visibility.
I compare the results of YOLOv3 (31.10% mean Average Precision (mAP)), YOLOv4 (83.72% mAP), YOLOv5 (97.6%), YOLOv8 (98.20%), EfficientDet (98.56% mAP) and Detectron2 (95.20% mAP) on the same dataset.
arXiv Detail & Related papers (2023-05-26T13:41:35Z) - SSMTL++: Revisiting Self-Supervised Multi-Task Learning for Video
Anomaly Detection [108.57862846523858]
We revisit the self-supervised multi-task learning framework, proposing several updates to the original method.
We modernize the 3D convolutional backbone by introducing multi-head self-attention modules.
In our attempt to further improve the model, we study additional self-supervised learning tasks, such as predicting segmentation maps.
arXiv Detail & Related papers (2022-07-16T19:25:41Z) - A lightweight and accurate YOLO-like network for small target detection
in Aerial Imagery [94.78943497436492]
We present YOLO-S, a simple, fast and efficient network for small target detection.
YOLO-S exploits a small feature extractor based on Darknet20, as well as skip connection, via both bypass and concatenation.
YOLO-S has an 87% decrease of parameter size and almost one half FLOPs of YOLOv3, making practical the deployment for low-power industrial applications.
arXiv Detail & Related papers (2022-04-05T16:29:49Z) - LF-YOLO: A Lighter and Faster YOLO for Weld Defect Detection of X-ray
Image [7.970559381165446]
We propose a weld defect detection method based on convolution neural network (CNN), namely Lighter and Faster YOLO (LF-YOLO)
To improve the performance of detection network, we propose an efficient feature extraction (EFE) module.
Experimental results show that our weld defect network achieves satisfactory balance between performance and consumption, and reaches 92.9 mAP50 with 61.5 FPS.
arXiv Detail & Related papers (2021-10-28T12:19:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.