NBBOX: Noisy Bounding Box Improves Remote Sensing Object Detection
- URL: http://arxiv.org/abs/2409.09424v3
- Date: Tue, 07 Jan 2025 11:37:57 GMT
- Title: NBBOX: Noisy Bounding Box Improves Remote Sensing Object Detection
- Authors: Yechan Kim, SooYeon Kim, Moongu Jeon
- Abstract summary: This letter presents a thorough investigation of bounding box transformation in terms of scaling, rotation, and translation for remote sensing object detection.
We conduct extensive experiments on DOTA and DIOR-R, both well-known datasets that include a variety of rotated generic objects in aerial images.
Experimental results show that our approach significantly improves remote sensing object detection without bells and whistles.
- Score: 11.564184330068775
- Abstract: Data augmentation has shown significant advancements in computer vision to improve model performance over the years, particularly in scenarios with limited and insufficient data. Currently, most studies focus on adjusting the image or its features to expand the size, quality, and variety of samples during training in various tasks including object detection. However, we argue that it is necessary to investigate bounding box transformations as a data augmentation technique rather than image-level transformations, especially in aerial imagery due to potentially inconsistent bounding box annotations. Hence, this letter presents a thorough investigation of bounding box transformation in terms of scaling, rotation, and translation for remote sensing object detection. We call this augmentation strategy NBBOX (Noise Injection into Bounding Box). We conduct extensive experiments on DOTA and DIOR-R, both well-known datasets that include a variety of rotated generic objects in aerial images. Experimental results show that our approach significantly improves remote sensing object detection without bells and whistles, and that it is more time-efficient than other state-of-the-art augmentation strategies.
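As a rough illustration of the idea in the abstract, the sketch below perturbs an oriented bounding box with small random scaling, rotation, and translation while leaving the image untouched. It is a minimal sketch under stated assumptions, not the authors' implementation: the `OrientedBox` type, the `add_box_noise` function, the `(cx, cy, w, h, angle)` parameterization, the uniform sampling, and the default noise ranges are all hypothetical choices made for this example.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class OrientedBox:
    """Hypothetical oriented box: center, size in pixels, angle in radians."""
    cx: float
    cy: float
    w: float
    h: float
    angle: float


def add_box_noise(
    box: OrientedBox,
    rng: random.Random,
    scale_range: tuple[float, float] = (0.97, 1.03),  # assumed range
    max_rot_deg: float = 1.0,                         # assumed range
    max_shift_px: float = 2.0,                        # assumed range
) -> OrientedBox:
    """Return a copy of `box` with small random scale, rotation, and shift.

    Only the annotation changes; the image pixels are not modified.
    """
    # One scale factor for both sides, so the aspect ratio is preserved.
    s = rng.uniform(*scale_range)
    # Small rotation offset, sampled in degrees and converted to radians.
    d_angle = math.radians(rng.uniform(-max_rot_deg, max_rot_deg))
    # Independent horizontal and vertical shifts of the box center.
    dx = rng.uniform(-max_shift_px, max_shift_px)
    dy = rng.uniform(-max_shift_px, max_shift_px)
    return OrientedBox(
        cx=box.cx + dx,
        cy=box.cy + dy,
        w=box.w * s,
        h=box.h * s,
        angle=box.angle + d_angle,
    )


if __name__ == "__main__":
    rng = random.Random(0)
    original = OrientedBox(cx=100.0, cy=50.0, w=40.0, h=20.0, angle=0.3)
    print(add_box_noise(original, rng))
```

In a training pipeline, a function like this would be applied to each ground-truth box when a sample is loaded, so the detector sees slightly different annotations for the same image across epochs. Because it touches only a few numbers per box, it is far cheaper than image-level transformations.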
Related papers
- A Simple Background Augmentation Method for Object Detection with Diffusion Model [53.32935683257045]
In computer vision, it is well-known that a lack of data diversity will impair model performance.
We propose a simple yet effective data augmentation approach by leveraging advancements in generative models.
Background augmentation, in particular, significantly improves the models' robustness and generalization capabilities.
arXiv Detail & Related papers (2024-08-01T07:40:00Z)
- Few-shot Oriented Object Detection with Memorable Contrastive Learning in Remote Sensing Images [11.217630579076237]
Few-shot object detection (FSOD) has garnered significant research attention in the field of remote sensing.
We propose a novel FSOD method for remote sensing images called Few-shot Oriented object detection with Memorable Contrastive learning (FOMC).
Specifically, we employ oriented bounding boxes instead of traditional horizontal bounding boxes to learn a better feature representation for arbitrary-oriented aerial objects.
arXiv Detail & Related papers (2024-03-20T08:15:18Z)
- Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head.
The proposed model achieves a mean average precision (MAP) of approximately 45.7%, which is a significant improvement.
This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z)
- Multimodal Transformer Using Cross-Channel attention for Object Detection in Remote Sensing Images [1.662438436885552]
Multi-modal fusion has been determined to enhance the accuracy by fusing data from multiple modalities.
We propose a novel multi-modal fusion strategy for mapping relationships between different channels at the early stage.
By addressing fusion in the early stage, as opposed to mid or late-stage methods, our method achieves competitive and even superior performance compared to existing techniques.
arXiv Detail & Related papers (2023-10-21T00:56:11Z)
- Transformation-Invariant Network for Few-Shot Object Detection in Remote Sensing Images [15.251042369061024]
Object detection in remote sensing images relies on a large amount of labeled data for training.
Scale and orientation variations of objects in remote sensing images pose significant challenges to existing FSOD methods.
We propose integrating a feature pyramid network and utilizing prototype features to enhance query features.
arXiv Detail & Related papers (2023-03-13T02:21:38Z)
- Aerial Image Object Detection With Vision Transformer Detector (ViTDet) [0.0]
Vision Transformer Detector (ViTDet) was proposed to extract multi-scale features for object detection.
ViTDet's simple design achieves good performance on natural scene images and can be easily embedded into any detector architecture.
Our results show that ViTDet can consistently outperform its convolutional neural network counterparts on horizontal bounding box (HBB) object detection.
arXiv Detail & Related papers (2023-01-28T02:25:30Z)
- Object Detection in Aerial Images: What Improves the Accuracy? [9.857292888257144]
Deep learning-based object detection approaches have been actively explored for the problem of object detection in aerial images.
In this work, we investigate the impact of Faster R-CNN for aerial object detection and explore numerous strategies to improve its performance for aerial images.
arXiv Detail & Related papers (2022-01-21T16:22:48Z)
- AdaZoom: Adaptive Zoom Network for Multi-Scale Object Detection in Large Scenes [57.969186815591186]
Detection in large-scale scenes is a challenging problem due to small objects and extreme scale variation.
We propose a novel Adaptive Zoom (AdaZoom) network as a selective magnifier with flexible shape and focal length to adaptively zoom the focus regions for object detection.
arXiv Detail & Related papers (2021-06-19T03:30:22Z)
- Robust Data Hiding Using Inverse Gradient Attention [82.73143630466629]
In the data hiding task, each pixel of cover images should be treated differently since they have divergent tolerabilities.
We propose a novel deep data hiding scheme with Inverse Gradient Attention (IGA), combining the ideas of adversarial learning and attention mechanism.
Empirically, extensive experiments show that the proposed model outperforms the state-of-the-art methods on two prevalent datasets.
arXiv Detail & Related papers (2020-11-21T19:08:23Z)
- Perceiving Traffic from Aerial Images [86.994032967469]
We propose an object detection method called Butterfly Detector that is tailored to detect objects in aerial images.
We evaluate our Butterfly Detector on two publicly available UAV datasets (UAVDT and VisDrone 2019) and show that it outperforms previous state-of-the-art methods while remaining real-time.
arXiv Detail & Related papers (2020-09-16T11:37:43Z)
- SCRDet++: Detecting Small, Cluttered and Rotated Objects via Instance-Level Feature Denoising and Rotation Loss Smoothing [131.04304632759033]
Small and cluttered objects are common in the real world and are challenging to detect.
In this paper, we first innovatively introduce the idea of denoising to object detection.
Instance-level denoising on the feature map is performed to enhance the detection of small and cluttered objects.
arXiv Detail & Related papers (2020-04-28T06:03:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.