Rethinking Feature Backbone Fine-tuning for Remote Sensing Object Detection
- URL: http://arxiv.org/abs/2407.15143v2
- Date: Thu, 8 Aug 2024 05:15:07 GMT
- Title: Rethinking Feature Backbone Fine-tuning for Remote Sensing Object Detection
- Authors: Yechan Kim, JongHyun Park, SooYeon Kim, Moongu Jeon
- Abstract summary: We propose DBF (Dynamic Backbone Freezing) for feature backbone fine-tuning on remote sensing object detection.
Our method aims to handle the dilemma of whether the backbone should extract low-level generic features or possess specific knowledge of the remote sensing domain.
Our approach enables more accurate model learning while substantially reducing computational costs.
- Score: 10.896464615994494
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, numerous methods have achieved impressive performance in remote sensing object detection, relying on convolution or transformer architectures. Such detectors typically have a feature backbone to extract useful features from raw input images. For the remote sensing domain, a common practice among current detectors is to initialize the backbone with pre-training on ImageNet consisting of natural scenes. Fine-tuning the backbone is then typically required to generate features suitable for remote-sensing images. However, this could hinder the extraction of basic visual features in long-term training, thus restricting performance improvement. To mitigate this issue, we propose a novel method named DBF (Dynamic Backbone Freezing) for feature backbone fine-tuning on remote sensing object detection. Our method aims to handle the dilemma of whether the backbone should extract low-level generic features or possess specific knowledge of the remote sensing domain, by introducing a module called 'Freezing Scheduler' to dynamically manage the update of backbone features during training. Extensive experiments on DOTA and DIOR-R show that our approach enables more accurate model learning while substantially reducing computational costs. Our method can be seamlessly adopted without additional effort due to its straightforward design.
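The 'Freezing Scheduler' idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the schedule itself, so everything below is an assumption made for illustration: the class layout, the `step_epoch` parameter, the `backbone_trainable` method, and the periodic rule (update the backbone once every `step_epoch` epochs, keep it frozen otherwise) are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FreezingScheduler:
    """Hypothetical scheduler deciding, per epoch, whether the backbone is updated.

    Assumption (not from the abstract): the backbone is trainable once every
    `step_epoch` epochs and frozen in all other epochs.
    """

    step_epoch: int = 3

    def __post_init__(self) -> None:
        if self.step_epoch < 1:
            raise ValueError("step_epoch must be a positive integer")

    def backbone_trainable(self, epoch: int) -> bool:
        # Epochs are 0-indexed: epochs 0, step_epoch, 2 * step_epoch, ...
        # update the backbone; every other epoch keeps it frozen.
        return epoch % self.step_epoch == 0


def plan(num_epochs: int, scheduler: FreezingScheduler) -> List[bool]:
    """Return, for each epoch, whether the backbone would receive updates."""
    return [scheduler.backbone_trainable(epoch) for epoch in range(num_epochs)]


if __name__ == "__main__":
    # With step_epoch=3, the backbone is updated in epochs 0 and 3 only.
    print(plan(6, FreezingScheduler(step_epoch=3)))
```

In a real training loop, the returned flag would typically be applied at the start of each epoch, for example by toggling `requires_grad` on the backbone parameters in PyTorch. Frozen epochs skip the backbone's gradient computation, which is one plausible source of the reduced computational cost the abstract reports.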
Related papers
- MutDet: Mutually Optimizing Pre-training for Remote Sensing Object Detection [36.478530086163744]
We propose a novel Mutually optimizing pre-training framework for remote sensing object Detection, dubbed as MutDet.
MutDet fuses the object embeddings and detector features bidirectionally in the last encoder layer, enhancing their information interaction.
Experiments on various settings show new state-of-the-art transfer performance.
arXiv Detail & Related papers (2024-07-13T15:28:15Z)
- Generic Knowledge Boosted Pre-training For Remote Sensing Images [46.071496675604884]
Generic Knowledge Boosted Remote Sensing Pre-training (GeRSP) is a novel remote sensing pre-training framework.
GeRSP learns robust representations from remote sensing and natural images for remote sensing understanding tasks.
We show that GeRSP can effectively learn robust representations in a unified manner, improving the performance of remote sensing downstream tasks.
arXiv Detail & Related papers (2024-01-09T15:36:07Z)
- ALSO: Automotive Lidar Self-supervision by Occupancy estimation [70.70557577874155]
We propose a new self-supervised method for pre-training the backbone of deep perception models operating on point clouds.
The core idea is to train the model on a pretext task which is the reconstruction of the surface on which the 3D points are sampled.
The intuition is that if the network is able to reconstruct the scene surface, given only sparse input points, then it probably also captures some fragments of semantic information.
arXiv Detail & Related papers (2022-12-12T13:10:19Z)
- AGO-Net: Association-Guided 3D Point Cloud Object Detection Network [86.10213302724085]
We propose a novel 3D detection framework that associates intact features for objects via domain adaptation.
We achieve new state-of-the-art performance on the KITTI 3D detection benchmark in both accuracy and speed.
arXiv Detail & Related papers (2022-08-24T16:54:38Z)
- An Empirical Study of Remote Sensing Pretraining [117.90699699469639]
We conduct an empirical study of remote sensing pretraining (RSP) on aerial images.
RSP can help deliver distinctive performances in scene recognition tasks.
RSP mitigates the data discrepancies of traditional ImageNet pretraining on RS images, but it may still suffer from task discrepancies.
arXiv Detail & Related papers (2022-04-06T13:38:11Z)
- Proper Reuse of Image Classification Features Improves Object Detection [4.240984948137734]
A common practice in transfer learning is to initialize the downstream model weights by pre-training on a data-abundant upstream task.
Recent works show this is not strictly necessary under longer training regimes and provide recipes for training the backbone from scratch.
We show that an extreme form of knowledge preservation -- freezing the classifier-initialized backbone -- consistently improves many different detection models.
arXiv Detail & Related papers (2022-04-01T14:44:47Z)
- Video Salient Object Detection via Contrastive Features and Attention Modules [106.33219760012048]
We propose a network with attention modules to learn contrastive features for video salient object detection.
A co-attention formulation is utilized to combine the low-level and high-level features.
We show that the proposed method requires less computation, and performs favorably against the state-of-the-art approaches.
arXiv Detail & Related papers (2021-11-03T17:40:32Z)
- Self-supervised Audiovisual Representation Learning for Remote Sensing Data [96.23611272637943]
We propose a self-supervised approach for pre-training deep neural networks in remote sensing.
By exploiting the correspondence between geo-tagged audio recordings and remote sensing, this is done in a completely label-free manner.
We show that our approach outperforms existing pre-training strategies for remote sensing imagery.
arXiv Detail & Related papers (2021-08-02T07:50:50Z)
- CFC-Net: A Critical Feature Capturing Network for Arbitrary-Oriented Object Detection in Remote Sensing Images [0.9462808515258465]
In this paper, we discuss the role of discriminative features in object detection.
We then propose a Critical Feature Capturing Network (CFC-Net) to improve detection accuracy.
We show that our method achieves superior detection performance compared with many state-of-the-art approaches.
arXiv Detail & Related papers (2021-01-18T02:31:09Z)
- Progressive Self-Guided Loss for Salient Object Detection [102.35488902433896]
We present a progressive self-guided loss function to facilitate deep learning-based salient object detection in images.
Our framework takes advantage of adaptively aggregated multi-scale features to locate and detect salient objects effectively.
arXiv Detail & Related papers (2021-01-07T07:33:38Z)
- Few-shot Object Detection on Remote Sensing Images [11.40135025181393]
We introduce a few-shot learning-based method for object detection on remote sensing images.
We build our few-shot object detection model upon YOLOv3 architecture and develop a multi-scale object detection framework.
arXiv Detail & Related papers (2020-06-14T07:18:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.