Should I Look at the Head or the Tail? Dual-awareness Attention for
Few-Shot Object Detection
- URL: http://arxiv.org/abs/2102.12152v1
- Date: Wed, 24 Feb 2021 09:17:27 GMT
- Title: Should I Look at the Head or the Tail? Dual-awareness Attention for
Few-Shot Object Detection
- Authors: Tung-I Chen, Yueh-Cheng Liu, Hung-Ting Su, Yu-Cheng Chang, Yu-Hsiang
Lin, Jia-Fong Yeh, Winston H. Hsu
- Abstract summary: We propose a novel Dual-Awareness-Attention (DAnA), which captures the pairwise spatial relationship cross the support and query images.
Our DAnA component is adaptable to various existing object detection networks and boosts FSOD performance by paying attention to specific semantics.
Experimental results demonstrate that DAnA significantly boosts (48% and 125% relatively) object detection performance on the COCO benchmark.
- Score: 20.439719842851744
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While recent progress has significantly boosted few-shot classification (FSC)
performance, few-shot object detection (FSOD) remains challenging for modern
learning systems. Existing FSOD systems follow FSC approaches, neglect the
problem of spatial misalignment and the risk of information entanglement, and
result in low performance. Observing this, we propose a novel
Dual-Awareness-Attention (DAnA), which captures the pairwise spatial
relationship cross the support and query images. The generated
query-position-aware support features are robust to spatial misalignment and
used to guide the detection network precisely. Our DAnA component is adaptable
to various existing object detection networks and boosts FSOD performance by
paying attention to specific semantics conditioned on the query. Experimental
results demonstrate that DAnA significantly boosts (48% and 125% relatively)
object detection performance on the COCO benchmark. By equipping DAnA,
conventional object detection models, Faster-RCNN and RetinaNet, which are not
designed explicitly for few-shot learning, reach state-of-the-art performance.
Related papers
- D-YOLO a robust framework for object detection in adverse weather conditions [0.0]
Adverse weather conditions including haze, snow and rain lead to decline in image qualities, which often causes a decline in performance for deep-learning based detection networks.
To better integrate image restoration and object detection tasks, we designed a double-route network with an attention feature fusion module.
We also proposed a subnetwork to provide haze-free features to the detection network. Specifically, our D-YOLO improves the performance of the detection network by minimizing the distance between the clear feature extraction subnetwork and detection network.
arXiv Detail & Related papers (2024-03-14T09:57:15Z) - Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for
Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head.
The proposed model achieves a mean average precision (MAP) of approximately 45.7%, which is a significant improvement.
This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z) - Small Object Detection via Coarse-to-fine Proposal Generation and
Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z) - Transformation-Invariant Network for Few-Shot Object Detection in Remote
Sensing Images [15.251042369061024]
Few-shot object detection (FSOD) relies on a large amount of labeled data for training.
Scale and orientation variations of objects in remote sensing images pose significant challenges to existing FSOD methods.
We propose integrating a feature pyramid network and utilizing prototype features to enhance query features.
arXiv Detail & Related papers (2023-03-13T02:21:38Z) - AGO-Net: Association-Guided 3D Point Cloud Object Detection Network [86.10213302724085]
We propose a novel 3D detection framework that associates intact features for objects via domain adaptation.
We achieve new state-of-the-art performance on the KITTI 3D detection benchmark in both accuracy and speed.
arXiv Detail & Related papers (2022-08-24T16:54:38Z) - Efficient Person Search: An Anchor-Free Approach [86.45858994806471]
Person search aims to simultaneously localize and identify a query person from realistic, uncropped images.
To achieve this goal, state-of-the-art models typically add a re-id branch upon two-stage detectors like Faster R-CNN.
In this work, we present an anchor-free approach to efficiently tackling this challenging task, by introducing the following dedicated designs.
arXiv Detail & Related papers (2021-09-01T07:01:33Z) - Dynamic Relevance Learning for Few-Shot Object Detection [6.550840743803705]
We propose a dynamic relevance learning model, which utilizes the relationship between all support images and Region of Interest (RoI) on the query images to construct a dynamic graph convolutional network (GCN)
The proposed model achieves the best overall performance, which shows its effectiveness of learning more generalized features.
arXiv Detail & Related papers (2021-08-04T18:29:42Z) - DS-Net: Dynamic Spatiotemporal Network for Video Salient Object
Detection [78.04869214450963]
We propose a novel dynamic temporal-temporal network (DSNet) for more effective fusion of temporal and spatial information.
We show that the proposed method achieves superior performance than state-of-the-art algorithms.
arXiv Detail & Related papers (2020-12-09T06:42:30Z) - AFD-Net: Adaptive Fully-Dual Network for Few-Shot Object Detection [8.39479809973967]
Few-shot object detection (FSOD) aims at learning a detector that can fast adapt to previously unseen objects with scarce examples.
Existing methods solve this problem by performing subtasks of classification and localization utilizing a shared component.
We present that a general few-shot detector should consider the explicit decomposition of two subtasks, as well as leveraging information from both of them to enhance feature representations.
arXiv Detail & Related papers (2020-11-30T10:21:32Z) - One-Shot Object Detection without Fine-Tuning [62.39210447209698]
We introduce a two-stage model consisting of a first stage Matching-FCOS network and a second stage Structure-Aware Relation Module.
We also propose novel training strategies that effectively improve detection performance.
Our method exceeds the state-of-the-art one-shot performance consistently on multiple datasets.
arXiv Detail & Related papers (2020-05-08T01:59:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.