Multi-patch Feature Pyramid Network for Weakly Supervised Object
Detection in Optical Remote Sensing Images
- URL: http://arxiv.org/abs/2108.08063v1
- Date: Wed, 18 Aug 2021 09:25:39 GMT
- Title: Multi-patch Feature Pyramid Network for Weakly Supervised Object
Detection in Optical Remote Sensing Images
- Authors: Pourya Shamsolmoali, Jocelyn Chanussot, Masoumeh Zareapoor, Huiyu
Zhou, and Jie Yang
- Abstract summary: We propose a new architecture for object detection with a multiple patch feature pyramid network (MPFP-Net).
MPFP-Net differs from current models, which during training only pursue the most discriminative patches.
We introduce an effective method to regularize the residual values and make the fusion transition layers strictly norm-preserving.
- Score: 39.25541709228373
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Object detection is a challenging task in remote sensing because objects only
occupy a few pixels in the images, and the models are required to
simultaneously learn object locations and detection. Even though the
established approaches perform well for objects of regular sizes, they
achieve weak performance when analyzing small ones or get stuck in
local minima (e.g. false object parts). Two possible issues stand in their way.
First, the existing methods struggle to perform stably on the detection of
small objects because of the complicated background. Second, most of the
standard methods use hand-crafted features and do not work well on the
detection of objects with missing parts. We here address the above
issues and propose a new architecture with a multiple patch feature pyramid
network (MPFP-Net). Unlike current models, which during training only
pursue the most discriminative patches, MPFP-Net divides the patches into
class-affiliated subsets in which the patches are related. Based on the
primary loss function, a sequence of smooth loss functions is determined for
the subsets to improve the model for collecting small object parts. To enhance
the feature representation for patch selection, we introduce an effective
method to regularize the residual values and make the fusion transition layers
strictly norm-preserving. The network contains bottom-up and crosswise
connections to fuse the features of different scales to achieve better
accuracy, compared to several state-of-the-art object detection models. Also,
the developed architecture is more efficient than the baselines.
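As a rough illustration of what fusing features of different scales through bottom-up connections can look like, the sketch below pools each finer map by a factor of two and adds it to the next coarser one. This is a generic pyramid-fusion toy, not the MPFP-Net implementation: the function names `downsample2x` and `fuse_bottom_up`, the average pooling, the element-wise addition, and the toy values are all assumptions made for the example.

```python
from typing import List

Grid = List[List[float]]


def downsample2x(fine: Grid) -> Grid:
    """Average-pool a 2D map with a 2x2 window and stride 2 (even sizes assumed)."""
    rows, cols = len(fine), len(fine[0])
    return [
        [
            (fine[r][c] + fine[r][c + 1] + fine[r + 1][c] + fine[r + 1][c + 1]) / 4.0
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


def fuse_bottom_up(levels: List[Grid]) -> List[Grid]:
    """Fuse pyramid levels from finest to coarsest.

    levels[0] is the finest map; each following level is half the size.
    Each output level is its own features plus the pooled fused level below it.
    """
    fused = [levels[0]]
    for coarse in levels[1:]:
        pooled = downsample2x(fused[-1])
        fused.append(
            [[a + b for a, b in zip(row_c, row_p)] for row_c, row_p in zip(coarse, pooled)]
        )
    return fused


# Hypothetical toy pyramid: a 4x4 map, a 2x2 map, and a 1x1 map.
pyramid = [
    [[1.0, 1.0, 2.0, 2.0],
     [1.0, 1.0, 2.0, 2.0],
     [3.0, 3.0, 4.0, 4.0],
     [3.0, 3.0, 4.0, 4.0]],
    [[0.5, 0.5],
     [0.5, 0.5]],
    [[1.0]],
]
fused = fuse_bottom_up(pyramid)
```

In a real detector the maps would be multi-channel tensors and the addition would typically be preceded by learned convolutions; the toy keeps only the data flow between scales.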
Related papers
- Object-Centric Multiple Object Tracking [124.30650395969126]
This paper proposes a video object-centric model for multiple-object tracking pipelines.
It consists of an index-merge module that adapts the object-centric slots into detection outputs and an object memory module.
Benefiting from object-centric learning, we only require sparse detection labels for object localization and feature binding.
arXiv Detail & Related papers (2023-09-01T03:34:12Z)
- Rethinking the backbone architecture for tiny object detection [0.0]
Existing tiny object detection methods use standard deep neural networks as their backbone architecture.
We argue that such backbones are inappropriate for detecting tiny objects as they are designed for the classification of larger objects, and do not have the spatial resolution to identify small targets.
We design 'bottom-heavy' versions of backbones that allocate more resources to processing higher-resolution features without introducing any additional computational burden overall.
arXiv Detail & Related papers (2023-03-20T16:50:29Z)
- CASAPose: Class-Adaptive and Semantic-Aware Multi-Object Pose Estimation [2.861848675707602]
We present a new single-stage architecture called CASAPose.
It determines 2D-3D correspondences for pose estimation of multiple different objects in RGB images in one pass.
It is fast and memory efficient, and achieves high accuracy for multiple objects.
arXiv Detail & Related papers (2022-10-11T10:20:01Z)
- You Better Look Twice: a new perspective for designing accurate detectors with reduced computations [56.34005280792013]
BLT-net is a new low-computation two-stage object detection architecture.
It reduces computations by separating objects from background using a very light first stage.
Resulting image proposals are then processed in the second-stage by a highly accurate model.
arXiv Detail & Related papers (2021-07-21T12:39:51Z)
- Slender Object Detection: Diagnoses and Improvements [74.40792217534]
In this paper, we are concerned with the detection of a particular type of objects with extreme aspect ratios, namely slender objects.
For a classical object detection method, a drastic drop of 18.9% mAP on COCO is observed if solely evaluated on slender objects.
arXiv Detail & Related papers (2020-11-17T09:39:42Z)
- Hierarchical Complementary Learning for Weakly Supervised Object Localization [12.104019927107517]
Weakly supervised object localization (WSOL) is a challenging problem which aims to localize objects with only image-level labels.
This paper proposes a Hierarchical Complementary Learning Network method (HCLNet) that helps the CNN to perform better classification and localization of objects on the images.
arXiv Detail & Related papers (2020-11-16T14:58:51Z)
- Multi-scale Interactive Network for Salient Object Detection [91.43066633305662]
We propose the aggregate interaction modules to integrate the features from adjacent levels.
To obtain more efficient multi-scale features, the self-interaction modules are embedded in each decoder unit.
Experimental results on five benchmark datasets demonstrate that the proposed method without any post-processing performs favorably against 23 state-of-the-art approaches.
arXiv Detail & Related papers (2020-07-17T15:41:37Z)
- Few-shot Object Detection on Remote Sensing Images [11.40135025181393]
We introduce a few-shot learning-based method for object detection on remote sensing images.
We build our few-shot object detection model upon YOLOv3 architecture and develop a multi-scale object detection framework.
arXiv Detail & Related papers (2020-06-14T07:18:10Z)
- NETNet: Neighbor Erasing and Transferring Network for Better Single Shot Object Detection [170.30694322460045]
We propose a new Neighbor Erasing and Transferring (NET) mechanism to reconfigure the pyramid features and explore scale-aware features.
A single-shot network called NETNet is constructed for scale-aware object detection.
arXiv Detail & Related papers (2020-01-18T15:21:29Z)
- Pixel-Semantic Revise of Position Learning A One-Stage Object Detector with A Shared Encoder-Decoder [5.371825910267909]
We analyze that different methods detect objects adaptively.
Some state-of-the-art detectors combine different feature pyramids with many mechanisms to enhance multi-level semantic information.
This work addresses that with an anchor-free detector that uses a shared encoder-decoder with an attention mechanism.
arXiv Detail & Related papers (2020-01-04T08:55:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.