R-FCN: Object Detection via Region-based Fully Convolutional Networks
- URL: http://arxiv.org/abs/1605.06409v3
- Date: Mon, 11 Dec 2023 13:28:51 GMT
- Title: R-FCN: Object Detection via Region-based Fully Convolutional Networks
- Authors: Jifeng Dai, Yi Li, Kaiming He, Jian Sun
- Abstract summary: We present region-based, fully convolutional networks for accurate and efficient object detection.
Our result is achieved at a test-time speed of 170ms per image, 2.5-20x faster than the Faster R-CNN counterpart.
- Score: 87.62557357527861
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present region-based, fully convolutional networks for accurate and
efficient object detection. In contrast to previous region-based detectors such
as Fast/Faster R-CNN that apply a costly per-region subnetwork hundreds of
times, our region-based detector is fully convolutional with almost all
computation shared on the entire image. To achieve this goal, we propose
position-sensitive score maps to address a dilemma between
translation-invariance in image classification and translation-variance in
object detection. Our method can thus naturally adopt fully convolutional image
classifier backbones, such as the latest Residual Networks (ResNets), for
object detection. We show competitive results on the PASCAL VOC datasets (e.g.,
83.6% mAP on the 2007 set) with the 101-layer ResNet. Meanwhile, our result is
achieved at a test-time speed of 170ms per image, 2.5-20x faster than the
Faster R-CNN counterpart. Code is made publicly available at:
https://github.com/daijifeng001/r-fcn
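The position-sensitive score maps mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not taken from the released code. It assumes a k x k grid of bins per region, one group of score-map channels per bin (stored bin-major, a layout chosen here for illustration), average pooling inside each bin, and a final vote that averages the k*k bin scores per class. The names `ps_roi_pool` and `roi_class_scores` are hypothetical.

```python
import numpy as np

def ps_roi_pool(score_maps, roi, k, num_classes):
    """Position-sensitive RoI pooling (illustrative sketch).

    score_maps: array of shape (k*k*num_classes, H, W).
                Assumed layout: channel = (bin_row*k + bin_col)*num_classes + cls.
    roi:        (x0, y0, x1, y1) in feature-map coordinates, end-exclusive,
                assumed to lie inside the map with x0 < x1 and y0 < y1.
    Returns an array of shape (num_classes, k, k).
    """
    x0, y0, x1, y1 = roi
    bin_w = (x1 - x0) / k
    bin_h = (y1 - y0) / k
    pooled = np.zeros((num_classes, k, k), dtype=np.float64)
    for i in range(k):          # bin row
        for j in range(k):      # bin column
            ys = int(np.floor(y0 + i * bin_h))
            ye = max(int(np.ceil(y0 + (i + 1) * bin_h)), ys + 1)
            xs = int(np.floor(x0 + j * bin_w))
            xe = max(int(np.ceil(x0 + (j + 1) * bin_w)), xs + 1)
            # Bin (i, j) reads only from its own group of score maps.
            group = (i * k + j) * num_classes
            region = score_maps[group:group + num_classes, ys:ye, xs:xe]
            pooled[:, i, j] = region.mean(axis=(1, 2))
    return pooled

def roi_class_scores(score_maps, roi, k, num_classes):
    """Vote over the k*k bins by averaging, giving one score per class."""
    pooled = ps_roi_pool(score_maps, roi, k, num_classes)
    return pooled.mean(axis=(1, 2))
```

Because each bin reads from a different channel group, the same feature-map location contributes differently depending on where it falls inside the region. This is how the sketch keeps the per-region work down to pooling and averaging while the convolutional computation is shared across the image.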
Related papers
- Strip R-CNN: Large Strip Convolution for Remote Sensing Object Detection [74.01846006894635]
This paper shows that large strip convolutions are good feature representation learners for remote sensing object detection.
We build a new network architecture called Strip R-CNN, which is simple, efficient, and powerful.
arXiv Detail & Related papers (2025-01-07T13:30:54Z)
- DDU-Net: A Domain Decomposition-based CNN for High-Resolution Image Segmentation on Multiple GPUs [46.873264197900916]
A domain decomposition-based U-Net architecture is introduced, which partitions input images into non-overlapping patches.
A communication network is added to facilitate inter-patch information exchange to enhance the understanding of spatial context.
Results show that the approach achieves a 2-3% higher intersection over union (IoU) score compared to the same network without inter-patch communication.
arXiv Detail & Related papers (2024-07-31T01:07:21Z)
- Learning cross space mapping via DNN using large scale click-through logs [38.94796244812248]
The gap between low-level visual signals and high-level semantics has been progressively bridged by the continuous development of deep neural networks (DNNs).
We propose a unified DNN model for image-query similarity calculation by simultaneously modeling image and query in one network.
Both the qualitative results and quantitative results on an image retrieval evaluation task with 1000 queries demonstrate the superiority of the proposed method.
arXiv Detail & Related papers (2023-02-26T09:00:35Z)
- MD-CSDNetwork: Multi-Domain Cross Stitched Network for Deepfake Detection [80.83725644958633]
Current deepfake generation methods leave discriminative artifacts in the frequency spectrum of fake images and videos.
We present a novel approach, termed MD-CSDNetwork, for combining the features in the spatial and frequency domains to mine a shared discriminative representation.
arXiv Detail & Related papers (2021-09-15T14:11:53Z)
- Oriented R-CNN for Object Detection [61.78746189807462]
This work proposes an effective and simple oriented object detection framework, termed Oriented R-CNN.
In the first stage, we propose an oriented Region Proposal Network (oriented RPN) that directly generates high-quality oriented proposals in a nearly cost-free manner.
The second stage is an oriented R-CNN head for refining oriented Regions of Interest (oriented RoIs) and recognizing them.
arXiv Detail & Related papers (2021-08-12T12:47:43Z)
- Improved Residual Networks for Image and Video Recognition [98.10703825716142]
Residual networks (ResNets) represent a powerful type of convolutional neural network (CNN) architecture.
We show consistent improvements in accuracy and learning convergence over the baseline.
Our proposed approach allows us to train extremely deep networks, while the baseline shows severe optimization issues.
arXiv Detail & Related papers (2020-04-10T11:09:50Z)
- Real-Time High-Performance Semantic Image Segmentation of Urban Street Scenes [98.65457534223539]
We propose a real-time high-performance DCNN-based method for robust semantic segmentation of urban street scenes.
The proposed method achieves 73.6% and 68.0% mean Intersection over Union (mIoU) at inference speeds of 51.0 fps and 39.3 fps, respectively.
arXiv Detail & Related papers (2020-03-11T08:45:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.