R-FCN: Object Detection via Region-based Fully Convolutional Networks
- URL: http://arxiv.org/abs/1605.06409v3
- Date: Mon, 11 Dec 2023 13:28:51 GMT
- Title: R-FCN: Object Detection via Region-based Fully Convolutional Networks
- Authors: Jifeng Dai, Yi Li, Kaiming He, Jian Sun
- Abstract summary: We present region-based, fully convolutional networks for accurate and efficient object detection.
Our result is achieved at a test-time speed of 170ms per image, 2.5-20x faster than the Faster R-CNN counterpart.
- Score: 87.62557357527861
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present region-based, fully convolutional networks for accurate and
efficient object detection. In contrast to previous region-based detectors such
as Fast/Faster R-CNN that apply a costly per-region subnetwork hundreds of
times, our region-based detector is fully convolutional with almost all
computation shared on the entire image. To achieve this goal, we propose
position-sensitive score maps to address a dilemma between
translation-invariance in image classification and translation-variance in
object detection. Our method can thus naturally adopt fully convolutional image
classifier backbones, such as the latest Residual Networks (ResNets), for
object detection. We show competitive results on the PASCAL VOC datasets (e.g.,
83.6% mAP on the 2007 set) with the 101-layer ResNet. Meanwhile, our result is
achieved at a test-time speed of 170ms per image, 2.5-20x faster than the
Faster R-CNN counterpart. Code is made publicly available at:
https://github.com/daijifeng001/r-fcn
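The position-sensitive score maps mentioned in the abstract can be illustrated with a small NumPy sketch of position-sensitive RoI pooling. This is a minimal illustration, not the authors' implementation. The function name `ps_roi_pool`, the bin-major channel layout, and the integer rounding of bin boundaries are assumptions made for this example. The k x k grid, the k*k*(C+1) score-map channels, and the averaging vote over bins follow the general scheme described in the R-FCN paper rather than anything stated in the abstract itself.

```python
import numpy as np

def ps_roi_pool(score_maps, roi, k, num_classes):
    """Position-sensitive RoI pooling for a single region (illustrative sketch).

    score_maps: array of shape (k*k*(C+1), H, W), C = num_classes.
                Assumed layout: channels grouped by bin, then by class.
    roi:        (x0, y0, x1, y1) in feature-map coordinates (assumption).
    Returns an array of (C+1) scores, one per class plus background.
    """
    c1 = num_classes + 1
    channels, height, width = score_maps.shape
    assert channels == k * k * c1, "expected k*k*(C+1) score-map channels"

    x0, y0, x1, y1 = roi
    bin_w = (x1 - x0) / k
    bin_h = (y1 - y0) / k

    votes = np.zeros((k, k, c1))
    for i in range(k):          # bin row (top to bottom)
        for j in range(k):      # bin column (left to right)
            # Integer pixel bounds of bin (i, j), clipped to the feature map
            # and forced to contain at least one pixel.
            ys = min(max(int(np.floor(y0 + i * bin_h)), 0), height - 1)
            ye = min(max(int(np.ceil(y0 + (i + 1) * bin_h)), ys + 1), height)
            xs = min(max(int(np.floor(x0 + j * bin_w)), 0), width - 1)
            xe = min(max(int(np.ceil(x0 + (j + 1) * bin_w)), xs + 1), width)

            # Bin (i, j) reads only from its own group of (C+1) score maps,
            # which is what makes the pooled response position-sensitive.
            b = i * k + j
            group = score_maps[b * c1:(b + 1) * c1]
            votes[i, j] = group[:, ys:ye, xs:xe].mean(axis=(1, 2))

    # The k*k bins vote by averaging to give one score per class.
    return votes.mean(axis=(0, 1))
```

Because every bin only averages values that the shared convolutional layers have already computed, the per-region work is a handful of slices and means, with no per-region subnetwork. A softmax over the returned scores would typically follow to obtain class probabilities.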
Related papers
- DDU-Net: A Domain Decomposition-based CNN for High-Resolution Image Segmentation on Multiple GPUs [46.873264197900916]
A domain decomposition-based U-Net architecture is introduced, which partitions input images into non-overlapping patches.
A communication network is added to facilitate inter-patch information exchange to enhance the understanding of spatial context.
Results show that the approach achieves a 2-3% higher intersection over union (IoU) score compared to the same network without inter-patch communication.
arXiv Detail & Related papers (2024-07-31T01:07:21Z) - Learning cross space mapping via DNN using large scale click-through
logs [38.94796244812248]
The gap between low-level visual signals and high-level semantics has been progressively bridged by the continuous development of deep neural networks (DNNs).
We propose a unified DNN model for image-query similarity calculation by simultaneously modeling image and query in one network.
Both the qualitative results and quantitative results on an image retrieval evaluation task with 1000 queries demonstrate the superiority of the proposed method.
arXiv Detail & Related papers (2023-02-26T09:00:35Z) - MD-CSDNetwork: Multi-Domain Cross Stitched Network for Deepfake
Detection [80.83725644958633]
Current deepfake generation methods leave discriminative artifacts in the frequency spectrum of fake images and videos.
We present a novel approach, termed MD-CSDNetwork, that combines features in the spatial and frequency domains to mine a shared discriminative representation.
arXiv Detail & Related papers (2021-09-15T14:11:53Z) - Oriented R-CNN for Object Detection [61.78746189807462]
This work proposes an effective and simple oriented object detection framework, termed Oriented R-CNN.
In the first stage, we propose an oriented Region Proposal Network (oriented RPN) that directly generates high-quality oriented proposals in a nearly cost-free manner.
The second stage is oriented R-CNN head for refining oriented Regions of Interest (oriented RoIs) and recognizing them.
arXiv Detail & Related papers (2021-08-12T12:47:43Z) - Single Object Tracking through a Fast and Effective Single-Multiple
Model Convolutional Neural Network [0.0]
Recent state-of-the-art (SOTA) approaches rely on a matching network with a heavy structure to distinguish the target from other objects in the area.
In this article, a special architecture is proposed that, in contrast to previous approaches, makes it possible to identify the object location in a single shot.
The presented tracker performs comparably with the SOTA in challenging situations while being much faster (up to 120 FPS on a 1080 Ti).
arXiv Detail & Related papers (2021-03-28T11:02:14Z) - Voxel R-CNN: Towards High Performance Voxel-based 3D Object Detection [99.16162624992424]
We devise a simple but effective voxel-based framework, named Voxel R-CNN.
By taking full advantage of voxel features in a two stage approach, our method achieves comparable detection accuracy with state-of-the-art point-based models.
Our results show that Voxel R-CNN delivers a higher detection accuracy while maintaining a real-time frame processing rate, i.e., at a speed of 25 FPS on an NVIDIA 2080 Ti GPU.
arXiv Detail & Related papers (2020-12-31T17:02:46Z) - Improved Residual Networks for Image and Video Recognition [98.10703825716142]
Residual networks (ResNets) represent a powerful type of convolutional neural network (CNN) architecture.
We show consistent improvements in accuracy and learning convergence over the baseline.
Our proposed approach allows us to train extremely deep networks, while the baseline shows severe optimization issues.
arXiv Detail & Related papers (2020-04-10T11:09:50Z) - Real-Time High-Performance Semantic Image Segmentation of Urban Street
Scenes [98.65457534223539]
We propose a real-time high-performance DCNN-based method for robust semantic segmentation of urban street scenes.
The proposed method achieves the accuracy of 73.6% and 68.0% mean Intersection over Union (mIoU) with the inference speed of 51.0 fps and 39.3 fps.
arXiv Detail & Related papers (2020-03-11T08:45:53Z)