Related papers: R-FCN: Object Detection via Region-based Fully Convolutional Networks

URL: http://arxiv.org/abs/1605.06409v3
Date: Mon, 11 Dec 2023 13:28:51 GMT
Title: R-FCN: Object Detection via Region-based Fully Convolutional Networks
Authors: Jifeng Dai, Yi Li, Kaiming He, Jian Sun
Abstract summary: We present region-based, fully convolutional networks for accurate and efficient object detection. Our result is achieved at a test-time speed of 170ms per image, 2.5-20x faster than the Faster R-CNN counterpart.
Score: 87.62557357527861
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We present region-based, fully convolutional networks for accurate and efficient object detection. In contrast to previous region-based detectors such as Fast/Faster R-CNN that apply a costly per-region subnetwork hundreds of times, our region-based detector is fully convolutional with almost all computation shared on the entire image. To achieve this goal, we propose position-sensitive score maps to address a dilemma between translation-invariance in image classification and translation-variance in object detection. Our method can thus naturally adopt fully convolutional image classifier backbones, such as the latest Residual Networks (ResNets), for object detection. We show competitive results on the PASCAL VOC datasets (e.g., 83.6% mAP on the 2007 set) with the 101-layer ResNet. Meanwhile, our result is achieved at a test-time speed of 170ms per image, 2.5-20x faster than the Faster R-CNN counterpart. Code is made publicly available at: https://github.com/daijifeng001/r-fcn
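The position-sensitive score maps mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation (their code is at the URL above). It is a minimal pure-Python version of position-sensitive RoI pooling, assuming a k x k grid of bins per region, one score map per (bin, class) pair, average pooling inside each bin, and average voting across bins. The function name ps_roi_pool, the channel layout, and the RoI convention (half-open feature-map coordinates) are assumptions made for illustration.

```python
import math

def ps_roi_pool(score_maps, roi, k, num_classes):
    """Position-sensitive RoI pooling followed by average voting (sketch).

    score_maps: list of k*k*num_classes 2D maps (lists of rows).
        Assumed layout: channel = (bin_row * k + bin_col) * num_classes + cls.
    roi: (x0, y0, x1, y1) in feature-map coordinates, half-open, assumed to
        lie inside the maps and to span at least k cells in each dimension.
    Returns one score per class.
    """
    x0, y0, x1, y1 = roi
    bin_w = (x1 - x0) / k
    bin_h = (y1 - y0) / k
    votes = [0.0] * num_classes
    for i in range(k):        # bin row
        for j in range(k):    # bin column
            # Cells of the feature map covered by bin (i, j).
            ys = range(math.floor(y0 + i * bin_h), math.ceil(y0 + (i + 1) * bin_h))
            xs = range(math.floor(x0 + j * bin_w), math.ceil(x0 + (j + 1) * bin_w))
            for c in range(num_classes):
                # Each bin reads only from its own dedicated score map.
                fmap = score_maps[(i * k + j) * num_classes + c]
                total = sum(fmap[y][x] for y in ys for x in xs)
                votes[c] += total / (len(ys) * len(xs))
    # Average the k*k bin responses into a single score per class.
    return [v / (k * k) for v in votes]

# Toy input: 8 constant 4x4 maps (k=2, 2 classes); map `ch` is filled with ch.
maps = [[[float(ch)] * 4 for _ in range(4)] for ch in range(8)]
scores = ps_roi_pool(maps, (0, 0, 4, 4), k=2, num_classes=2)
print(scores)  # [3.0, 4.0]
```

Because the per-region step is only pooling and averaging, with no learned layers, all convolutional computation can be shared across the whole image, which is the property the abstract emphasizes.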

Related papers

Strip R-CNN: Large Strip Convolution for Remote Sensing Object Detection [74.01846006894635]
This paper shows that large strip convolutions are good feature representation learners for remote sensing object detection. We build a new network architecture called Strip R-CNN, which is simple, efficient, and powerful.
arXiv Detail & Related papers (2025-01-07T13:30:54Z)
DDU-Net: A Domain Decomposition-based CNN for High-Resolution Image Segmentation on Multiple GPUs [46.873264197900916]
A domain decomposition-based U-Net architecture is introduced, which partitions input images into non-overlapping patches. A communication network is added to facilitate inter-patch information exchange to enhance the understanding of spatial context. Results show that the approach achieves a 2-3% higher intersection over union (IoU) score compared to the same network without inter-patch communication.
arXiv Detail & Related papers (2024-07-31T01:07:21Z)
Learning cross space mapping via DNN using large scale click-through logs [38.94796244812248]
The gap between low-level visual signals and high-level semantics has been progressively bridged by the continuous development of deep neural networks (DNNs). We propose a unified DNN model for image-query similarity calculation by simultaneously modeling image and query in one network. Both the qualitative and quantitative results on an image retrieval evaluation task with 1000 queries demonstrate the superiority of the proposed method.
arXiv Detail & Related papers (2023-02-26T09:00:35Z)
MD-CSDNetwork: Multi-Domain Cross Stitched Network for Deepfake Detection [80.83725644958633]
Current deepfake generation methods leave discriminative artifacts in the frequency spectrum of fake images and videos. We present a novel approach, termed MD-CSDNetwork, for combining the features in the spatial and frequency domains to mine a shared discriminative representation.
arXiv Detail & Related papers (2021-09-15T14:11:53Z)
Oriented R-CNN for Object Detection [61.78746189807462]
This work proposes an effective and simple oriented object detection framework, termed Oriented R-CNN. In the first stage, we propose an oriented Region Proposal Network (oriented RPN) that directly generates high-quality oriented proposals in a nearly cost-free manner. The second stage is an oriented R-CNN head for refining oriented Regions of Interest (oriented RoIs) and recognizing them.
arXiv Detail & Related papers (2021-08-12T12:47:43Z)
Single Object Tracking through a Fast and Effective Single-Multiple Model Convolutional Neural Network [0.0]
Recent state-of-the-art (SOTA) approaches rely on a matching network with a heavy structure to distinguish the target from other objects in the area. In this article, a special architecture is proposed that, in contrast to previous approaches, makes it possible to identify the object location in a single shot. The presented tracker performs comparably with the SOTA in challenging situations while being much faster (up to 120 FPS on a 1080 Ti).
arXiv Detail & Related papers (2021-03-28T11:02:14Z)
Voxel R-CNN: Towards High Performance Voxel-based 3D Object Detection [99.16162624992424]
We devise a simple but effective voxel-based framework, named Voxel R-CNN. By taking full advantage of voxel features in a two-stage approach, our method achieves comparable detection accuracy with state-of-the-art point-based models. Our results show that Voxel R-CNN delivers a higher detection accuracy while maintaining a real-time frame processing rate, i.e., at a speed of 25 FPS on an NVIDIA 2080 Ti GPU.
arXiv Detail & Related papers (2020-12-31T17:02:46Z)
Improved Residual Networks for Image and Video Recognition [98.10703825716142]
Residual networks (ResNets) represent a powerful type of convolutional neural network (CNN) architecture. We show consistent improvements in accuracy and learning convergence over the baseline. Our proposed approach allows us to train extremely deep networks, while the baseline shows severe optimization issues.
arXiv Detail & Related papers (2020-04-10T11:09:50Z)
Real-Time High-Performance Semantic Image Segmentation of Urban Street Scenes [98.65457534223539]
We propose a real-time high-performance DCNN-based method for robust semantic segmentation of urban street scenes. The proposed method achieves an accuracy of 73.6% and 68.0% mean Intersection over Union (mIoU) at inference speeds of 51.0 fps and 39.3 fps, respectively.
arXiv Detail & Related papers (2020-03-11T08:45:53Z)

This list is automatically generated from the titles and abstracts of the papers on this site.