Related papers: Sparse R-CNN: End-to-End Object Detection with Learnable Proposals

URL: http://arxiv.org/abs/2011.12450v2
Date: Mon, 26 Apr 2021 14:20:03 GMT
Title: Sparse R-CNN: End-to-End Object Detection with Learnable Proposals
Authors: Peize Sun, Rufeng Zhang, Yi Jiang, Tao Kong, Chenfeng Xu, Wei Zhan, Masayoshi Tomizuka, Lei Li, Zehuan Yuan, Changhu Wang, Ping Luo
Abstract summary: We present Sparse R-CNN, a purely sparse method for object detection in images. Final predictions are directly output without non-maximum suppression post-procedure. We hope our work could inspire re-thinking the convention of dense prior in object detectors.
Score: 77.9701193170127
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We present Sparse R-CNN, a purely sparse method for object detection in images. Existing works on object detection heavily rely on dense object candidates, such as $k$ anchor boxes pre-defined on all grids of image feature map of size $H\times W$. In our method, however, a fixed sparse set of learned object proposals, total length of $N$, are provided to object recognition head to perform classification and location. By eliminating $HWk$ (up to hundreds of thousands) hand-designed object candidates to $N$ (e.g. 100) learnable proposals, Sparse R-CNN completely avoids all efforts related to object candidates design and many-to-one label assignment. More importantly, final predictions are directly output without non-maximum suppression post-procedure. Sparse R-CNN demonstrates accuracy, run-time and training convergence performance on par with the well-established detector baselines on the challenging COCO dataset, e.g., achieving 45.0 AP in standard $3\times$ training schedule and running at 22 fps using ResNet-50 FPN model. We hope our work could inspire re-thinking the convention of dense prior in object detectors. The code is available at: https://github.com/PeizeSun/SparseR-CNN.
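To make the abstract's comparison between $HWk$ dense candidates and $N$ learnable proposals concrete, the following is a minimal arithmetic sketch. The feature-map size (100 x 150) and the anchor count per cell (9) are hypothetical values chosen for illustration and are not taken from the paper; only N = 100 comes from the abstract's example. The function name `dense_candidate_count` is likewise an assumption of this sketch.

```python
def dense_candidate_count(height: int, width: int, anchors_per_cell: int) -> int:
    """Count anchor boxes when k anchors are placed on every cell of an H x W feature map."""
    return height * width * anchors_per_cell


# Hypothetical sizes for illustration only (not from the paper).
H, W, K = 100, 150, 9
# The abstract's example value for the number of learnable proposals.
N = 100

dense = dense_candidate_count(H, W, K)
ratio = dense / N

print(f"dense candidates HWk = {dense}")
print(f"learnable proposals N = {N}")
print(f"reduction factor = {ratio:.0f}x")
```

Under these assumed sizes, a single feature map already yields over a hundred thousand candidates, in line with the abstract's "up to hundreds of thousands" figure, while the sparse set remains fixed at N regardless of image resolution.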

Related papers

Strip R-CNN: Large Strip Convolution for Remote Sensing Object Detection [74.01846006894635]
This paper shows that large strip convolutions are good feature representation learners for remote sensing object detection. We build a new network architecture called Strip R-CNN, which is simple, efficient, and powerful.
arXiv Detail & Related papers (2025-01-07T13:30:54Z)
Any Target Can be Offense: Adversarial Example Generation via Generalized Latent Infection [83.72430401516674]
GAKer is able to construct adversarial examples to any target class. Our method achieves an approximately $14.13\%$ higher attack success rate for unknown classes.
arXiv Detail & Related papers (2024-07-17T03:24:09Z)
PG-RCNN: Semantic Surface Point Generation for 3D Object Detection [19.341260543105548]
Point Generation R-CNN (PG-RCNN) is a novel end-to-end detector for 3D object detection. Uses a jointly trained RoI point generation module to process contextual information of RoIs. For every generated point, PG-RCNN assigns a semantic feature that indicates the estimated foreground probability.
arXiv Detail & Related papers (2023-07-24T09:22:09Z)
Oriented R-CNN for Object Detection [61.78746189807462]
This work proposes an effective and simple oriented object detection framework, termed Oriented R-CNN. In the first stage, we propose an oriented Region Proposal Network (oriented RPN) that directly generates high-quality oriented proposals in a nearly cost-free manner. The second stage is oriented R-CNN head for refining oriented Regions of Interest (oriented RoIs) and recognizing them.
arXiv Detail & Related papers (2021-08-12T12:47:43Z)
Probabilistic Robustness Analysis for DNNs based on PAC Learning [14.558877524991752]
We view a DNN as a function $\boldsymbol{f}$ from inputs to outputs, and consider the local robustness property for a given input. We learn the score difference function $f_i-f_\ell$ with respect to the target label $\ell$ and attacking label $i$. Our framework can handle very large neural networks like ResNet152 with $6.5$M neurons, and often generates adversarial examples.
arXiv Detail & Related papers (2021-01-25T14:10:52Z)
OneNet: Towards End-to-End One-Stage Object Detection [39.445348555252785]
Existing one-stage object detectors assign labels by only location cost. Without classification cost, sole location cost leads to redundant boxes of high confidence scores in inference. To design an end-to-end one-stage object detector, we propose Minimum Cost Assignment. OneNet achieves 35.0 AP/80 FPS and 37.7 AP/50 FPS with image size of 512 pixels.
arXiv Detail & Related papers (2020-12-10T16:15:19Z)
Corner Proposal Network for Anchor-free, Two-stage Object Detection [174.59360147041673]
The goal of object detection is to determine the class and location of objects in an image. This paper proposes a novel anchor-free, two-stage framework which first extracts a number of object proposals. We demonstrate that these two stages are effective solutions for improving recall and precision.
arXiv Detail & Related papers (2020-07-27T19:04:57Z)
FCOS: A simple and strong anchor-free object detector [111.87691210818194]
We propose a fully convolutional one-stage object detector (FCOS) to solve object detection in a per-pixel prediction fashion. Almost all state-of-the-art object detectors such as RetinaNet, SSD, YOLOv3, and Faster R-CNN rely on pre-defined anchor boxes. In contrast, our proposed detector FCOS is anchor box free, as well as proposal free.
arXiv Detail & Related papers (2020-06-14T01:03:39Z)
Disp R-CNN: Stereo 3D Object Detection via Shape Prior Guided Instance Disparity Estimation [51.17232267143098]
We propose a novel system named Disp R-CNN for 3D object detection from stereo images. We use a statistical shape model to generate dense disparity pseudo-ground-truth without the need of LiDAR point clouds. Experiments on the KITTI dataset show that, even when LiDAR ground-truth is not available at training time, Disp R-CNN achieves competitive performance and outperforms previous state-of-the-art methods by 20% in terms of average precision.
arXiv Detail & Related papers (2020-04-07T17:48:45Z)

This list is automatically generated from the titles and abstracts of the papers on this site.