RecursiveDet: End-to-End Region-based Recursive Object Detection
- URL: http://arxiv.org/abs/2307.13619v1
- Date: Tue, 25 Jul 2023 16:22:58 GMT
- Title: RecursiveDet: End-to-End Region-based Recursive Object Detection
- Authors: Jing Zhao, Li Sun, Qingli Li
- Abstract summary: Region-based object detectors like Sparse R-CNN usually have multiple cascade bounding box decoding stages.
In this paper, we find the general setting of decoding stages is actually redundant.
RecursiveDet achieves clear performance gains with even fewer model parameters.
- Score: 19.799892459080485
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: End-to-end region-based object detectors like Sparse R-CNN usually have
multiple cascade bounding box decoding stages, which refine the current
predictions according to their previous results. Model parameters within each
stage are independent, incurring a huge cost. In this paper, we find the general
setting of decoding stages is actually redundant. By simply sharing parameters
and making a recursive decoder, the detector already obtains a significant
improvement. The recursive decoder can be further enhanced by positional
encoding (PE) of the proposal box, which makes it aware of the exact locations
and sizes of input bounding boxes, thus becoming adaptive to proposals from
different stages during the recursion. Moreover, we also design
centerness-based PE to distinguish the RoI feature element and dynamic
convolution kernels at different positions within the bounding box. To validate
the effectiveness of the proposed method, we conduct intensive ablations and
build the full model on three recent mainstream region-based detectors. The
RecursiveDet is able to achieve obvious performance boosts with even fewer model
parameters and slightly increased computation cost. Codes are available at
https://github.com/bravezzzzzz/RecursiveDet.
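The parameter-sharing idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the six-stage setting, the linear "stage", the sinusoidal `box_pe` encoding, and all function names are assumptions made for illustration only. It contrasts a cascade of independent stages with a recursive decoder that reuses one stage, and feeds each step an encoding of the current box so the shared stage can see the location and size of its input.

```python
import math
import random


def box_pe(box, num_freqs=4):
    """Hypothetical sinusoidal encoding of a (cx, cy, w, h) box in [0, 1]."""
    feats = []
    for v in box:
        for k in range(num_freqs):
            angle = v * math.pi * (2 ** k)
            feats.extend((math.sin(angle), math.cos(angle)))
    return feats  # length 4 * 2 * num_freqs


def make_stage(in_dim, rng):
    """Toy stage parameters: a 4 x in_dim weight matrix (assumption)."""
    return [[rng.uniform(-0.01, 0.01) for _ in range(in_dim)] for _ in range(4)]


def refine(box, weights):
    """One decoding step: predict a small offset from the box encoding."""
    feats = box_pe(box)
    deltas = [sum(w * f for w, f in zip(row, feats)) for row in weights]
    return tuple(b + d for b, d in zip(box, deltas))


def count_params(stages):
    """Count distinct parameters; a stage object reused many times counts once."""
    unique = {id(s): s for s in stages}
    return sum(len(row) for s in unique.values() for row in s)


rng = random.Random(0)
num_stages = 6          # assumed number of decoding stages
in_dim = 4 * 2 * 4      # matches box_pe with num_freqs=4

# Cascade decoder: every stage owns its parameters.
cascade = [make_stage(in_dim, rng) for _ in range(num_stages)]

# Recursive decoder: one stage applied repeatedly.
shared = make_stage(in_dim, rng)
recursive = [shared] * num_stages

box = (0.5, 0.5, 0.2, 0.3)
for weights in recursive:
    box = refine(box, weights)  # each step refines the previous prediction

print(count_params(cascade), count_params(recursive))
```

In this toy setting the recursive decoder runs the same number of refinement steps as the cascade while holding the parameters of a single stage. The real detector operates on RoI features with dynamic convolution, which this sketch does not model.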
Related papers
- CPR++: Object Localization via Single Coarse Point Supervision [55.8671776333499]
Coarse point refinement (CPR) is the first attempt to alleviate semantic variance from an algorithmic perspective.
CPR reduces semantic variance by selecting a semantic centre point in a neighbourhood region to replace the initial annotated point.
CPR++ can obtain scale information and further reduce the semantic variance in a global region.
arXiv Detail & Related papers (2024-01-30T17:38:48Z)
- CloudAttention: Efficient Multi-Scale Attention Scheme For 3D Point Cloud Learning [81.85951026033787]
We set transformers in this work and incorporate them into a hierarchical framework for shape classification and part and scene segmentation.
We also compute efficient and dynamic global cross attentions by leveraging sampling and grouping at each iteration.
The proposed hierarchical model achieves state-of-the-art shape classification in mean accuracy and yields results on par with the previous segmentation methods.
arXiv Detail & Related papers (2022-07-31T21:39:15Z)
- One-Pass Learning via Bridging Orthogonal Gradient Descent and Recursive Least-Squares [8.443742714362521]
We develop an algorithm for one-pass learning which seeks to perfectly fit every new datapoint while changing the parameters in a direction that causes the least change to the predictions on previous datapoints.
Our algorithm uses memory efficiently by exploiting the structure of the streaming data via incremental principal component analysis (IPCA).
Our experiments show the effectiveness of the proposed method compared to the baselines.
arXiv Detail & Related papers (2022-07-28T02:01:31Z)
- Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation so can be integrated with neural networks seamlessly.
arXiv Detail & Related papers (2021-12-07T11:26:41Z)
- Adaptive Recursive Circle Framework for Fine-grained Action Recognition [95.51097674917851]
How to model fine-grained spatial-temporal dynamics in videos has been a challenging problem for action recognition.
Most existing methods generate features of a layer in a pure feedforward manner.
We propose an Adaptive Recursive Circle framework, a fine-grained decorator for pure feedforward layers.
arXiv Detail & Related papers (2021-07-25T14:24:29Z)
- Omni-supervised Point Cloud Segmentation via Gradual Receptive Field Component Reasoning [41.83979510282989]
We bring the first omni-scale supervision method to point cloud segmentation via the proposed gradual Receptive Field Component Reasoning (RFCR).
Our method achieves new state-of-the-art performance on S3DIS and Semantic3D and ranks first on the ScanNet benchmark among all point-based methods.
arXiv Detail & Related papers (2021-05-21T08:32:02Z)
- Recursively Refined R-CNN: Instance Segmentation with Self-RoI Rebalancing [2.4634850020708616]
We propose Recursively Refined R-CNN ($R^3$-CNN) which avoids duplicates by introducing a loop mechanism instead.
Our experiments highlight the specific encoding of the loop mechanism in the weights, requiring its usage at inference time.
The architecture is able to surpass the recently proposed HTC model, while reducing the number of parameters significantly.
arXiv Detail & Related papers (2021-04-03T07:25:33Z)
- DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution [136.7261709896713]
We propose a data-driven approach that generates the appropriate convolution kernels to apply in response to the nature of the instances.
The proposed method achieves promising results on both ScanNetV2 and S3DIS.
It also improves inference speed by more than 25% over the current state-of-the-art.
arXiv Detail & Related papers (2020-11-26T14:56:57Z)
- Fibonacci and k-Subsecting Recursive Feature Elimination [2.741266294612776]
Feature selection is a data mining task with the potential of speeding up classification algorithms.
We propose two novel algorithms called Fibonacci- and k-Subsecting Recursive Feature Elimination.
Results show that Fibonacci and k-Subsecting Recursive Feature Elimination are capable of selecting a smaller subset of features much faster than standard RFE.
arXiv Detail & Related papers (2020-07-29T15:53:04Z)
- End-to-End Object Detection with Transformers [88.06357745922716]
We present a new method that views object detection as a direct set prediction problem.
Our approach streamlines the detection pipeline, effectively removing the need for many hand-designed components.
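The main ingredients of the new framework, called DEtection TRansformer or DETR, are a set-based global loss that forces unique predictions via bipartite matching, and a transformer encoder-decoder architecture.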
arXiv Detail & Related papers (2020-05-26T17:06:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.