Revisiting Proposal-based Object Detection
- URL: http://arxiv.org/abs/2311.18512v1
- Date: Thu, 30 Nov 2023 12:40:23 GMT
- Title: Revisiting Proposal-based Object Detection
- Authors: Aritra Bhowmik, Martin R. Oswald, Pascal Mettes, Cees G. M. Snoek
- Abstract summary: We revisit the pipeline for detecting objects in images with proposals.
For proposal regression, we solve a simpler problem: we regress to the area of intersection between proposal and ground truth.
Our revisited approach comes with minimal changes to the detection pipeline and can be plugged into any existing method.
- Score: 59.97295544455179
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper revisits the pipeline for detecting objects in images with
proposals. For any object detector, the obtained box proposals or queries need
to be classified and regressed towards ground truth boxes. The common solution
for the final predictions is to directly maximize the overlap between each
proposal and the ground truth box, followed by a winner-takes-all ranking or
non-maximum suppression. In this work, we propose a simple yet effective
alternative. For proposal regression, we solve a simpler problem where we
regress to the area of intersection between proposal and ground truth. In this
way, each proposal only specifies which part contains the object, avoiding a
blind inpainting problem where proposals need to be regressed beyond their
visual scope. In turn, we replace the winner-takes-all strategy and obtain the
final prediction by taking the union over the regressed intersections of a
proposal group surrounding an object. Our revisited approach comes with minimal
changes to the detection pipeline and can be plugged into any existing method.
We show that our approach directly improves canonical object detection and
instance segmentation architectures, highlighting the utility of
intersection-based regression and grouping.
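The two steps described in the abstract (regressing each proposal to its intersection with the ground truth, then taking the union over a group of regressed intersections) can be illustrated with a minimal sketch. This is not the authors' implementation: the helper names `intersection` and `union_box`, the `(x1, y1, x2, y2)` box convention, and the choice to represent the union as the tightest axis-aligned box enclosing all intersections are assumptions made for illustration, and the example uses exact intersections in place of a network's regressed outputs.

```python
from typing import List, Optional, Tuple

# Assumed box convention: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def intersection(proposal: Box, gt: Box) -> Optional[Box]:
    """Return the overlap region of a proposal and a ground-truth box.

    This is the regression target in the intersection-based view: the part
    of the proposal that actually contains the object. Returns None when
    the boxes do not overlap.
    """
    x1, y1 = max(proposal[0], gt[0]), max(proposal[1], gt[1])
    x2, y2 = min(proposal[2], gt[2]), min(proposal[3], gt[3])
    if x2 <= x1 or y2 <= y1:
        return None
    return (x1, y1, x2, y2)


def union_box(boxes: List[Box]) -> Box:
    """Tightest axis-aligned box enclosing every box in the group.

    Assumption: the 'union' of a proposal group is taken as this
    enclosing box.
    """
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )


# Hypothetical example: three proposals that each see only part of the object.
gt = (10, 10, 50, 50)
proposals = [(0, 0, 30, 30), (25, 5, 60, 40), (20, 30, 55, 70)]

targets = [t for p in proposals if (t := intersection(p, gt)) is not None]
final_box = union_box(targets)
print(targets)
print(final_box)
```

In this toy setting no single proposal covers the object, yet the union of the three intersections spans the full ground-truth box, which is the intuition behind replacing a winner-takes-all selection with grouping.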
Related papers
- Generalize or Detect? Towards Robust Semantic Segmentation Under Multiple Distribution Shifts [56.57141696245328]
In open-world scenarios, where both novel classes and domains may exist, an ideal segmentation model should detect anomaly classes for safety.
Existing methods often struggle to distinguish between domain-level and semantic-level distribution shifts.
arXiv Detail & Related papers (2024-11-06T11:03:02Z)
- Improving Single Domain-Generalized Object Detection: A Focus on Diversification and Alignment [17.485775402656127]
A base detector can outperform existing methods for single domain generalization by a good margin.
We introduce a method to align detections from multiple views, considering both classification and localization outputs.
Our approach is detector-agnostic and can be seamlessly applied to both single-stage and two-stage detectors.
arXiv Detail & Related papers (2024-05-23T12:29:25Z)
- FindIt: Generalized Localization with Natural Language Queries [43.07139534653485]
FindIt is a simple and versatile framework that unifies a variety of visual grounding and localization tasks.
Key to our architecture is an efficient multi-scale fusion module that unifies the disparate localization requirements.
Our end-to-end trainable framework responds flexibly and accurately to a wide range of referring expression, localization or detection queries.
arXiv Detail & Related papers (2022-03-31T17:59:30Z)
- Learning Open-World Object Proposals without Learning to Classify [110.30191531975804]
We propose a classification-free Object Localization Network (OLN) which estimates the objectness of each region purely by how well the location and shape of a region overlaps with any ground-truth object.
This simple strategy learns generalizable objectness and outperforms existing proposals on cross-category generalization.
arXiv Detail & Related papers (2021-08-15T14:36:02Z)
- Mixup-CAM: Weakly-supervised Semantic Segmentation via Uncertainty Regularization [73.03956876752868]
We propose a principled and end-to-end trainable framework to allow the network to pay attention to other parts of the object.
Specifically, we introduce the mixup data augmentation scheme into the classification network and design two uncertainty regularization terms to better interact with the mixup strategy.
arXiv Detail & Related papers (2020-08-03T21:19:08Z)
- Novel Human-Object Interaction Detection via Adversarial Domain Generalization [103.55143362926388]
We study the problem of novel human-object interaction (HOI) detection, aiming at improving the generalization ability of the model to unseen scenarios.
The challenge mainly stems from the large compositional space of objects and predicates, which leads to the lack of sufficient training data for all the object-predicate combinations.
We propose a unified framework of adversarial domain generalization to learn object-invariant features for predicate prediction.
arXiv Detail & Related papers (2020-05-22T22:02:56Z)
- 1st Place Solutions for OpenImage2019 -- Object Detection and Instance Segmentation [116.25081559037872]
This article introduces the solutions of the two champion teams, 'MMfruit' for the detection track and 'MMfruitSeg' for the segmentation track, in OpenImage Challenge 2019.
It is commonly known that for an object detector, the shared feature at the end of the backbone is not appropriate for both classification and regression.
We propose the Decoupling Head (DH) to disentangle the object classification and regression via the self-learned optimal feature extraction.
arXiv Detail & Related papers (2020-03-17T06:45:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.