1st Place Solutions for OpenImage2019 -- Object Detection and Instance
Segmentation
- URL: http://arxiv.org/abs/2003.07557v1
- Date: Tue, 17 Mar 2020 06:45:07 GMT
- Title: 1st Place Solutions for OpenImage2019 -- Object Detection and Instance
Segmentation
- Authors: Yu Liu, Guanglu Song, Yuhang Zang, Yan Gao, Enze Xie, Junjie Yan, Chen
Change Loy, Xiaogang Wang
- Abstract summary: This article introduces the solutions of the two champion teams, 'MMfruit' for the detection track and 'MMfruitSeg' for the segmentation track, in OpenImage Challenge 2019.
It is commonly known that for an object detector, the shared feature at the end of the backbone is not appropriate for both classification and regression.
We propose the Decoupling Head (DH) to disentangle the object classification and regression via the self-learned optimal feature extraction.
- Score: 116.25081559037872
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This article introduces the solutions of the two champion teams, 'MMfruit'
for the detection track and 'MMfruitSeg' for the segmentation track, in the
OpenImage Challenge 2019. It is commonly known that for an object detector, the
shared feature at the end of the backbone is not appropriate for both
classification and regression, which greatly limits the performance of both
single-stage detectors and Faster R-CNN (Ren et al., 2015) based detectors. In
this competition, we observe that even with a shared feature, different
locations in one object have completely inconsistent performance for the two
tasks. For example, the features of salient locations are usually good for
classification, while those around the object edge are good for regression.
Inspired by this, we propose the Decoupling Head (DH) to disentangle object
classification and regression via self-learned optimal feature extraction,
which leads to a great improvement. Furthermore, we adjust the soft-NMS
algorithm to adj-NMS to obtain a stable performance improvement. Finally, we
propose a well-designed ensemble strategy that votes on the bounding box
location and confidence. We also introduce several training/inference
strategies and a bag of tricks that give minor improvements. With all of these
details, we train and aggregate 28 global models with various backbones and
heads, together with 3+2 expert models, and achieve 1st place in the OpenImage
2019 Object Detection Challenge on both the public and private leaderboards.
Given such good instance bounding boxes, we further design a simple
instance-level semantic segmentation pipeline and achieve 1st place in the
segmentation challenge.
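The abstract says adj-NMS is an adjustment of soft-NMS but does not describe the adjustment itself. As background only, the following is a minimal sketch of the standard Gaussian soft-NMS baseline, not of the authors' adj-NMS. The function names, the `(x1, y1, x2, y2)` box format, and the default `sigma` and `score_thresh` values are assumptions made for illustration.

```python
import math


def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Standard Gaussian soft-NMS (illustrative baseline, not adj-NMS).

    Instead of discarding boxes that overlap a higher-scoring box, their
    scores are decayed by exp(-IoU^2 / sigma). Returns a list of
    (box_index, rescored_confidence) in selection order.
    """
    remaining = list(enumerate(scores))
    kept = []
    while remaining:
        # Select the box with the highest current score.
        best = max(range(len(remaining)), key=lambda k: remaining[k][1])
        best_idx, best_score = remaining.pop(best)
        kept.append((best_idx, best_score))
        # Decay the scores of the other boxes by their overlap with it.
        decayed = []
        for idx, score in remaining:
            overlap = iou(boxes[best_idx], boxes[idx])
            score *= math.exp(-(overlap * overlap) / sigma)
            if score >= score_thresh:  # drop boxes whose score has vanished
                decayed.append((idx, score))
        remaining = decayed
    return kept


if __name__ == "__main__":
    demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    demo_scores = [0.9, 0.8, 0.7]
    for idx, score in soft_nms(demo_boxes, demo_scores):
        print(idx, round(score, 3))
```

In the demo, the second box overlaps the first heavily, so its score is reduced and it is ranked below the third, non-overlapping box rather than being removed outright.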
Related papers
- Instance Segmentation under Occlusions via Location-aware Copy-Paste
Data Augmentation [8.335108002480068]
MMSports 2023 DeepSportRadar has introduced a dataset that focuses on segmenting human subjects within a basketball context.
This challenge demands the application of robust data augmentation techniques and wisely-chosen deep learning architectures.
Our work (ranked 1st in the competition) first proposes a novel data augmentation technique, capable of generating more training samples with wider distribution.
arXiv Detail & Related papers (2023-10-27T07:44:25Z)
- 1st Place Solution of The Robust Vision Challenge (RVC) 2022 Semantic
Segmentation Track [67.56316745239629]
This report describes the winning solution to the semantic segmentation task of the Robust Vision Challenge at ECCV 2022.
Our method adopts the FAN-B-Hybrid model as the encoder and uses Segformer as the segmentation framework.
The proposed method could serve as a strong baseline for the multi-domain segmentation task and benefit future works.
arXiv Detail & Related papers (2022-10-23T20:52:22Z)
- Group R-CNN for Weakly Semi-supervised Object Detection with Points [18.720915213798623]
We propose an effective point-to-box regressor: Group R-CNN.
Group R-CNN first uses instance-level proposal grouping to generate a group of proposals for each point annotation.
We show that Group R-CNN significantly outperforms the prior method Point DETR by 3.9 mAP with 5% well-labeled images.
arXiv Detail & Related papers (2022-05-12T07:17:54Z)
- Target-Aware Object Discovery and Association for Unsupervised Video
Multi-Object Segmentation [79.6596425920849]
This paper addresses the task of unsupervised video multi-object segmentation.
We introduce a novel approach for more accurate and efficient unseen-temporal segmentation.
We evaluate the proposed approach on DAVIS$_{17}$ and YouTube-VIS, and the results demonstrate that it outperforms state-of-the-art methods both in segmentation accuracy and inference speed.
arXiv Detail & Related papers (2021-04-10T14:39:44Z)
- Labels4Free: Unsupervised Segmentation using StyleGAN [40.39780497423365]
We propose an unsupervised segmentation framework for StyleGAN generated objects.
We report comparable results against state-of-the-art supervised segmentation networks.
arXiv Detail & Related papers (2021-03-27T18:59:22Z)
- CompFeat: Comprehensive Feature Aggregation for Video Instance
Segmentation [67.17625278621134]
Video instance segmentation is a complex task in which we need to detect, segment, and track each object for any given video.
Previous approaches only utilize single-frame features for the detection, segmentation, and tracking of objects.
We propose a novel comprehensive feature aggregation approach (CompFeat) to refine features at both frame-level and object-level with temporal and spatial context information.
arXiv Detail & Related papers (2020-12-07T00:31:42Z)
- Inter-Image Communication for Weakly Supervised Localization [77.2171924626778]
Weakly supervised localization aims at finding target object regions using only image-level supervision.
We propose to leverage pixel-level similarities across different objects for learning more accurate object locations.
Our method achieves the Top-1 localization error rate of 45.17% on the ILSVRC validation set.
arXiv Detail & Related papers (2020-08-12T04:14:11Z)
- Deep Variational Instance Segmentation [7.334808870313923]
State-of-the-art algorithms often employ two separate stages, the first one generating object proposals and the second one recognizing and refining the boundaries.
We propose a novel algorithm that directly utilizes a fully convolutional network (FCN) to predict instance labels.
arXiv Detail & Related papers (2020-07-22T17:57:49Z)
- Improving Semantic Segmentation via Decoupled Body and Edge Supervision [89.57847958016981]
Existing semantic segmentation approaches either aim to improve the object's inner consistency by modeling the global context, or refine object detail along boundaries by multi-scale feature fusion.
In this paper, a new paradigm for semantic segmentation is proposed.
Our insight is that appealing performance of semantic segmentation requires explicitly modeling the object body and edge, which correspond to the low and high frequency of the image.
We show that the proposed framework with various baselines or backbone networks leads to better object inner consistency and object boundaries.
arXiv Detail & Related papers (2020-07-20T12:11:22Z)
- Revisiting the Sibling Head in Object Detector [24.784483589579896]
This paper provides the observation that the spatial misalignment between the two object functions in the sibling head can considerably hurt the training process.
Considering the classification and regression, TSD decouples them from the spatial dimension by generating two disentangled proposals for them.
Surprisingly, this simple design can boost all backbones and models on both MS COCO and Google OpenImage consistently by 3% mAP.
arXiv Detail & Related papers (2020-03-17T05:21:54Z)