Instance-aware, Context-focused, and Memory-efficient Weakly Supervised
Object Detection
- URL: http://arxiv.org/abs/2004.04725v3
- Date: Wed, 21 Oct 2020 07:32:04 GMT
- Title: Instance-aware, Context-focused, and Memory-efficient Weakly Supervised
Object Detection
- Authors: Zhongzheng Ren, Zhiding Yu, Xiaodong Yang, Ming-Yu Liu, Yong Jae Lee,
Alexander G. Schwing, Jan Kautz
- Abstract summary: We develop an instance-aware and context-focused unified framework for weakly supervised learning.
It employs an instance-aware self-training algorithm and a learnable Concrete DropBlock while devising a memory-efficient sequential batch back-propagation.
Our proposed method achieves state-of-the-art results on COCO ($12.1\% ~AP$, $24.8\% ~AP_{50}$), VOC 2007 ($54.9\% ~AP$), and VOC 2012 ($52.1\% ~AP$).
- Score: 184.563345153682
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Weakly supervised learning has emerged as a compelling tool for object
detection by reducing the need for strong supervision during training. However,
major challenges remain: (1) differentiation of object instances can be
ambiguous; (2) detectors tend to focus on discriminative parts rather than
entire objects; (3) without ground truth, object proposals have to be redundant
for high recalls, causing significant memory consumption. Addressing these
challenges is difficult, as it often requires eliminating uncertainties and
trivial solutions. To target these issues we develop an instance-aware and
context-focused unified framework. It employs an instance-aware self-training
algorithm and a learnable Concrete DropBlock while devising a memory-efficient
sequential batch back-propagation. Our proposed method achieves
state-of-the-art results on COCO ($12.1\% ~AP$, $24.8\% ~AP_{50}$), VOC 2007
($54.9\% ~AP$), and VOC 2012 ($52.1\% ~AP$), improving baselines by great
margins. In addition, the proposed method is the first to benchmark ResNet
based models and weakly supervised video object detection. Code, models, and
more details will be made available at: https://github.com/NVlabs/wetectron.
Related papers
- Few-shot Oriented Object Detection with Memorable Contrastive Learning in Remote Sensing Images [11.217630579076237]
Few-shot object detection (FSOD) has garnered significant research attention in the field of remote sensing.
We propose a novel FSOD method for remote sensing images called Few-shot Oriented object detection with Memorable Contrastive learning (FOMC).
Specifically, we employ oriented bounding boxes instead of traditional horizontal bounding boxes to learn a better feature representation for arbitrary-oriented aerial objects.
arXiv Detail & Related papers (2024-03-20T08:15:18Z)
- Towards End-to-End Unsupervised Saliency Detection with Self-Supervised Top-Down Context [25.85453873366275]
We propose a self-supervised end-to-end salient object detection framework via top-down context.
We exploit the self-localization from the deepest feature to construct the location maps which are then leveraged to learn the most instructive segmentation guidance.
Our method achieves leading performance among the recent end-to-end methods and most of the multi-stage solutions.
arXiv Detail & Related papers (2023-10-14T08:43:22Z)
- Object-Centric Multiple Object Tracking [124.30650395969126]
This paper proposes a video object-centric model for multiple-object tracking pipelines.
It consists of an index-merge module that adapts the object-centric slots into detection outputs and an object memory module.
Benefiting from object-centric learning, we only require sparse detection labels for object localization and feature binding.
arXiv Detail & Related papers (2023-09-01T03:34:12Z)
- Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z)
- USD: Unknown Sensitive Detector Empowered by Decoupled Objectness and Segment Anything Model [14.080744645704751]
Open World Object Detection (OWOD) is a novel and challenging computer vision task.
We propose a simple yet effective learning strategy, namely Decoupled Objectness Learning (DOL), which divides the learning of these two boundaries into decoder layers.
We also introduce an Auxiliary Supervision Framework (ASF) that uses pseudo-labeling and soft-weighting strategies to alleviate the negative impact of noise.
arXiv Detail & Related papers (2023-06-04T06:42:09Z)
- Discovery-and-Selection: Towards Optimal Multiple Instance Learning for Weakly Supervised Object Detection [86.86602297364826]
We propose a discovery-and-selection approach fused with multiple instance learning (DS-MIL).
Our proposed DS-MIL approach can consistently improve the baselines, reporting state-of-the-art performance.
arXiv Detail & Related papers (2021-10-18T07:06:57Z)
- Slender Object Detection: Diagnoses and Improvements [74.40792217534]
In this paper, we are concerned with the detection of a particular type of object with extreme aspect ratios, namely slender objects.
For a classical object detection method, a drastic drop of $18.9\%$ mAP on COCO is observed if it is evaluated solely on slender objects.
arXiv Detail & Related papers (2020-11-17T09:39:42Z)
- FCOS: A simple and strong anchor-free object detector [111.87691210818194]
We propose a fully convolutional one-stage object detector (FCOS) to solve object detection in a per-pixel prediction fashion.
Almost all state-of-the-art object detectors such as RetinaNet, SSD, YOLOv3, and Faster R-CNN rely on pre-defined anchor boxes.
In contrast, our proposed detector FCOS is anchor box free, as well as proposal free.
arXiv Detail & Related papers (2020-06-14T01:03:39Z)
- Object Instance Mining for Weakly Supervised Object Detection [24.021995037282394]
This paper introduces an end-to-end object instance mining (OIM) framework for weakly supervised object detection.
OIM attempts to detect all possible object instances existing in each image by introducing information propagation on the spatial and appearance graphs.
During the iterative learning process, the less discriminative object instances from the same class can be gradually detected and utilized for training.
arXiv Detail & Related papers (2020-02-04T02:11:39Z)