Transformer-based Multi-Instance Learning for Weakly Supervised Object
Detection
- URL: http://arxiv.org/abs/2303.14999v1
- Date: Mon, 27 Mar 2023 08:42:45 GMT
- Authors: Zhaofei Wang, Weijia Zhang, Min-Ling Zhang
- Abstract summary: Weakly Supervised Object Detection (WSOD) enables the training of object detection models using only image-level annotations.
We propose a novel backbone for WSOD based on our tailored Vision Transformer, named Weakly Supervised Transformer Detection Network (WSTDN).
- Score: 43.481591776038144
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Weakly Supervised Object Detection (WSOD) enables the training of object
detection models using only image-level annotations. State-of-the-art WSOD
detectors commonly rely on multi-instance learning (MIL) as the backbone of
their detectors and assume that the bounding box proposals of an image are
independent of each other. However, since such approaches only utilize the
highest-scoring proposal and discard the potentially useful information from
other proposals, their independent MIL backbone often limits models to salient
parts of an object or causes them to detect only one object per class. To solve
the above problems, we propose a novel backbone for WSOD based on our tailored
Vision Transformer named Weakly Supervised Transformer Detection Network
(WSTDN). Our algorithm is the first to demonstrate that self-attention
modules that consider inter-instance relationships are effective backbones for
WSOD. We also introduce a novel bounding box mining method (BBM) integrated
with a memory transfer refinement (MTR) procedure that uses instance
dependencies to facilitate instance refinement. Experimental results on
PASCAL VOC2007 and VOC2012 benchmarks demonstrate the effectiveness of our
proposed WSTDN and modified instance refinement modules.
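To make the contrast in the abstract concrete, here is a minimal sketch of MIL-style scoring over proposals, with an optional self-attention step that lets proposals influence one another. It is not the WSTDN implementation. The two-stream aggregation follows the commonly used WSDDN-style formulation, which the abstract does not specify, and every name, shape, and value below (`self_attention`, `mil_image_scores`, the toy features) is hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features):
    """Single-head scaled dot-product self-attention with identity projections.

    Each output row is a convex combination of all proposal features, so a
    proposal's representation depends on the other proposals in the image.
    (Learned query/key/value projections are omitted to keep the sketch short.)
    """
    dim = len(features[0])
    mixed = []
    for query in features:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in features
        ]
        weights = softmax(scores)
        mixed.append(
            [sum(w * feat[d] for w, feat in zip(weights, features)) for d in range(dim)]
        )
    return mixed


def mil_image_scores(cls_logits, det_logits):
    """Two-stream MIL aggregation (WSDDN-style; an assumption, not from the abstract).

    cls_logits, det_logits: nested lists shaped [num_proposals][num_classes].
    Returns (per-proposal scores, image-level class scores).
    """
    num_props = len(cls_logits)
    num_classes = len(cls_logits[0])
    # Classification stream: softmax over classes for each proposal.
    cls_probs = [softmax(row) for row in cls_logits]
    # Detection stream: softmax over proposals for each class.
    det_probs = [
        softmax([det_logits[r][c] for r in range(num_props)])
        for c in range(num_classes)
    ]
    proposal_scores = [
        [cls_probs[r][c] * det_probs[c][r] for c in range(num_classes)]
        for r in range(num_props)
    ]
    # Summing over proposals yields an image-level score that can be trained
    # against image-level labels only.
    image_scores = [
        sum(proposal_scores[r][c] for r in range(num_props))
        for c in range(num_classes)
    ]
    return proposal_scores, image_scores


# Toy example: 3 proposals with 2-dimensional features, reused directly as
# 2-class logits for both streams (purely illustrative).
toy_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
independent_scores, independent_image = mil_image_scores(toy_features, toy_features)
attended = self_attention(toy_features)
dependent_scores, dependent_image = mil_image_scores(attended, attended)
```

In the independent variant each proposal is scored from its own features alone. In the attended variant the logits already mix information across proposals, which is the kind of inter-instance dependency the abstract argues for.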
Related papers
- Sparse Semi-DETR: Sparse Learnable Queries for Semi-Supervised Object Detection [12.417754433715903]
We introduce Sparse Semi-DETR, a novel transformer-based, end-to-end semi-supervised object detection solution.
Sparse Semi-DETR incorporates a Query Refinement Module to enhance the quality of object queries, significantly improving detection capabilities for small and partially obscured objects.
On the MS-COCO and Pascal VOC object detection benchmarks, Sparse Semi-DETR achieves a significant improvement over current state-of-the-art methods.
arXiv Detail & Related papers (2024-04-02T10:22:23Z)
- Cross-Domain Few-Shot Object Detection via Enhanced Open-Set Object Detector [72.05791402494727]
This paper studies the challenging task of cross-domain few-shot object detection (CD-FSOD).
It aims to develop an accurate object detector for novel domains with minimal labeled examples.
arXiv Detail & Related papers (2024-02-05T15:25:32Z)
- Occlusion-Aware Detection and Re-ID Calibrated Network for Multi-Object Tracking [38.36872739816151]
The Occlusion-Aware Attention (OAA) module in the detector highlights the object features while suppressing the occluded background regions.
OAA can serve as a modulator that enhances the detector for some potentially occluded objects.
We design a Re-ID embedding matching block based on the optimal transport problem.
arXiv Detail & Related papers (2023-08-30T06:56:53Z)
- Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner.
We design a semantic-guided self-supervised learning model to extract high-level semantic features from images.
We introduce Principal Component Analysis (PCA) to localize object regions.
arXiv Detail & Related papers (2023-07-07T04:03:48Z)
- Scaling Novel Object Detection with Weakly Supervised Detection Transformers [21.219817483091166]
We propose the Weakly Supervised Detection Transformer, which enables efficient knowledge transfer from a large-scale pretraining dataset to WSOD finetuning.
Our experiments show that our approach outperforms previous state-of-the-art models on large-scale novel object detection datasets.
arXiv Detail & Related papers (2022-07-11T21:45:54Z)
- Plug-and-Play Few-shot Object Detection with Meta Strategy and Explicit Localization Inference [78.41932738265345]
This paper proposes a plug detector that can accurately detect objects of novel categories without a fine-tuning process.
We introduce two explicit inferences into the localization process to reduce its dependence on annotated data.
It shows a significant lead in efficiency, precision, and recall under varied evaluation protocols.
arXiv Detail & Related papers (2021-10-26T03:09:57Z)
- Discovery-and-Selection: Towards Optimal Multiple Instance Learning for Weakly Supervised Object Detection [86.86602297364826]
We propose a discovery-and-selection approach fused with multiple instance learning (DS-MIL).
Our proposed DS-MIL approach can consistently improve the baselines, reporting state-of-the-art performance.
arXiv Detail & Related papers (2021-10-18T07:06:57Z)
- Object Detection Made Simpler by Eliminating Heuristic NMS [70.93004137521946]
We show a simple NMS-free, end-to-end object detection framework.
We attain on-par or even improved detection accuracy compared with the original one-stage detector.
arXiv Detail & Related papers (2021-01-28T02:38:29Z)
- Distilling Knowledge from Refinement in Multiple Instance Detection Networks [0.0]
Weakly supervised object detection (WSOD) aims to tackle the object detection problem using only labeled image categories as supervision.
We present an adaptive supervision aggregation function that dynamically changes the aggregation criteria for selecting boxes as belonging to one of the ground-truth classes, as background, or as ignored when generating the supervision for each refinement module.
arXiv Detail & Related papers (2020-04-23T02:49:40Z)