A Tri-Layer Plugin to Improve Occluded Detection
- URL: http://arxiv.org/abs/2210.10046v1
- Date: Tue, 18 Oct 2022 17:59:51 GMT
- Title: A Tri-Layer Plugin to Improve Occluded Detection
- Authors: Guanqi Zhan, Weidi Xie, Andrew Zisserman
- Abstract summary: We propose a simple 'plugin' module for the detection head of two-stage object detectors to improve the recall of partially occluded objects.
The module predicts a tri-layer of segmentation masks for the target object, the occluder and the occludee, and by doing so is able to better predict the mask of the target object.
We also establish a COCO evaluation dataset to measure the recall performance of partially occluded and separated objects.
- Score: 100.99802831241583
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Detecting occluded objects still remains a challenge for state-of-the-art
object detectors. The objective of this work is to improve the detection for
such objects, and thereby improve the overall performance of a modern object
detector.
To this end we make the following four contributions: (1) We propose a simple
'plugin' module for the detection head of two-stage object detectors to improve
the recall of partially occluded objects. The module predicts a tri-layer of
segmentation masks for the target object, the occluder and the occludee, and by
doing so is able to better predict the mask of the target object. (2) We
propose a scalable pipeline for generating training data for the module by
using amodal completion of existing object detection and instance segmentation
training datasets to establish occlusion relationships. (3) We also establish a
COCO evaluation dataset to measure the recall performance of partially occluded
and separated objects. (4) We show that the plugin module inserted into a
two-stage detector can boost the performance significantly, by only fine-tuning
the detection head, and with additional improvements if the entire architecture
is fine-tuned. COCO results are reported for Mask R-CNN with Swin-T or Swin-S
backbones, and Cascade Mask R-CNN with a Swin-B backbone.
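To make the core idea concrete, the following is a minimal, hypothetical PyTorch sketch of a tri-layer mask head: instead of predicting a single per-instance mask, the head predicts three masks (occluder, target object, occludee) from shared RoI features. All class names, layer sizes, and tensor shapes below are illustrative assumptions, not the authors' implementation, which is inserted into the detection head of Mask R-CNN or Cascade Mask R-CNN.

```python
# Hypothetical sketch of a tri-layer mask head (assumed names and shapes).
# The head shares a convolutional trunk over RoI-aligned features and
# predicts three mask logits per RoI: occluder, target object, occludee.
import torch
import torch.nn as nn


class TriLayerMaskHead(nn.Module):
    def __init__(self, in_channels: int = 256, hidden: int = 256):
        super().__init__()
        # Shared convolutional trunk over RoI-aligned features.
        self.trunk = nn.Sequential(
            nn.Conv2d(in_channels, hidden, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU(inplace=True),
        )
        # One 1x1 predictor per layer of the tri-layer output.
        self.occluder = nn.Conv2d(hidden, 1, 1)
        self.target = nn.Conv2d(hidden, 1, 1)
        self.occludee = nn.Conv2d(hidden, 1, 1)

    def forward(self, roi_feats: torch.Tensor) -> dict:
        # roi_feats: (num_rois, in_channels, H, W), e.g. 14x14 RoI-aligned features.
        x = self.trunk(roi_feats)
        return {
            "occluder": self.occluder(x),
            "target": self.target(x),
            "occludee": self.occludee(x),
        }


# Usage sketch: each output is per-RoI mask logits to be trained with a
# segmentation loss against occlusion-aware ground truth.
logits = TriLayerMaskHead()(torch.randn(8, 256, 14, 14))
```

In this reading, supervising the occluder and occludee branches forces the head to reason about occlusion ordering, which in turn sharpens the mask predicted for the target object; only the detection head needs fine-tuning for the reported gains, with further improvement if the whole network is fine-tuned.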
Related papers
- Sparse Semi-DETR: Sparse Learnable Queries for Semi-Supervised Object Detection [12.417754433715903]
We introduce Sparse Semi-DETR, a novel transformer-based, end-to-end semi-supervised object detection solution.
Sparse Semi-DETR incorporates a Query Refinement Module to enhance the quality of object queries, significantly improving detection capabilities for small and partially obscured objects.
On the MS-COCO and Pascal VOC object detection benchmarks, Sparse Semi-DETR achieves a significant improvement over current state-of-the-art methods.
arXiv Detail & Related papers (2024-04-02T10:22:23Z) - SeMoLi: What Moves Together Belongs Together [51.72754014130369]
We tackle semi-supervised object detection based on motion cues.
Recent results suggest that motion-based clustering methods can be used to pseudo-label instances of moving objects.
We re-think this approach and suggest that both object detection and motion-inspired pseudo-labeling can be tackled in a data-driven manner.
arXiv Detail & Related papers (2024-02-29T18:54:53Z) - Object-Centric Multiple Object Tracking [124.30650395969126]
This paper proposes a video object-centric model for multiple-object tracking pipelines.
It consists of an index-merge module that adapts the object-centric slots into detection outputs and an object memory module.
Benefiting from object-centric learning, we require only sparse detection labels for object localization and feature binding.
arXiv Detail & Related papers (2023-09-01T03:34:12Z) - SSC3OD: Sparsely Supervised Collaborative 3D Object Detection from LiDAR Point Clouds [16.612824810651897]
We propose a sparsely supervised collaborative 3D object detection framework SSC3OD.
It only requires each agent to randomly label one object in the scene.
It can effectively improve the performance of sparsely supervised collaborative 3D object detectors.
arXiv Detail & Related papers (2023-07-03T02:42:14Z) - 3DMODT: Attention-Guided Affinities for Joint Detection & Tracking in 3D Point Clouds [95.54285993019843]
We propose a method for joint detection and tracking of multiple objects in 3D point clouds.
Our model exploits temporal information, employing multiple frames to detect objects and track them in a single network.
arXiv Detail & Related papers (2022-11-01T20:59:38Z) - VIN: Voxel-based Implicit Network for Joint 3D Object Detection and Segmentation for Lidars [12.343333815270402]
A unified neural network structure is presented for joint 3D object detection and point cloud segmentation.
We leverage rich supervision from both detection and segmentation labels rather than using just one of them.
arXiv Detail & Related papers (2021-07-07T02:16:20Z) - Robust and Accurate Object Detection via Adversarial Learning [111.36192453882195]
This work augments the fine-tuning stage for object detectors by exploring adversarial examples.
Our approach boosts the performance of state-of-the-art EfficientDets by +1.1 mAP on the object detection benchmark.
arXiv Detail & Related papers (2021-03-23T19:45:26Z) - Transformer-Encoder Detector Module: Using Context to Improve Robustness to Adversarial Attacks on Object Detection [12.521662223741673]
This article proposes a new context module that can be applied to an object detector to improve the labeling of object instances.
The proposed model achieves mAP, F1, and average AUC scores up to 13% higher than the baseline Faster-RCNN detector.
arXiv Detail & Related papers (2020-11-13T15:52:53Z) - Attention-based Joint Detection of Object and Semantic Part [4.389917490809522]
Our model is built on top of two Faster-RCNN models that share their features to obtain enhanced representations for both tasks.
Experiments on the PASCAL-Part 2010 dataset show that joint detection can simultaneously improve both object detection and part detection.
arXiv Detail & Related papers (2020-07-05T18:54:10Z) - EHSOD: CAM-Guided End-to-end Hybrid-Supervised Object Detection with Cascade Refinement [53.69674636044927]
We present EHSOD, an end-to-end hybrid-supervised object detection system.
It can be trained in one shot on both fully and weakly-annotated data.
It achieves comparable results on multiple object detection benchmarks with only 30% fully-annotated data.
arXiv Detail & Related papers (2020-02-18T08:04:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.