Improving Crowded Object Detection via Copy-Paste
- URL: http://arxiv.org/abs/2211.12110v1
- Date: Tue, 22 Nov 2022 09:25:15 GMT
- Title: Improving Crowded Object Detection via Copy-Paste
- Authors: Jiangfan Deng, Dewen Fan, Xiaosong Qiu, Feng Zhou
- Abstract summary: Crowdedness caused by overlapping among similar objects is a ubiquitous challenge in the field of 2D visual object detection.
We first underline two main effects of the crowdedness issue: 1) IoU-confidence correlation disturbances (ICD) and 2) confused de-duplication (CDD).
- Score: 6.941267349187447
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Crowdedness caused by overlapping among similar objects is a ubiquitous
challenge in the field of 2D visual object detection. In this paper, we first
underline two main effects of the crowdedness issue: 1) IoU-confidence
correlation disturbances (ICD) and 2) confused de-duplication (CDD). Then we
explore a pathway of cracking these nuts from the perspective of data
augmentation. Primarily, a particular copy-paste scheme is proposed for
creating crowded scenes. Based on this operation, we first design a "consensus
learning" method to further resist the ICD problem and then find that the
pasting process naturally reveals a pseudo "depth" for each object in the scene,
which can potentially be used to alleviate the CDD dilemma. Both methods
derive from an inventive use of copy-pasting, at no extra cost for
hand-labeling. Experiments show that our approach can easily improve the
state-of-the-art detector on a typical crowded detection task by more than 2%
without any bells and whistles. Moreover, this work outperforms existing
data augmentation strategies in crowded scenarios.
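The abstract does not spell out the pasting scheme, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes rectangular patches pasted with a fixed horizontal shift, treats paste order as the pseudo "depth" (the last paste is front-most), and uses standard greedy NMS as a stand-in for the de-duplication step. The names `paste_crowd`, `box_iou`, and `greedy_nms` are hypothetical helpers introduced for illustration.

```python
import numpy as np


def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def paste_crowd(image, patch, start_xy, n_copies, shift):
    """Paste n_copies of patch side by side with overlap (assumed scheme).

    Later pastes overwrite earlier ones, so paste order gives a label-free
    pseudo depth: 0 is the front-most object. The caller must make sure
    every copy fits inside the image.
    """
    out = image.copy()
    h, w = patch.shape[:2]
    x0, y0 = start_xy
    boxes = []
    for k in range(n_copies):
        x = x0 + k * shift
        out[y0:y0 + h, x:x + w] = patch  # occludes whatever was there before
        boxes.append((x, y0, x + w, y0 + h))
    depths = [n_copies - 1 - k for k in range(n_copies)]
    return out, boxes, depths


def greedy_nms(boxes, scores, iou_thr):
    """Standard greedy NMS: keep a box unless it overlaps a kept box too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(box_iou(boxes[i], boxes[j]) <= iou_thr for j in keep):
            keep.append(i)
    return keep


# Toy data: a black canvas and a white 20x10 "object" pasted three times.
canvas = np.zeros((64, 64, 3), dtype=np.uint8)
patch = np.full((20, 10, 3), 255, dtype=np.uint8)
aug, boxes, depths = paste_crowd(canvas, patch, start_xy=(5, 10), n_copies=3, shift=4)

# Pretend a detector returned the three pasted boxes with these scores.
scores = [0.9, 0.8, 0.7]
kept_strict = greedy_nms(boxes, scores, iou_thr=0.4)
kept_loose = greedy_nms(boxes, scores, iou_thr=0.5)
print(boxes, depths, kept_strict, kept_loose)
```

In this toy setup, neighbouring pasted boxes overlap enough that the stricter NMS threshold discards a box belonging to a genuinely distinct object, which is one way to picture the de-duplication confusion the abstract refers to. The `depths` list comes for free from the paste order, with no hand-labeling involved.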
Related papers
- UniForensics: Face Forgery Detection via General Facial Representation [60.5421627990707]
High-level semantic features are less susceptible to perturbations and not limited to forgery-specific artifacts, thus having stronger generalization.
We introduce UniForensics, a novel deepfake detection framework that leverages a transformer-based video network, with a meta-functional face classification for enriched facial representation.
arXiv Detail & Related papers (2024-07-26T20:51:54Z)
- A Dense Reward View on Aligning Text-to-Image Diffusion with Preference [54.43177605637759]
We propose a tractable alignment objective that emphasizes the initial steps of the T2I reverse chain.
In experiments on single and multiple prompt generation, our method is competitive with strong relevant baselines.
arXiv Detail & Related papers (2024-02-13T07:37:24Z)
- Dual-Perspective Knowledge Enrichment for Semi-Supervised 3D Object Detection [55.210991151015534]
We present a novel Dual-Perspective Knowledge Enrichment approach named DPKE for semi-supervised 3D object detection.
Our DPKE enriches the knowledge of limited training data, particularly unlabeled data, from two perspectives: data-perspective and feature-perspective.
arXiv Detail & Related papers (2024-01-10T08:56:07Z)
- Occlusion-Aware Detection and Re-ID Calibrated Network for Multi-Object Tracking [38.36872739816151]
Occlusion-Aware Attention (OAA) module in the detector highlights the object features while suppressing the occluded background regions.
OAA can serve as a modulator that enhances the detector for some potentially occluded objects.
We design a Re-ID embedding matching block based on the optimal transport problem.
arXiv Detail & Related papers (2023-08-30T06:56:53Z)
- Discrepancy-Guided Reconstruction Learning for Image Forgery Detection [10.221066530624373]
We first propose a Discrepancy-Guided Encoder (DisGE) to extract forgery-sensitive visual patterns.
We then introduce a Double-Head Reconstruction (DouHR) module to enhance genuine compact visual patterns in different granular spaces.
Under DouHR, we further introduce a Discrepancy-Aggregation Detector (DisAD) to aggregate these genuine compact visual patterns.
arXiv Detail & Related papers (2023-04-26T07:40:43Z)
- DiffuPose: Monocular 3D Human Pose Estimation via Denoising Diffusion Probabilistic Model [25.223801390996435]
This paper focuses on reconstructing a 3D pose from a single 2D keypoint detection.
We build a novel diffusion-based framework to effectively sample diverse 3D poses from an off-the-shelf 2D detector.
We evaluate our method on the widely adopted Human3.6M and HumanEva-I datasets.
arXiv Detail & Related papers (2022-12-06T07:22:20Z)
- Revisiting Consistency Regularization for Semi-supervised Change Detection in Remote Sensing Images [60.89777029184023]
We propose a semi-supervised CD model in which we formulate an unsupervised CD loss in addition to the supervised Cross-Entropy (CE) loss.
Experiments conducted on two publicly available CD datasets show that the proposed semi-supervised CD method can reach closer to the performance of supervised CD.
arXiv Detail & Related papers (2022-04-18T17:59:01Z)
- Move to See Better: Self-Improving Embodied Object Detection [35.461141354989714]
We propose a method for improving object detection in testing environments.
Our agent collects multi-view data, generates 2D and 3D pseudo-labels, and fine-tunes its detector in a self-supervised manner.
arXiv Detail & Related papers (2020-11-30T19:16:51Z)
- Deep Spatial Gradient and Temporal Depth Learning for Face Anti-spoofing [61.82466976737915]
Depth supervised learning has been proven as one of the most effective methods for face anti-spoofing.
We propose a new approach to detect presentation attacks from multiple frames based on two insights.
The proposed approach achieves state-of-the-art results on five benchmark datasets.
arXiv Detail & Related papers (2020-03-18T06:11:20Z)
- Progressive Object Transfer Detection [84.48927705173494]
We propose a novel Progressive Object Transfer Detection (POTD) framework.
First, POTD can leverage various object supervision of different domains effectively into a progressive detection procedure.
Second, POTD consists of two delicate transfer stages, i.e., Low-Shot Transfer Detection (LSTD) and Weakly-Supervised Transfer Detection (WSTD).
arXiv Detail & Related papers (2020-02-12T00:16:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.