Deep Intra-Image Contrastive Learning for Weakly Supervised One-Step
Person Search
- URL: http://arxiv.org/abs/2302.04607v1
- Date: Thu, 9 Feb 2023 12:45:20 GMT
- Title: Deep Intra-Image Contrastive Learning for Weakly Supervised One-Step
Person Search
- Authors: Jiabei Wang and Yanwei Pang and Jiale Cao and Hanqing Sun and Zhuang
Shao and Xuelong Li
- Abstract summary: We present a novel deep intra-image contrastive learning using a Siamese network.
Our method achieves state-of-the-art performance among weakly supervised one-step person search approaches.
- Score: 98.2559247611821
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Weakly supervised person search aims to perform joint pedestrian detection
and re-identification (re-id) with only person bounding-box annotations.
Recently, contrastive learning has been applied to weakly supervised person
search, where the two common contrast strategies are memory-based contrast and
intra-image contrast. We argue that current intra-image contrast is shallow and
suffers from spatial-level and occlusion-level variance. In
this paper, we present a novel deep intra-image contrastive learning using a
Siamese network. Two key modules are spatial-invariant contrast (SIC) and
occlusion-invariant contrast (OIC). SIC performs many-to-one contrasts between
the two branches of the Siamese network and dense prediction contrasts within one
branch. With these many-to-one and dense contrasts, SIC learns
discriminative scale-invariant and location-invariant features to solve
spatial-level variance. OIC enhances feature consistency with the masking
strategy to learn occlusion-invariant features. Extensive experiments are
performed on two person search datasets, CUHK-SYSU and PRW. Our method achieves
state-of-the-art performance among weakly supervised one-step person search
approaches. We hope that our simple intra-image contrastive learning can
inspire new paradigms for weakly supervised person search. The
source code is available at \url{https://github.com/jiabeiwangTJU/DICL}.
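The many-to-one contrast described above can be illustrated with a standard InfoNCE-style loss between features from two Siamese branches. The sketch below is a minimal NumPy illustration, not the authors' implementation: the feature tensors, temperature, and masking ratio are all hypothetical, and the final lines only loosely mimic OIC's occlusion masking by contrasting unmasked features against a randomly masked view.

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE loss: row i of `positives` is the positive for row i of `anchors`."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                    # (N, N) scaled cosine similarities
    logits -= logits.max(axis=1, keepdims=True)       # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))                # NLL of the matching pairs

# Hypothetical person features from the two Siamese branches (N persons, D dims).
rng = np.random.default_rng(0)
branch_a = rng.normal(size=(4, 8))
branch_b = branch_a + 0.05 * rng.normal(size=(4, 8))  # slightly perturbed view

loss_sic = info_nce(branch_a, branch_b)               # spatial-invariant contrast sketch

# OIC-style sketch: contrast unmasked features against a randomly masked view.
mask = (rng.random(branch_a.shape) > 0.3).astype(branch_a.dtype)
loss_oic = info_nce(branch_a, branch_a * mask)
```

Since the InfoNCE loss is a negative log-probability over the matching pairs, both values are non-negative, and nearly identical views (as in `loss_sic`) drive the loss toward zero.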
Related papers
- CAR: Contrast-Agnostic Deformable Medical Image Registration with Contrast-Invariant Latent Regularization [6.313081057543946]
We propose a novel contrast-agnostic deformable image registration framework that can be generalized to arbitrary contrast images.
In particular, we propose a random convolution-based contrast augmentation scheme, which simulates arbitrary image contrasts from a single image contrast.
Experiments show that CAR outperforms the baseline approaches in registration accuracy and also generalizes better to unseen imaging contrasts.
arXiv Detail & Related papers (2024-08-03T19:46:23Z) - Learning Commonality, Divergence and Variety for Unsupervised Visible-Infrared Person Re-identification [32.537029197752915]
Unsupervised visible-infrared person re-identification (USVI-ReID) aims to match specified people in infrared images to visible images without annotations, and vice versa.
Most existing methods address the USVI-ReID using cluster-based contrastive learning, which simply employs the cluster center as a representation of a person.
We propose a Progressive Contrastive Learning with Hard and Dynamic Prototypes method for USVI-ReID.
arXiv Detail & Related papers (2024-02-29T10:37:49Z) - Understanding Dark Scenes by Contrasting Multi-Modal Observations [20.665687608385625]
We introduce a supervised multi-modal contrastive learning approach to increase the semantic discriminability of the learned multi-modal feature spaces.
Cross-modal contrast encourages same-class embeddings from the two modalities to be closer.
Intra-modal contrast pulls same-class embeddings together and pushes different-class embeddings apart within each modality.
arXiv Detail & Related papers (2023-08-23T11:39:07Z) - Dense Siamese Network [86.23741104851383]
We present Dense Siamese Network (DenseSiam), a simple unsupervised learning framework for dense prediction tasks.
It learns visual representations by maximizing the similarity between two views of one image with two types of consistency, i.e., pixel consistency and region consistency.
It surpasses state-of-the-art segmentation methods by 2.1 mIoU with only 28% of the training cost.
arXiv Detail & Related papers (2022-03-21T15:55:23Z) - Semantically Contrastive Learning for Low-light Image Enhancement [48.71522073014808]
Low-light image enhancement (LLE) remains challenging due to the low contrast and weak visibility of single RGB images.
We propose an effective semantically contrastive learning paradigm for LLE (namely SCL-LLE)
Our method surpasses state-of-the-art LLE models on six independent cross-scene datasets.
arXiv Detail & Related papers (2021-12-13T07:08:33Z) - Semantics-Guided Contrastive Network for Zero-Shot Object detection [67.61512036994458]
Zero-shot object detection (ZSD) is a new challenge in computer vision.
We develop ContrastZSD, a framework that brings contrastive learning mechanism into the realm of zero-shot detection.
Our method outperforms the previous state-of-the-art on both ZSD and generalized ZSD tasks.
arXiv Detail & Related papers (2021-09-04T03:32:15Z) - Contrastive Learning based Hybrid Networks for Long-Tailed Image
Classification [31.647639786095993]
We propose a novel hybrid network structure composed of a supervised contrastive loss to learn image representations and a cross-entropy loss to learn classifiers.
Experiments on three long-tailed classification datasets demonstrate the advantage of the proposed contrastive learning based hybrid networks in long-tailed classification.
arXiv Detail & Related papers (2021-03-26T05:22:36Z) - Rethinking of the Image Salient Object Detection: Object-level Semantic
Saliency Re-ranking First, Pixel-wise Saliency Refinement Latter [62.26677215668959]
We propose a lightweight, weakly supervised deep network to coarsely locate semantically salient regions.
We then fuse multiple off-the-shelf deep models on these semantically salient regions as the pixel-wise saliency refinement.
Our method is simple yet effective, and is the first attempt to treat salient object detection primarily as an object-level semantic re-ranking problem.
arXiv Detail & Related papers (2020-08-10T07:12:43Z) - Un-Mix: Rethinking Image Mixtures for Unsupervised Visual Representation
Learning [108.999497144296]
Recently advanced unsupervised learning approaches use a Siamese-like framework to compare two "views" of the same image for learning representations.
This work aims to introduce the concept of distance in label space into unsupervised learning and make the model aware of the soft degree of similarity between positive or negative pairs.
Despite its conceptual simplicity, we show empirically that with our solution, Unsupervised image mixtures (Un-Mix), we can learn subtler, more robust, and more generalized representations from the transformed input and the corresponding new label space.
arXiv Detail & Related papers (2020-03-11T17:59:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.