Seq-Masks: Bridging the gap between appearance and gait modeling for
video-based person re-identification
- URL: http://arxiv.org/abs/2112.05626v1
- Date: Fri, 10 Dec 2021 16:00:20 GMT
- Title: Seq-Masks: Bridging the gap between appearance and gait modeling for
video-based person re-identification
- Authors: Zhigang Chang, Zhao Yang, Yongbiao Chen, Qin Zhou, Shibao Zheng
- Abstract summary: Video-based person re-identification (Re-ID) aims to match person images in video sequences captured by disjoint surveillance cameras.
Traditional video-based person Re-ID methods focus on exploring appearance information and are thus vulnerable to illumination changes, scene noise, camera parameters, and especially clothes/carrying variations.
We propose a framework that utilizes sequence masks (SeqMasks) in the video to integrate appearance information and gait modeling in a close fashion.
- Score: 10.490428828061292
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Video-based person re-identification (Re-ID) aims to match person
images in video sequences captured by disjoint surveillance cameras. Traditional
video-based person Re-ID methods focus on exploring appearance information and
are thus vulnerable to illumination changes, scene noise, camera parameters,
and especially clothes/carrying variations. Gait recognition provides an
implicit biometric solution to alleviate these issues. Nonetheless, it
experiences severe performance degeneration as camera view varies. In an
attempt to address these problems, in this paper, we propose a framework that
utilizes the sequence masks (SeqMasks) in the video to integrate appearance
information and gait modeling in a close fashion. Specifically, to sufficiently
validate the effectiveness of our method, we build a novel dataset named
MaskMARS based on MARS. Comprehensive experiments on our proposed large-scale
in-the-wild video Re-ID dataset MaskMARS demonstrated strong performance and
generalization capability. Validation on the gait recognition benchmark CASIA-B
further confirmed the capability of our hybrid model.
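The abstract does not specify how the sequence masks are applied to the frames. As a minimal illustrative sketch (the function name, tensor layout, and pixel-wise masking operation are assumptions, not the paper's actual pipeline), per-frame silhouette masks can be used to suppress background appearance while the mask sequence itself carries the gait signal:

```python
import numpy as np

def apply_seq_masks(frames, masks):
    """Zero out background pixels of each frame using its silhouette mask.

    frames: (T, H, W, 3) array of RGB frames.
    masks:  (T, H, W) binary array, 1 where the person is visible.

    The masked frames retain clothing/appearance cues inside the
    silhouette, while the binary mask sequence encodes body shape
    over time (gait), so both signals can feed a hybrid model.
    """
    # Broadcast (T, H, W) masks over the channel axis and multiply.
    return frames * masks[..., None].astype(frames.dtype)
```

In such a setup, the masked frames would feed an appearance branch and the raw mask sequence a gait branch; how the two are fused is specific to the paper.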
Related papers
- Pluralistic Salient Object Detection [108.74650817891984]
We introduce pluralistic salient object detection (PSOD), a novel task aimed at generating multiple plausible salient segmentation results for a given input image.
We present two new SOD datasets "DUTS-MM" and "DUS-MQ", along with newly designed evaluation metrics.
arXiv Detail & Related papers (2024-09-04T01:38:37Z)
- Siamese Masked Autoencoders [76.35448665609998]
We present Siamese Masked Autoencoders (SiamMAE) for learning visual correspondence from videos.
SiamMAE operates on pairs of randomly sampled video frames and asymmetrically masks them.
It outperforms state-of-the-art self-supervised methods on video object segmentation, pose keypoint propagation, and semantic part propagation tasks.
arXiv Detail & Related papers (2023-05-23T17:59:46Z)
- Disguise without Disruption: Utility-Preserving Face De-Identification [40.484745636190034]
We introduce Disguise, a novel algorithm that seamlessly de-identifies facial images while ensuring the usability of the modified data.
Our method involves extracting and substituting depicted identities with synthetic ones, generated using variational mechanisms to maximize obfuscation and non-invertibility.
We extensively evaluate our method using multiple datasets, demonstrating a higher de-identification rate and superior consistency compared to prior approaches in various downstream tasks.
arXiv Detail & Related papers (2023-03-23T13:50:46Z)
- Improving Masked Autoencoders by Learning Where to Mask [65.89510231743692]
Masked image modeling is a promising self-supervised learning method for visual data.
We present AutoMAE, a framework that uses Gumbel-Softmax to interlink an adversarially-trained mask generator and a mask-guided image modeling process.
In our experiments, AutoMAE is shown to provide effective pretraining models on standard self-supervised benchmarks and downstream tasks.
arXiv Detail & Related papers (2023-03-12T05:28:55Z)
- MINTIME: Multi-Identity Size-Invariant Video Deepfake Detection [17.74528571088335]
We introduce MINTIME, a video deepfake detection approach that captures spatial and temporal anomalies and handles instances of multiple people in the same video and variations in face sizes.
It achieves state-of-the-art results on the ForgeryNet dataset with an improvement of up to 14% AUC in videos containing multiple people.
arXiv Detail & Related papers (2022-11-20T15:17:24Z)
- Face Anti-Spoofing from the Perspective of Data Sampling [0.342658286826597]
Face presentation attack detection plays a vital role in providing secure facial access to digital devices.
Most existing video-based PAD countermeasures lack the ability to cope with long-range temporal variations in videos.
This paper proposes a video processing scheme that models long-range temporal variations based on a Gaussian weighting function.
arXiv Detail & Related papers (2022-08-28T07:54:30Z)
- Mask-invariant Face Recognition through Template-level Knowledge Distillation [3.727773051465455]
Face masks degrade the performance of existing face recognition systems.
We propose a mask-invariant face recognition solution (MaskInv).
In addition to the distilled knowledge, the student network benefits from additional guidance by margin-based identity classification loss.
arXiv Detail & Related papers (2021-12-10T16:19:28Z)
- Multi-Dataset Benchmarks for Masked Identification using Contrastive Representation Learning [0.0]
The COVID-19 pandemic has drastically changed accepted norms globally.
Official documents such as passports, driving licenses, and national identity cards are enrolled with fully uncovered face images.
At an airport or security checkpoint, it is safer to match the unmasked image on the identifying document to the masked person rather than asking them to remove the mask.
We propose a pre-training workflow based on contrastive visual representation learning, specialized for masked vs. unmasked face matching.
arXiv Detail & Related papers (2021-06-10T08:58:10Z)
- Contrastive Context-Aware Learning for 3D High-Fidelity Mask Face Presentation Attack Detection [103.7264459186552]
Face presentation attack detection (PAD) is essential to secure face recognition systems.
Most existing 3D mask PAD benchmarks suffer from several drawbacks.
We introduce a large-scale High-Fidelity Mask dataset to bridge the gap to real-world applications.
arXiv Detail & Related papers (2021-04-13T12:48:38Z)
- Camera-aware Proxies for Unsupervised Person Re-Identification [60.26031011794513]
This paper tackles the purely unsupervised person re-identification (Re-ID) problem that requires no annotations.
We propose to split each single cluster into multiple proxies and each proxy represents the instances coming from the same camera.
Based on the camera-aware proxies, we design both intra- and inter-camera contrastive learning components for our Re-ID model.
arXiv Detail & Related papers (2020-12-19T12:37:04Z)
- Depth Guided Adaptive Meta-Fusion Network for Few-shot Video Recognition [86.31412529187243]
Few-shot video recognition aims at learning new actions with only very few labeled samples.
We propose a depth-guided Adaptive Meta-Fusion Network for few-shot video recognition, termed AMeFu-Net.
arXiv Detail & Related papers (2020-10-20T03:06:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided (including all content) and is not responsible for any consequences of its use.