Resolution-invariant Person ReID Based on Feature Transformation and
Self-weighted Attention
- URL: http://arxiv.org/abs/2101.04544v2
- Date: Mon, 18 Jan 2021 02:50:42 GMT
- Title: Resolution-invariant Person ReID Based on Feature Transformation and
Self-weighted Attention
- Authors: Ziyue Zhang, Shuai Jiang, Congzhentao Huang, Richard Yi Da Xu
- Abstract summary: Person Re-identification (ReID) is a critical computer vision task which aims to match the same person in images or video sequences.
We propose a novel two-stream network with a lightweight resolution association ReID feature transformation (RAFT) module and a self-weighted attention (SWA) ReID module.
Both modules are jointly trained to get a resolution-invariant representation.
- Score: 14.777001614779806
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Person Re-identification (ReID) is a critical computer vision task which aims
to match the same person in images or video sequences. Most current works focus
on settings where the resolution of images is kept the same. However, the
resolution is a crucial factor in person ReID, especially when the cameras are
at different distances from the person or when the camera models differ.
In this paper, we propose a novel two-stream network with a
lightweight resolution association ReID feature transformation (RAFT) module
and a self-weighted attention (SWA) ReID module to evaluate features under
different resolutions. RAFT transforms the low resolution features to
corresponding high resolution features. SWA evaluates both features to get
weight factors for the person ReID. Both modules are jointly trained to get a
resolution-invariant representation. Extensive experiments on five benchmark
datasets show the effectiveness of our method. For instance, we achieve Rank-1
accuracy of 43.3% and 83.2% on CAVIAR and MLR-CUHK03, outperforming the
state-of-the-art.
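The Rank-1 accuracy quoted above can be illustrated with a minimal sketch. It assumes that each query image (for example, a low-resolution one) and each gallery image has already been mapped to a feature vector, and that matching uses cosine similarity. The function names, the similarity measure, and the toy data are illustrative assumptions, not the authors' evaluation protocol.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank1_accuracy(query_feats, query_ids, gallery_feats, gallery_ids):
    """Fraction of queries whose most similar gallery entry has the same identity."""
    hits = 0
    for feat, pid in zip(query_feats, query_ids):
        sims = [cosine_similarity(feat, g) for g in gallery_feats]
        # Index of the single best-matching gallery entry (the rank-1 match).
        best = max(range(len(sims)), key=sims.__getitem__)
        if gallery_ids[best] == pid:
            hits += 1
    return hits / len(query_feats)


# Toy data (hypothetical): three gallery identities and three queries.
gallery_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
gallery_ids = ["A", "B", "C"]
query_feats = [[0.9, 0.1], [0.2, 0.8], [0.1, 0.9]]
query_ids = ["A", "B", "C"]

acc = rank1_accuracy(query_feats, query_ids, gallery_feats, gallery_ids)
print(f"Rank-1 accuracy: {acc:.3f}")
```

In this toy example the first two queries retrieve the correct identity and the third is matched to "B" instead of "C", so two of three queries count as Rank-1 hits. Published benchmarks typically also fix the query/gallery split and average over repeated trials, which this sketch omits.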
Related papers
- Multi-task Image Restoration Guided By Robust DINO Features [88.74005987908443]
We propose DINO-IR, a multi-task image restoration approach leveraging robust features extracted from DINOv2.
We first propose a pixel-semantic fusion (PSF) module to dynamically fuse DINOv2's shallow features.
By formulating these modules into a unified deep model, we propose a DINO perception contrastive loss to constrain the model training.
arXiv Detail & Related papers (2023-12-04T06:59:55Z)
- ResFormer: Scaling ViTs with Multi-Resolution Training [100.01406895070693]
We introduce ResFormer, a framework for improved performance on a wide spectrum of, mostly unseen, testing resolutions.
In particular, ResFormer operates on replicated images of different resolutions and enforces a scale consistency loss to engage interactive information across different scales.
Moreover, we demonstrate that ResFormer is flexible and can be easily extended to semantic segmentation, object detection and video action recognition.
arXiv Detail & Related papers (2022-12-01T18:57:20Z)
- Learning Resolution-Adaptive Representations for Cross-Resolution Person
Re-Identification [49.57112924976762]
The cross-resolution person re-identification problem aims to match low-resolution (LR) query identity images against high-resolution (HR) gallery images.
It is a challenging and practical problem since the query images often suffer from resolution degradation due to the different capturing conditions from real-world cameras.
This paper explores an alternative SR-free paradigm to directly compare HR and LR images via a dynamic metric, which is adaptive to the resolution of a query image.
arXiv Detail & Related papers (2022-07-09T03:49:51Z)
- Resolution based Feature Distillation for Cross Resolution Person
Re-Identification [17.86505685442293]
Person re-identification (re-id) aims to retrieve images of the same identities across different camera views.
Resolution mismatch occurs due to varying distances between the person of interest and the cameras.
We propose a Resolution based Feature Distillation (RFD) approach to overcome the problem of multiple resolutions.
arXiv Detail & Related papers (2021-09-16T11:07:59Z)
- Robust Reference-based Super-Resolution via C2-Matching [77.51610726936657]
Reference-based Super-Resolution (Ref-SR) has recently emerged as a promising paradigm to enhance a low-resolution (LR) input image by introducing an additional high-resolution (HR) reference image.
Existing Ref-SR methods mostly rely on implicit correspondence matching to borrow HR textures from reference images to compensate for the information loss in input images.
We propose C2-Matching, which produces explicit robust matching crossing transformation and resolution.
arXiv Detail & Related papers (2021-06-03T16:40:36Z)
- Deep High-Resolution Representation Learning for Cross-Resolution Person
Re-identification [22.104449922937338]
Person re-identification (re-ID) tackles the problem of matching person images with the same identity from different cameras.
We propose a Deep High-Resolution Pseudo-Siamese Framework (PS-HRNet) to solve the problem.
Our proposed PS-HRNet improves Rank-1 accuracy by 3.4%, 6.2%, 2.5%, 1.1% and 4.2% on the MLR-Market-1501, MLR-CUHK03, MLR-VIPeR, MLR-DukeMTMC-reID, and CAVIAR datasets, respectively.
arXiv Detail & Related papers (2021-05-25T07:45:38Z)
- Pose Invariant Person Re-Identification using Robust Pose-transformation
GAN [11.338815177557645]
Person re-identification (re-ID) aims to retrieve a person's images from an image gallery, given a single instance of the person of interest.
Despite several advancements, learning discriminative identity-sensitive and viewpoint invariant features for robust Person Re-identification is a major challenge owing to large pose variation of humans.
This paper proposes a re-ID pipeline that utilizes the image generation capability of Generative Adversarial Networks combined with pose regression and feature fusion to achieve pose invariant feature learning.
arXiv Detail & Related papers (2021-04-11T15:47:03Z)
- DDet: Dual-path Dynamic Enhancement Network for Real-World Image
Super-Resolution [69.2432352477966]
Real image super-resolution (Real-SR) focuses on the relationship between real-world high-resolution (HR) and low-resolution (LR) images.
In this article, we propose a Dual-path Dynamic Enhancement Network (DDet) for Real-SR.
Unlike conventional methods which stack up massive convolutional blocks for feature representation, we introduce a content-aware framework to study non-inherently aligned image pairs.
arXiv Detail & Related papers (2020-02-25T18:24:51Z)
- Cross-Resolution Adversarial Dual Network for Person Re-Identification
and Beyond [59.149653740463435]
Person re-identification (re-ID) aims at matching images of the same person across camera views.
Due to varying distances between cameras and persons of interest, resolution mismatch can be expected.
We propose a novel generative adversarial network to address cross-resolution person re-ID.
arXiv Detail & Related papers (2020-02-19T07:21:38Z)