Related papers: Spectrum-guided Feature Enhancement Network for Event Person Re-Identification

Spectrum-guided Feature Enhancement Network for Event Person Re-Identification

URL: http://arxiv.org/abs/2402.01269v2
Date: Mon, 23 Dec 2024 02:50:27 GMT
Title: Spectrum-guided Feature Enhancement Network for Event Person Re-Identification
Authors: Hongchen Tan, Yi Zhang, Xiuping Liu, Baocai Yin, Nan Ma, Xin Li, Huchuan Lu,
Abstract summary: We introduce the Spectrum-guided Feature Enhancement Network (SFE-Net)<n>The SFE-Net consists of two innovative components: the Multi-grain Spectrum Attention Mechanism (MSAM) and the Consecutive Patch Dropout Module (CPDM)
Score: 78.58871881610165
License: http://creativecommons.org/licenses/by/4.0/
Abstract: As a cutting-edge biosensor, the event camera holds significant potential in the field of computer vision, particularly regarding privacy preservation. However, compared to traditional cameras, event streams often contain noise and possess extremely sparse semantics, posing a formidable challenge for event-based person re-identification (event Re-ID). To address this, we introduce a novel event person re-identification network: the Spectrum-guided Feature Enhancement Network (SFE-Net). This network consists of two innovative components: the Multi-grain Spectrum Attention Mechanism (MSAM) and the Consecutive Patch Dropout Module (CPDM). MSAM employs a fourier spectrum transform strategy to filter event noise, while also utilizing an event-guided multi-granularity attention strategy to enhance and capture discriminative person semantics. CPDM employs a consecutive patch dropout strategy to generate multiple incomplete feature maps, encouraging the deep Re-ID model to equally perceive each effective region of the person's body and capture robust person descriptors. Extensive experiments on Event Re-ID datasets demonstrate that our SFE-Net achieves the best performance in this task.

Related papers

EIFNet: Leveraging Event-Image Fusion for Robust Semantic Segmentation [0.18416014644193066]
Event cameras offer high dynamic range and fine temporal resolution, to achieve robust scene understanding in challenging environments.<n>We propose EIFNet, a multi-modal fusion network that combines the strengths of both event and frame-based inputs.<n>EIFNet achieves state-of-the-art performance, demonstrating its effectiveness in event-based semantic segmentation.
arXiv Detail & Related papers (2025-07-29T16:19:55Z)
Multi-modal Multi-platform Person Re-Identification: Benchmark and Method [58.59888754340054]
MP-ReID is a novel dataset designed specifically for multi-modality and multi-platform ReID. This benchmark compiles data from 1,930 identities across diverse modalities, including RGB, infrared, and thermal imaging. We introduce Uni-Prompt ReID, a framework with specific-designed prompts, tailored for cross-modality and cross-platform scenarios.
arXiv Detail & Related papers (2025-03-21T12:27:49Z)
FE-UNet: Frequency Domain Enhanced U-Net with Segment Anything Capability for Versatile Image Segmentation [50.9040167152168]
We experimentally quantify the contrast sensitivity function of CNNs and compare it with that of the human visual system. We propose the Wavelet-Guided Spectral Pooling Module (WSPM) to enhance and balance image features across the frequency domain. To further emulate the human visual system, we introduce the Frequency Domain Enhanced Receptive Field Block (FE-RFB) We develop FE-UNet, a model that utilizes SAM2 as its backbone and incorporates Hiera-Large as a pre-trained block.
arXiv Detail & Related papers (2025-02-06T07:24:34Z)
Cross-Modal Synergies: Unveiling the Potential of Motion-Aware Fusion Networks in Handling Dynamic and Static ReID Scenarios [4.635813517641097]
We introduce an innovative Motion-Aware Fusion (MOTAR-FUSE) network that utilizes motion cues derived from static imagery to significantly enhance ReID capabilities. A unique aspect of our approach is the integration of a motion consistency task, which empowers the motion-aware transformer to adeptly capture the dynamics of human motion.
arXiv Detail & Related papers (2025-02-02T04:37:25Z)
Semi-supervised Semantic Segmentation for Remote Sensing Images via Multi-scale Uncertainty Consistency and Cross-Teacher-Student Attention [59.19580789952102]
This paper proposes a novel semi-supervised Multi-Scale Uncertainty and Cross-Teacher-Student Attention (MUCA) model for RS image semantic segmentation tasks. MUCA constrains the consistency among feature maps at different layers of the network by introducing a multi-scale uncertainty consistency regularization. MUCA utilizes a Cross-Teacher-Student attention mechanism to guide the student network, guiding the student network to construct more discriminative feature representations.
arXiv Detail & Related papers (2025-01-18T11:57:20Z)
ViFi-ReID: A Two-Stream Vision-WiFi Multimodal Approach for Person Re-identification [3.3743041904085125]
Person re-identification (ReID) plays a vital role in safety inspections, personnel counting, and more. Most current ReID approaches primarily extract features from images, which are easily affected by objective conditions. We leverage widely available routers as sensing devices by capturing gait information from pedestrians through the Channel State Information (CSI) in WiFi signals.
arXiv Detail & Related papers (2024-10-13T15:34:11Z)
PartFormer: Awakening Latent Diverse Representation from Vision Transformer for Object Re-Identification [73.64560354556498]
Vision Transformer (ViT) tends to overfit on most distinct regions of training data, limiting its generalizability and attention to holistic object features. We present PartFormer, an innovative adaptation of ViT designed to overcome the limitations in object Re-ID tasks. Our framework significantly outperforms state-of-the-art by 2.4% mAP scores on the most challenging MSMT17 dataset.
arXiv Detail & Related papers (2024-08-29T16:31:05Z)
MambaPupil: Bidirectional Selective Recurrent model for Event-based Eye tracking [50.26836546224782]
Event-based eye tracking has shown great promise with the high temporal resolution and low redundancy. The diversity and abruptness of eye movement patterns, including blinking, fixating, saccades, and smooth pursuit, pose significant challenges for eye localization. This paper proposes a bidirectional long-term sequence modeling and time-varying state selection mechanism to fully utilize contextual temporal information.
arXiv Detail & Related papers (2024-04-18T11:09:25Z)
SpikeMOT: Event-based Multi-Object Tracking with Sparse Motion Features [52.213656737672935]
SpikeMOT is an event-based multi-object tracker. SpikeMOT uses spiking neural networks to extract sparsetemporal features from event streams associated with objects.
arXiv Detail & Related papers (2023-09-29T05:13:43Z)
HiDAnet: RGB-D Salient Object Detection via Hierarchical Depth Awareness [2.341385717236931]
We propose a novel Hierarchical Depth Awareness network (HiDAnet) for RGB-D saliency detection. Our motivation comes from the observation that the multi-granularity properties of geometric priors correlate well with the neural network hierarchies. Our HiDAnet performs favorably over the state-of-the-art methods by large margins.
arXiv Detail & Related papers (2023-01-18T10:00:59Z)
Feature Disentanglement Learning with Switching and Aggregation for Video-based Person Re-Identification [9.068045610800667]
In video person re-identification (Re-ID), the network must consistently extract features of the target person from successive frames. Existing methods tend to focus only on how to use temporal information, which often leads to networks being fooled by similar appearances and same backgrounds. We propose a Disentanglement and Switching and Aggregation Network (DSANet), which segregates the features representing identity and features based on camera characteristics, and pays more attention to ID information.
arXiv Detail & Related papers (2022-12-16T04:27:56Z)
Dynamic Prototype Mask for Occluded Person Re-Identification [88.7782299372656]
Existing methods mainly address this issue by employing body clues provided by an extra network to distinguish the visible part. We propose a novel Dynamic Prototype Mask (DPM) based on two self-evident prior knowledge. Under this condition, the occluded representation could be well aligned in a selected subspace spontaneously.
arXiv Detail & Related papers (2022-07-19T03:31:13Z)
Specificity-preserving RGB-D Saliency Detection [103.3722116992476]
We propose a specificity-preserving network (SP-Net) for RGB-D saliency detection. Two modality-specific networks and a shared learning network are adopted to generate individual and shared saliency maps. Experiments on six benchmark datasets demonstrate that our SP-Net outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2021-08-18T14:14:22Z)
Semantic Consistency and Identity Mapping Multi-Component Generative Adversarial Network for Person Re-Identification [39.605062525247135]
We propose a semantic consistency and identity mapping multi-component generative adversarial network (SC-IMGAN) which provides style adaptation from one to many domains. Our proposed method outperforms state-of-the-art techniques on six challenging person Re-ID datasets.
arXiv Detail & Related papers (2021-04-28T14:12:29Z)

This list is automatically generated from the titles and abstracts of the papers in this site.