Related papers: Temporal Micro-Doppler Spectrogram-based ViT Multiclass Target Classification

Temporal Micro-Doppler Spectrogram-based ViT Multiclass Target Classification

URL: http://arxiv.org/abs/2511.11951v1
Date: Fri, 14 Nov 2025 23:53:33 GMT
Title: Temporal Micro-Doppler Spectrogram-based ViT Multiclass Target Classification
Authors: Nghia Thinh Nguyen, Tri Nhu Do,
Abstract summary: We propose a new Temporal MDS-Vision Transformer (T-MDS-T) for multiclass target classification using millimeter-wave FMCW radar spectrograms.<n>Specifically, we design a transformer-based architecture that processes stacked range-velocity tensors via patch embeddings and cross-axis attention mechanisms.<n>Our proposed framework is superior to existing CNN-based methods in terms of classification accuracy while achieving better data efficiency and real-time deployability.
Score: 2.165723322157105
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: In this paper, we propose a new Temporal MDS-Vision Transformer (T-MDS-ViT) for multiclass target classification using millimeter-wave FMCW radar micro-Doppler spectrograms. Specifically, we design a transformer-based architecture that processes stacked range-velocity-angle (RVA) spatiotemporal tensors via patch embeddings and cross-axis attention mechanisms to explicitly model the sequential nature of MDS data across multiple frames. The T-MDS-ViT exploits mobility-aware constraints in its attention layer correspondences to maintain separability under target overlaps and partial occlusions. Next, we apply an explainable mechanism to examine how the attention layers focus on characteristic high-energy regions of the MDS representations and their effect on class-specific kinematic features. We also demonstrate that our proposed framework is superior to existing CNN-based methods in terms of classification accuracy while achieving better data efficiency and real-time deployability.

Related papers

FAIM: Frequency-Aware Interactive Mamba for Time Series Classification [87.84511960413715]
Time series classification (TSC) is crucial in numerous real-world applications, such as environmental monitoring, medical diagnosis, and posture recognition.<n>We propose FAIM, a lightweight Frequency-Aware Interactive Mamba model.<n>We show that FAIM consistently outperforms existing state-of-the-art (SOTA) methods, achieving a superior trade-off between accuracy and efficiency.
arXiv Detail & Related papers (2025-11-26T08:36:33Z)
DAMS:Dual-Branch Adaptive Multiscale Spatiotemporal Framework for Video Anomaly Detection [7.117824587276951]
This study offers a dual-path architecture called the Dual-Branch Adaptive Multiscale Stemporal Framework (DAMS), which is based on multilevel feature and decoupling fusion.<n>The main processing path integrates the Adaptive Multiscale Time Pyramid Network (AMTPN) with the Convolutional Block Attention Mechanism (CBAM)
arXiv Detail & Related papers (2025-07-28T08:42:00Z)
Dual-Domain Masked Image Modeling: A Self-Supervised Pretraining Strategy Using Spatial and Frequency Domain Masking for Hyperspectral Data [35.34526230299484]
We propose a self-supervised pretraining strategy for hyperspectral data that utilizes the large portion of unlabeled data.<n>Our method introduces a novel dual-domain masking mechanism that operates in both spatial and frequency domains.<n>We evaluate our approach on three publicly available HSI classification benchmarks and demonstrate that it achieves state-of-the-art performance.
arXiv Detail & Related papers (2025-05-06T06:24:21Z)
FreSca: Scaling in Frequency Space Enhances Diffusion Models [55.75504192166779]
This paper explores frequency-based control within latent diffusion models.<n>We introduce FreSca, a novel framework that decomposes noise difference into low- and high-frequency components.<n>FreSca operates without any model retraining or architectural change, offering model- and task-agnostic control.
arXiv Detail & Related papers (2025-04-02T22:03:11Z)
Spatial-Spectral Diffusion Contrastive Representation Network for Hyperspectral Image Classification [8.600534616819333]
This paper presents a Spatial-Spectral Diffusion Contrastive Representation Network (DiffCRN)<n>DiffCRN is based on denoising diffusion probabilistic model (DDPM) combined with contrastive learning (CL) for hyperspectral images classification.<n> Experiments conducted on widely used four HSI datasets demonstrate the improved performance of the proposed DiffCRN.
arXiv Detail & Related papers (2025-02-27T02:34:23Z)
STMT: A Spatial-Temporal Mesh Transformer for MoCap-Based Action Recognition [50.064502884594376]
We study the problem of human action recognition using motion capture (MoCap) sequences. We propose a novel Spatial-Temporal Mesh Transformer (STMT) to directly model the mesh sequences. The proposed method achieves state-of-the-art performance compared to skeleton-based and point-cloud-based models.
arXiv Detail & Related papers (2023-03-31T16:19:27Z)
ViTs for SITS: Vision Transformers for Satellite Image Time Series [52.012084080257544]
We introduce a fully-attentional model for general Satellite Image Time Series (SITS) processing based on the Vision Transformer (ViT) TSViT splits a SITS record into non-overlapping patches in space and time which are tokenized and subsequently processed by a factorized temporo-spatial encoder.
arXiv Detail & Related papers (2023-01-12T11:33:07Z)
SatMAE: Pre-training Transformers for Temporal and Multi-Spectral Satellite Imagery [74.82821342249039]
We present SatMAE, a pre-training framework for temporal or multi-spectral satellite imagery based on Masked Autoencoder (MAE) To leverage temporal information, we include a temporal embedding along with independently masking image patches across time.
arXiv Detail & Related papers (2022-07-17T01:35:29Z)
Mask-guided Spectral-wise Transformer for Efficient Hyperspectral Image Reconstruction [127.20208645280438]
Hyperspectral image (HSI) reconstruction aims to recover the 3D spatial-spectral signal from a 2D measurement. Modeling the inter-spectra interactions is beneficial for HSI reconstruction. Mask-guided Spectral-wise Transformer (MST) proposes a novel framework for HSI reconstruction.
arXiv Detail & Related papers (2021-11-15T16:59:48Z)

This list is automatically generated from the titles and abstracts of the papers in this site.