Related papers: A Foundation Model for DAS Signal Recognition and Visual Prompt Tuning of the Pre-trained Model for Downstream Tasks

URL: http://arxiv.org/abs/2508.04316v1
Date: Wed, 06 Aug 2025 11:02:25 GMT
Title: A Foundation Model for DAS Signal Recognition and Visual Prompt Tuning of the Pre-trained Model for Downstream Tasks
Authors: Kun Gui, Hongliang Ren, Shang Shi, Jin Lu, Changqiu Yu, Quanjun Cao, Guomin Gu, Qi Xuan
Abstract summary: This study proposes a foundational model for DAS signal recognition based on a Masked Autoencoder, named MAEPD. The model is pretrained on a dataset of 635,860 samples, encompassing DAS gait spatiotemporal signals, 2D GASF images for perimeter security, 2D time-frequency images for pipeline leakage, and open-dataset signals including whale vocalizations and seismic activities. The VPT-Deep approach achieves a classification accuracy of 96.94% with just 0.322% of parameters fine-tuned, surpassing the traditional Full Fine Tuning (FFT) method by 0.61% and reducing training time by 45%.
Score: 6.14430079610632
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Distributed Acoustic Sensing (DAS) technology finds growing applications across various domains. However, data distribution disparities due to heterogeneous sensing environments pose challenges for data-driven artificial intelligence (AI) models, limiting cross-domain generalization and facing a shortage of labeled training data. To address these issues, this study proposes a foundational model for DAS signal recognition based on a Masked Autoencoder, named MAEPD. The MAEPD model is pretrained on a dataset of 635,860 samples, encompassing DAS gait spatiotemporal signals, 2D GASF images for perimeter security, 2D time-frequency images for pipeline leakage, and open-dataset signals including whale vocalizations and seismic activities, using a self-supervised mask reconstruction task to capture deep semantic features of DAS signals. Visual Prompt Tuning (VPT) is employed for downstream recognition tasks. This method freezes the pretrained backbone parameters and fine-tunes only a small set of learnable visual prompt vectors inserted into the Transformer encoder layers. Experiments on the NVIDIA GeForce RTX 4080 Super platform validate MAEPD using indoor gait recognition as a downstream task. The VPT-Deep approach achieves a classification accuracy of 96.94% with just 0.322% of parameters fine-tuned, surpassing the traditional Full Fine Tuning (FFT) method by 0.61% and reducing training time by 45%. The model also exhibits robust performance in pipeline leakage detection, confirming the generality, efficiency, and scalability of MAEPD as a foundational model. This approach offers a novel paradigm for addressing the limited generalization of signal recognition models in the DAS domain.
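
To make the parameter-efficiency claim concrete, the following is a minimal sketch of how the trainable fraction under VPT-Deep can be estimated: the backbone stays frozen, and only the per-layer prompt vectors plus a classification head are trained. The layer count, embedding width, class count, and prompt length below are hypothetical ViT-Base-like values chosen for illustration; the abstract does not state MAEPD's actual configuration, so the result is not expected to reproduce the reported 0.322%.

```python
from dataclasses import dataclass


@dataclass
class VitConfig:
    # Hypothetical ViT-Base-like sizes; not taken from the paper.
    depth: int = 12         # number of Transformer encoder layers
    dim: int = 768          # token embedding width
    mlp_ratio: int = 4      # hidden width of the MLP relative to dim
    num_classes: int = 10   # hypothetical number of downstream classes


def backbone_params(cfg: VitConfig) -> int:
    """Count frozen encoder-block parameters (patch and position embeddings omitted)."""
    attn = 4 * cfg.dim * cfg.dim + 4 * cfg.dim       # QKV + output projection, with biases
    hidden = cfg.mlp_ratio * cfg.dim
    mlp = 2 * cfg.dim * hidden + hidden + cfg.dim    # two linear layers, with biases
    norms = 4 * cfg.dim                              # two LayerNorms (scale and shift)
    return cfg.depth * (attn + mlp + norms)


def vpt_deep_trainable(cfg: VitConfig, prompts_per_layer: int) -> int:
    """Count trainable parameters: one set of prompt vectors per layer, plus a linear head."""
    prompts = cfg.depth * prompts_per_layer * cfg.dim
    head = cfg.dim * cfg.num_classes + cfg.num_classes
    return prompts + head


def trainable_fraction(cfg: VitConfig, prompts_per_layer: int) -> float:
    """Share of all parameters that receive gradients under VPT-Deep."""
    trainable = vpt_deep_trainable(cfg, prompts_per_layer)
    return trainable / (backbone_params(cfg) + trainable)


if __name__ == "__main__":
    cfg = VitConfig()
    print(f"trainable share: {100 * trainable_fraction(cfg, prompts_per_layer=10):.3f}%")
```

Under these assumed sizes the trainable share comes out well below 1%, which is the same order of magnitude as the figure quoted in the abstract. Changing `prompts_per_layer` shows how the share scales with prompt length.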

Related papers

Masked Autoencoders for Ultrasound Signals: Robust Representation Learning for Downstream Applications [0.0]
We investigated the adaptation and performance of Masked Autoencoders (MAEs) with Vision Transformer (ViT) architectures for self-supervised representation learning on one-dimensional (1D) ultrasound signals. Our results show that pre-trained models significantly outperform models trained from scratch and strong convolutional neural network (CNN) baselines optimized for the downstream task.
arXiv Detail & Related papers (2025-08-28T10:13:33Z)
Adaptive Signal Analysis for Automated Subsurface Defect Detection Using Impact Echo in Concrete Slabs [0.0]
This pilot study presents a novel, automated, and scalable methodology for detecting subsurface defect-prone regions in concrete slabs. The approach integrates advanced signal processing, clustering, and visual analytics to identify subsurface anomalies. The results demonstrate the robustness of the methodology, consistently identifying defect-prone areas with minimal false positives and few missed defects.
arXiv Detail & Related papers (2024-12-23T20:05:53Z)
Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction [88.65168366064061]
We introduce Discrete Denoising Posterior Prediction (DDPP), a novel framework that casts the task of steering pre-trained MDMs as a problem of probabilistic inference. Our framework leads to a family of three novel objectives that are all simulation-free, and thus scalable. We substantiate our designs via wet-lab validation, where we observe transient expression of reward-optimized protein sequences.
arXiv Detail & Related papers (2024-10-10T17:18:30Z)
Open-Set Deepfake Detection: A Parameter-Efficient Adaptation Method with Forgery Style Mixture [58.60915132222421]
We introduce an approach that is both general and parameter-efficient for face forgery detection. We design a forgery-style mixture formulation that augments the diversity of forgery source domains. We show that the designed model achieves state-of-the-art generalizability with significantly reduced trainable parameters.
arXiv Detail & Related papers (2024-08-23T01:53:36Z)
Test-time adaptation for geospatial point cloud semantic segmentation with distinct domain shifts [6.80671668491958]
Test-time adaptation (TTA) allows direct adaptation of a pre-trained model to unlabeled data during inference stage without access to source data or additional training. We propose three domain shift paradigms: photogrammetric to airborne LiDAR, airborne to mobile LiDAR, and synthetic to mobile laser scanning. Experimental results show our method improves classification accuracy by up to 20% mIoU, outperforming other methods.
arXiv Detail & Related papers (2024-07-08T15:40:28Z)
Lazy Layers to Make Fine-Tuned Diffusion Models More Traceable [70.77600345240867]
A novel arbitrary-in-arbitrary-out (AIAO) strategy makes watermarks resilient to fine-tuning-based removal. Unlike the existing methods of designing a backdoor for the input/output space of diffusion models, in our method, we propose to embed the backdoor into the feature space of sampled subpaths. Our empirical studies on the MS-COCO, AFHQ, LSUN, CUB-200, and DreamBooth datasets confirm the robustness of AIAO.
arXiv Detail & Related papers (2024-05-01T12:03:39Z)
Object-Size-Driven Design of Convolutional Neural Networks: Virtual Axle Detection based on Raw Data [0.0]
This study presents a novel approach for real-time detection of train axles using sensors arbitrarily placed on bridges. The developed Virtual Axle Detector with Enhanced Receptive Field (VADER) has been validated on a single-track railway bridge. Using raw data as input outperformed the state-of-the-art spectrogram-based method in both speed and memory usage by 99%.
arXiv Detail & Related papers (2023-09-04T12:53:54Z)
MAPS: A Noise-Robust Progressive Learning Approach for Source-Free Domain Adaptive Keypoint Detection [76.97324120775475]
Cross-domain keypoint detection methods always require accessing the source data during adaptation. This paper considers source-free domain adaptive keypoint detection, where only the well-trained source model is provided to the target domain.
arXiv Detail & Related papers (2023-02-09T12:06:08Z)
GaitSADA: Self-Aligned Domain Adaptation for mmWave Gait Recognition [14.750765172614836]
mmWave radar-based gait recognition is a novel user identification method that captures human gait biometrics from mmWave radar return signals. To mitigate this issue, a novel self-aligned domain adaptation method called GaitSADA is proposed. Experiments show that GaitSADA outperforms representative domain adaptation methods with an improvement ranging from 15.41% to 26.32% on average accuracy in low data regimes.
arXiv Detail & Related papers (2023-01-31T03:21:08Z)
Cluster-level pseudo-labelling for source-free cross-domain facial expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER). Our method exploits self-supervised pretraining to learn good feature representations from the target data. We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z)
Decision Forest Based EMG Signal Classification with Low Volume Dataset Augmented with Random Variance Gaussian Noise [51.76329821186873]
We produce a model that can classify six different hand gestures with a limited number of samples that generalizes well to a wider audience. We appeal to a set of more elementary methods such as the use of random bounds on a signal, but desire to show the power these methods can carry in an online setting.
arXiv Detail & Related papers (2022-06-29T23:22:18Z)
A Novel Approach For Analysis of Distributed Acoustic Sensing System Based on Deep Transfer Learning [0.0]
Convolutional neural networks are highly capable tools for extracting spatial information. Long-short term memory (LSTM) is an effective instrument for processing sequential data. VGG-16 architecture in our framework manages to obtain 100% classification accuracy in 50 trainings.
arXiv Detail & Related papers (2022-06-24T19:56:01Z)

This list is automatically generated from the titles and abstracts of the papers in this site.