4,500 Seconds: Small Data Training Approaches for Deep UAV Audio Classification
- URL: http://arxiv.org/abs/2505.23782v1
- Date: Wed, 21 May 2025 22:34:07 GMT
- Title: 4,500 Seconds: Small Data Training Approaches for Deep UAV Audio Classification
- Authors: Andrew P. Berg, Qian Zhang, Mia Y. Wang
- Abstract summary: This study investigates deep learning approaches to UAV classification, focusing on the key issue of data scarcity. We train the models using a total of 4,500 seconds of audio samples, evenly distributed across a 9-class dataset. We compare the use of convolutional neural networks (CNNs) and attention-based transformers.
- Score: 2.3354223046061016
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Unmanned aerial vehicle (UAV) usage is expected to surge in the coming decade, raising the need for heightened security measures to prevent airspace violations and security threats. This study investigates deep learning approaches to UAV classification, focusing on the key issue of data scarcity. To investigate this, we opted to train the models using a total of 4,500 seconds of audio samples, evenly distributed across a 9-class dataset. We leveraged parameter-efficient fine-tuning (PEFT) and data augmentations to mitigate the data scarcity. This paper implements and compares the use of convolutional neural networks (CNNs) and attention-based transformers. Our results show that CNNs outperform transformers by 1-2% accuracy, while still being more computationally efficient. These early findings, however, point to the potential of transformer models, suggesting that with more data and further optimization they could outperform CNNs. Future work aims to upscale the dataset to better understand the trade-offs between these approaches.
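As a concrete illustration of the two mitigations the abstract names, here is a minimal PyTorch sketch combining SpecAugment-style spectrogram masking (data augmentation) with a LoRA-style low-rank adapter (PEFT). The backbone, feature sizes, and hyperparameters are illustrative assumptions, not the authors' actual configuration.

```python
import torch
import torch.nn as nn
import torchaudio

NUM_CLASSES = 9  # the paper's 9-class UAV dataset

# Waveform -> log-mel features, plus SpecAugment-style masking as a cheap
# augmentation against data scarcity (the exact augmentations are assumed).
mel = torchaudio.transforms.MelSpectrogram(sample_rate=16_000, n_mels=64)
augment = nn.Sequential(
    torchaudio.transforms.FrequencyMasking(freq_mask_param=8),
    torchaudio.transforms.TimeMasking(time_mask_param=20),
)

class LoRALinear(nn.Module):
    """A frozen linear layer with a trainable low-rank update: W x + B A x."""
    def __init__(self, base: nn.Linear, rank: int = 4):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # PEFT: pretrained weights stay frozen
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))

    def forward(self, x):
        return self.base(x) + x @ self.A.T @ self.B.T

# Stand-in "pretrained" encoder; in the paper this would be a CNN or
# transformer audio backbone.
encoder = nn.Sequential(nn.Flatten(), LoRALinear(nn.Linear(64 * 81, 128)), nn.ReLU())
head = nn.Linear(128, NUM_CLASSES)

wave = torch.randn(8, 16_000)           # a batch of 1-second clips
spec = augment(torch.log1p(mel(wave)))  # (8, 64, 81) log-mel features
logits = head(encoder(spec))            # (8, 9) class scores
```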
Related papers
- 15,500 Seconds: Lean UAV Classification Using EfficientNet and Lightweight Fine-Tuning [2.3354223046061016]
Unmanned Aerial Vehicles (UAVs) pose escalating security concerns as the market for consumer and military UAVs grows. This paper addresses the critical data scarcity challenges in deep UAV audio classification.
arXiv Detail & Related papers (2025-05-21T21:53:19Z) - An Investigation on Machine Learning Predictive Accuracy Improvement and Uncertainty Reduction using VAE-based Data Augmentation [2.517043342442487]
Deep generative learning uses certain ML models to learn the underlying distribution of existing data and generate synthetic samples that resemble the real data.
In this study, our objective is to evaluate the effectiveness of data augmentation using variational autoencoder (VAE)-based deep generative models.
We investigated whether the data augmentation leads to improved accuracy in the predictions of a deep neural network (DNN) model trained using the augmented data.
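For context, the sketch below shows what VAE-based augmentation of this kind typically looks like: train a VAE on real samples, then decode draws from the prior into synthetic training data. All dimensions and loss weights are illustrative assumptions, not the study's actual setup.

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, d_in=32, d_lat=8):
        super().__init__()
        self.enc = nn.Linear(d_in, 2 * d_lat)  # predicts mu and log-variance
        self.dec = nn.Linear(d_lat, d_in)
        self.d_lat = d_lat

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        return self.dec(z), mu, logvar

def vae_loss(x, recon, mu, logvar):
    rec = ((x - recon) ** 2).sum(-1).mean()  # reconstruction term
    kld = (-0.5 * (1 + logvar - mu**2 - logvar.exp())).sum(-1).mean()
    return rec + kld

vae = VAE()
x = torch.randn(64, 32)      # stand-in for real training features
loss = vae_loss(x, *vae(x))  # minimize this in a training loop

# After training: decode prior samples into synthetic data for the DNN.
synthetic = vae.dec(torch.randn(128, vae.d_lat))
```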
arXiv Detail & Related papers (2024-10-24T18:15:48Z) - Designing Pre-training Datasets from Unlabeled Data for EEG Classification with Transformers [0.0]
We present a way to design several labeled datasets from unlabeled electroencephalogram (EEG) data.
These can then be used to pre-train transformers to learn representations of EEG signals.
We tested this method on an epileptic seizure forecasting task on the Temple University Seizure Detection Corpus.
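One common way to manufacture labels from unlabeled signals is transform prediction, sketched below; the pretext tasks actually designed in the paper may differ, so treat this purely as an assumed example.

```python
import torch

def make_pretext_batch(windows: torch.Tensor):
    """windows: (batch, channels, time) unlabeled EEG segments."""
    transforms = [
        lambda x: x,           # 0: identity
        lambda x: x.flip(-1),  # 1: time reversal
        lambda x: -x,          # 2: sign flip
        lambda x: x * 1.5,     # 3: amplitude scaling
    ]
    labels = torch.randint(len(transforms), (windows.size(0),))
    out = torch.stack([transforms[int(l)](w) for w, l in zip(windows, labels)])
    return out, labels  # a transformer is then pre-trained to predict `labels`

# 16 windows, 19 electrodes, 256 time steps of synthetic "EEG"
x, y = make_pretext_batch(torch.randn(16, 19, 256))
```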
arXiv Detail & Related papers (2024-09-23T13:26:13Z) - SIRST-5K: Exploring Massive Negatives Synthesis with Self-supervised Learning for Robust Infrared Small Target Detection [53.19618419772467]
Single-frame infrared small target (SIRST) detection aims to recognize small targets from clutter backgrounds.
With the development of Transformer, the scale of SIRST models is constantly increasing.
With a rich diversity of infrared small target data, our algorithm significantly improves the model performance and convergence speed.
arXiv Detail & Related papers (2024-03-08T16:14:54Z) - Collaborative Learning with a Drone Orchestrator [79.75113006257872]
A swarm of intelligent wireless devices train a shared neural network model with the help of a drone.
The proposed framework achieves a significant speedup in training, leading to an average 24% and 87% saving in the drone hovering time.
arXiv Detail & Related papers (2023-03-03T23:46:25Z) - Efficient Large-scale Audio Tagging via Transformer-to-CNN Knowledge Distillation [6.617487928813374]
We propose a training procedure for efficient CNNs based on offline Knowledge Distillation (KD) from high-performing yet complex transformers.
We provide models of different complexity levels, scaling from low-complexity models up to a new state-of-the-art performance of .483 mAP on AudioSet.
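The core of such offline distillation fits in a few lines, as in the hedged sketch below; the temperature, loss weighting, and single-label simplification (AudioSet tagging is in fact multi-label) are assumptions.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: match the transformer teacher's tempered distribution.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)  # ground-truth term
    return alpha * soft + (1 - alpha) * hard

# "Offline" KD: teacher logits are precomputed and stored, so the costly
# transformer never runs inside the CNN student's training loop.
student_logits = torch.randn(4, 527)  # 527 = AudioSet's class count
teacher_logits = torch.randn(4, 527)  # loaded from disk in practice
labels = torch.randint(527, (4,))
loss = kd_loss(student_logits, teacher_logits, labels)
```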
arXiv Detail & Related papers (2022-11-09T09:58:22Z) - Distributed Adversarial Training to Robustify Deep Neural Networks at Scale [100.19539096465101]
Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate classification.
To defend against such attacks, an effective approach known as adversarial training (AT) has been shown to improve model robustness.
We propose a large-batch adversarial training framework implemented over multiple machines.
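In its basic single-machine form, adversarial training pairs an inner PGD maximization with an outer parameter update, as in the sketch below; the paper's actual contribution, distributing this loop over large batches and multiple machines, is not reproduced here.

```python
import torch
import torch.nn as nn

def pgd_attack(model, x, y, eps=0.03, step=0.01, iters=7):
    """Inner maximization: find a perturbation within the eps-ball."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(iters):
        loss = nn.functional.cross_entropy(model(x + delta), y)
        loss.backward()
        with torch.no_grad():
            delta += step * delta.grad.sign()  # ascend the loss
            delta.clamp_(-eps, eps)            # project back to the ball
        delta.grad.zero_()
    return (x + delta).detach()

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.rand(32, 1, 28, 28), torch.randint(10, (32,))

x_adv = pgd_attack(model, x, y)  # inner loop
opt.zero_grad()                  # clear grads accumulated during the attack
nn.functional.cross_entropy(model(x_adv), y).backward()
opt.step()                       # outer minimization on the adversarial batch
```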
arXiv Detail & Related papers (2022-06-13T15:39:43Z) - Self-Supervised Pre-Training for Transformer-Based Person Re-Identification [54.55281692768765]
Transformer-based supervised pre-training achieves great performance in person re-identification (ReID).
Due to the domain gap between ImageNet and ReID datasets, it usually needs a larger pre-training dataset to boost the performance.
This work aims to mitigate the gap between the pre-training and ReID datasets from the perspective of data and model structure.
arXiv Detail & Related papers (2021-11-23T18:59:08Z) - Semantic Perturbations with Normalizing Flows for Improved Generalization [62.998818375912506]
We show that perturbations in the latent space can be used to define fully unsupervised data augmentations.
We find that latent adversarial perturbations that adapt to the classifier throughout its training are the most effective.
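The mechanism can be sketched with a toy invertible affine map standing in for a trained normalizing flow: perturb the latent code in the direction that increases the classifier's loss, then map back to input space. Everything below is an illustrative stand-in, not the paper's implementation.

```python
import torch
import torch.nn as nn

# Toy invertible map; a trained normalizing flow would replace this pair.
scale, shift = 2.0, 0.5
encode = lambda x: (x - shift) / scale  # stand-in for the flow f
decode = lambda z: z * scale + shift    # stand-in for f^-1

clf = nn.Linear(16, 3)  # the classifier being trained
x, y = torch.randn(8, 16), torch.randint(3, (8,))

z = encode(x).detach().requires_grad_(True)
nn.functional.cross_entropy(clf(decode(z)), y).backward()
x_aug = decode(z + 0.1 * z.grad.sign()).detach()  # adversarial latent step
```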
arXiv Detail & Related papers (2021-08-18T03:20:00Z) - Passive Batch Injection Training Technique: Boosting Network Performance by Injecting Mini-Batches from a different Data Distribution [39.8046809855363]
This work presents a novel training technique for deep neural networks that makes use of additional data from a distribution that is different from that of the original input data.
To the best of our knowledge, this is the first work that makes use of a different data distribution to aid the training of convolutional neural networks (CNNs).
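A hedged sketch of the injection idea follows, with an assumed schedule and synthetic stand-ins for the two data distributions.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.01)
ce = nn.CrossEntropyLoss()

def batch(mean):
    """Stand-in loader: shift the mean to fake a different distribution."""
    return torch.randn(16, 3, 32, 32) + mean, torch.randint(10, (16,))

for step in range(100):
    x, y = batch(0.0)      # original training distribution
    if step % 5 == 0:      # assumed schedule: inject every 5th step
        x, y = batch(2.0)  # mini-batch from a different distribution
    opt.zero_grad()
    ce(model(x), y).backward()
    opt.step()
```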
arXiv Detail & Related papers (2020-06-08T08:17:32Z) - Radioactive data: tracing through training [130.2266320167683]
We propose a new technique, "radioactive data", that makes imperceptible changes to this dataset such that any model trained on it will bear an identifiable mark.
Given a trained model, our technique detects the use of radioactive data and provides a level of confidence (p-value).
Our method is robust to data augmentation and to the stochasticity of deep network optimization.
arXiv Detail & Related papers (2020-02-03T18:41:08Z)