Investigation of Time-Frequency Feature Combinations with Histogram Layer Time Delay Neural Networks
- URL: http://arxiv.org/abs/2409.13881v1
- Date: Fri, 20 Sep 2024 20:22:24 GMT
- Title: Investigation of Time-Frequency Feature Combinations with Histogram Layer Time Delay Neural Networks
- Authors: Amirmohammad Mohammadi, Iren'e Masabarakiza, Ethan Barnes, Davelle Carreiro, Alexandra Van Dine, Joshua Peeples,
- Abstract summary: Methods by which audio signals are converted into time-frequency representations can significantly impact performance.
This work demonstrates the performance impact of using different combinations of time-frequency features in a histogram layer time delay neural network.
- Score: 37.95478822883338
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While deep learning has reduced the prevalence of manual feature extraction, transformation of data via feature engineering remains essential for improving model performance, particularly for underwater acoustic signals. The methods by which audio signals are converted into time-frequency representations and the subsequent handling of these spectrograms can significantly impact performance. This work demonstrates the performance impact of using different combinations of time-frequency features in a histogram layer time delay neural network. An optimal set of features is identified with results indicating that specific feature combinations outperform single data features.
Related papers
- Frequency-Aware Deepfake Detection: Improving Generalizability through
Frequency Space Learning [81.98675881423131]
This research addresses the challenge of developing a universal deepfake detector that can effectively identify unseen deepfake images.
Existing frequency-based paradigms have relied on frequency-level artifacts introduced during the up-sampling in GAN pipelines to detect forgeries.
We introduce a novel frequency-aware approach called FreqNet, centered around frequency domain learning, specifically designed to enhance the generalizability of deepfake detectors.
arXiv Detail & Related papers (2024-03-12T01:28:00Z) - Time Scale Network: A Shallow Neural Network For Time Series Data [18.46091267922322]
Time series data is often composed of information at multiple time scales.
Deep learning strategies exist to capture this information, but many make networks larger, require more data, are more demanding to compute, and are difficult to interpret.
We present a minimal, computationally efficient Time Scale Network combining the translation and dilation sequence used in discrete wavelet transforms with traditional convolutional neural networks and back-propagation.
arXiv Detail & Related papers (2023-11-10T16:39:55Z) - A Distance Correlation-Based Approach to Characterize the Effectiveness of Recurrent Neural Networks for Time Series Forecasting [1.9950682531209158]
We provide an approach to link time series characteristics with RNN components via the versatile metric of distance correlation.
We empirically show that the RNN activation layers learn the lag structures of time series well.
We also show that the activation layers cannot adequately model moving average and heteroskedastic time series processes.
arXiv Detail & Related papers (2023-07-28T22:32:08Z) - Histogram Layer Time Delay Neural Networks for Passive Sonar
Classification [58.720142291102135]
A novel method combines a time delay neural network and histogram layer to incorporate statistical contexts for improved feature learning and underwater acoustic target classification.
The proposed method outperforms the baseline model, demonstrating the utility in incorporating statistical contexts for passive sonar target recognition.
arXiv Detail & Related papers (2023-07-25T19:47:26Z) - Time-space-frequency feature Fusion for 3-channel motor imagery
classification [0.0]
This study introduces TSFF-Net, a novel network architecture that integrates time-space-frequency features.
TSFF-Net comprises four main components: time-frequency representation, time-frequency feature extraction, time-space feature extraction, and feature fusion and classification.
Experimental results demonstrate that TSFF-Net not only compensates for the shortcomings of single-mode feature extraction networks in EEG decoding, but also outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2023-04-04T02:01:48Z) - Learnable Wavelet Packet Transform for Data-Adapted Spectrograms [0.0]
We propose a framework for learnable wavelet packet transforms, enabling to learn features automatically from data.
The learned features can be represented as a spectrogram, containing the important time-frequency information of the dataset.
We evaluate the properties and performance of the proposed approach by evaluating its improved spectral leakage and by applying it to an anomaly detection task for acoustic monitoring.
arXiv Detail & Related papers (2022-01-26T17:28:17Z) - Improved Speech Emotion Recognition using Transfer Learning and
Spectrogram Augmentation [56.264157127549446]
Speech emotion recognition (SER) is a challenging task that plays a crucial role in natural human-computer interaction.
One of the main challenges in SER is data scarcity.
We propose a transfer learning strategy combined with spectrogram augmentation.
arXiv Detail & Related papers (2021-08-05T10:39:39Z) - Deep Cellular Recurrent Network for Efficient Analysis of Time-Series
Data with Spatial Information [52.635997570873194]
This work proposes a novel deep cellular recurrent neural network (DCRNN) architecture to process complex multi-dimensional time series data with spatial information.
The proposed architecture achieves state-of-the-art performance while utilizing substantially less trainable parameters when compared to comparable methods in the literature.
arXiv Detail & Related papers (2021-01-12T20:08:18Z) - Benchmarking Deep Learning Interpretability in Time Series Predictions [41.13847656750174]
Saliency methods are used extensively to highlight the importance of input features in model predictions.
We set out to extensively compare the performance of various saliency-based interpretability methods across diverse neural architectures.
arXiv Detail & Related papers (2020-10-26T22:07:53Z) - Multi-Tones' Phase Coding (MTPC) of Interaural Time Difference by
Spiking Neural Network [68.43026108936029]
We propose a pure spiking neural network (SNN) based computational model for precise sound localization in the noisy real-world environment.
We implement this algorithm in a real-time robotic system with a microphone array.
The experiment results show a mean error azimuth of 13 degrees, which surpasses the accuracy of the other biologically plausible neuromorphic approach for sound source localization.
arXiv Detail & Related papers (2020-07-07T08:22:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.