Audio Signal Processing Using Time Domain Mel-Frequency Wavelet Coefficient
- URL: http://arxiv.org/abs/2510.24519v2
- Date: Thu, 30 Oct 2025 15:42:34 GMT
- Title: Audio Signal Processing Using Time Domain Mel-Frequency Wavelet Coefficient
- Authors: Rinku Sebastian, Simon O'Keefe, Martin Trefzer
- Abstract summary: We propose a method to extract Mel-scale features in the time domain, combining it with the concept of the wavelet transform. Combining our proposed Time domain Mel frequency Wavelet Coefficient (TMFWC) technique with the reservoir computing methodology has significantly improved the efficiency of audio signal processing.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Extracting features from speech is the most critical process in speech signal processing. Mel Frequency Cepstral Coefficients (MFCC) are the most widely used features in the majority of speaker and speech recognition applications, as the filtering in this feature is similar to the filtering taking place in the human ear. The main drawback of this feature is that it provides only the frequency information of the signal and does not indicate at what time each frequency is present. The wavelet transform, with its flexible time-frequency window, provides both time and frequency information of the signal and is an appropriate tool for the analysis of non-stationary signals like speech. On the other hand, because of its uniform frequency scaling, a typical wavelet transform may be less effective in analysing speech signals, have poorer frequency resolution at low frequencies, and be less in line with human auditory perception. Hence, it is necessary to develop a feature that incorporates the merits of both MFCC and the wavelet transform. Many studies have tried to combine these two features. Existing wavelet-transform-based Mel-scaled feature extraction methods require more computation, because applying a wavelet transform on top of Mel-scale filtering adds extra processing steps. Here we propose a method to extract Mel-scale features in the time domain, combining it with the concept of the wavelet transform, thus reducing the computational burden of time-frequency conversion and the complexity of wavelet extraction. Combining our proposed Time domain Mel frequency Wavelet Coefficient (TMFWC) technique with the reservoir computing methodology has significantly improved the efficiency of audio signal processing.
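For readers unfamiliar with the Mel scale mentioned in the abstract, the following is a minimal sketch of the commonly used Hz-to-Mel conversion (the 2595 * log10(1 + f/700) form) and of band centre frequencies spaced evenly on that scale. It is not the TMFWC method, whose details are not given on this page; the function names and the choice of 10 bands between 0 and 8000 Hz are illustrative assumptions.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Convert a frequency in Hz to Mels (common 2595*log10 form)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m: float) -> float:
    """Inverse of hz_to_mel: convert Mels back to Hz."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_center_frequencies(f_min: float, f_max: float, n_bands: int) -> list[float]:
    """Return n_bands centre frequencies (Hz) evenly spaced on the Mel scale.

    Hypothetical helper for illustration only; the end points f_min and
    f_max are excluded, as in a typical triangular Mel filterbank layout.
    """
    m_lo, m_hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_hi - m_lo) / (n_bands + 1)
    return [mel_to_hz(m_lo + step * (i + 1)) for i in range(n_bands)]


if __name__ == "__main__":
    # Assumed example range: 0 to 8000 Hz split into 10 bands.
    for fc in mel_center_frequencies(0.0, 8000.0, 10):
        print(f"{fc:8.1f} Hz")
```

Because the spacing is uniform in Mels, the centre frequencies are dense at low frequencies and sparse at high frequencies, which is the auditory-motivated resolution the abstract contrasts with a typical wavelet transform.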
Related papers
- Detection and Classification of Cetacean Echolocation Clicks using Image-based Object Detection Methods applied to Advanced Wavelet-based Transformations [0.0]
A challenge in marine bioacoustic analysis is the detection of animal signals, like calls, whistles and clicks, for behavioral studies. This thesis shows the efficacy of CLICK-SPOT on Norwegian killer whale underwater recordings provided by the cetacean biologist Dr. Vester.
arXiv Detail & Related papers (2026-02-19T15:50:46Z)
- AWEMixer: Adaptive Wavelet-Enhanced Mixer Network for Long-Term Time Series Forecasting [12.450099337354017]
We propose AWEMixer, an Adaptive Wavelet-Enhanced Mixer Network. A Frequency Router is designed to use the global periodicity pattern obtained by the Fast Fourier Transform to adaptively weight localized wavelet subbands. A Coherent Gated Fusion Block achieves selective integration of prominent frequency features with multi-scale temporal representation.
arXiv Detail & Related papers (2025-11-06T11:27:12Z)
- Freqformer: Image-Demoiréing Transformer via Efficient Frequency Decomposition [83.40450475728792]
We present Freqformer, a Transformer-based framework specifically designed for image demoiréing through targeted frequency separation. Our method performs an effective frequency decomposition that explicitly splits moiré patterns into high-frequency spatially-localized textures and low-frequency scale-robust color distortions. Experiments on various demoiréing benchmarks demonstrate that Freqformer achieves state-of-the-art performance with a compact model size.
arXiv Detail & Related papers (2025-05-25T12:23:10Z)
- Quantum Meets SAR: A Novel Range-Doppler Algorithm for Next-Gen Earth Observation [0.0]
This paper presents a Quantum Range Doppler Algorithm (QRDA) to accelerate processing compared to the classical FFT. It introduces a quantum implementation of the Range Cell Migration Correction (RCMC) in the Fourier domain, a critical step in the RDA pipeline. The performance of the quantum RCMC is evaluated and compared against its classical counterpart, demonstrating the potential of quantum computing in advanced SAR imaging.
arXiv Detail & Related papers (2025-04-02T15:40:12Z)
- FLEXtime: Filterbank learning to explain time series [10.706092195673257]
State-of-the-art methods for explaining predictions from time series involve learning an instance-wise saliency mask for each time step. We propose to view time series explainability as saliency maps over interpretable parts, leaning on established signal processing methodology on signal decomposition. Specifically, we propose a new method called FLEXtime that uses a bank of bandpass filters to split the time series into frequency bands.
arXiv Detail & Related papers (2024-11-06T15:06:42Z)
- Wavelet-based Bi-dimensional Aggregation Network for SAR Image Change Detection [53.842568573251214]
Experimental results on three SAR datasets demonstrate that our WBANet significantly outperforms contemporary state-of-the-art methods.
Our WBANet achieves 98.33%, 96.65%, and 96.62% of percentage of correct classification (PCC) on the respective datasets.
arXiv Detail & Related papers (2024-07-18T04:36:10Z)
- FourierMamba: Fourier Learning Integration with State Space Models for Image Deraining [71.46369218331215]
Image deraining aims to remove rain streaks from rainy images and restore clear backgrounds.
We propose a new framework termed FourierMamba, which performs image deraining with Mamba in the Fourier space.
arXiv Detail & Related papers (2024-05-29T18:58:59Z)
- WaveDH: Wavelet Sub-bands Guided ConvNet for Efficient Image Dehazing [20.094839751816806]
We introduce WaveDH, a novel and compact ConvNet designed to address this efficiency gap in image dehazing. Our WaveDH leverages wavelet sub-bands for guided up-and-downsampling and frequency-aware feature refinement. Our method, WaveDH, outperforms many state-of-the-art methods on several image dehazing benchmarks with significantly reduced computational costs.
arXiv Detail & Related papers (2024-04-02T02:52:05Z)
- Frequency-Aware Deepfake Detection: Improving Generalizability through Frequency Space Learning [81.98675881423131]
This research addresses the challenge of developing a universal deepfake detector that can effectively identify unseen deepfake images.
Existing frequency-based paradigms have relied on frequency-level artifacts introduced during the up-sampling in GAN pipelines to detect forgeries.
We introduce a novel frequency-aware approach called FreqNet, centered around frequency domain learning, specifically designed to enhance the generalizability of deepfake detectors.
arXiv Detail & Related papers (2024-03-12T01:28:00Z)
- MultiWave: Multiresolution Deep Architectures through Wavelet Decomposition for Multivariate Time Series Prediction [6.980076213134384]
MultiWave is a novel framework that enhances deep learning time series models by incorporating components that operate at the intrinsic frequencies of signals.
We show that MultiWave consistently identifies critical features and their frequency components, thus providing valuable insights into the applications studied.
arXiv Detail & Related papers (2023-06-16T20:07:15Z)
- Transform Once: Efficient Operator Learning in Frequency Domain [69.74509540521397]
We study deep neural networks designed to harness the structure in frequency domain for efficient learning of long-range correlations in space or time.
This work introduces a blueprint for frequency domain learning through a single transform: transform once (T1).
arXiv Detail & Related papers (2022-11-26T01:56:05Z)
- Multi-Scale Wavelet Transformer for Face Forgery Detection [43.33712402517951]
We propose a multi-scale wavelet transformer framework for face forgery detection.
Frequency-based spatial attention is designed to guide the spatial feature extractor to concentrate more on forgery traces.
Cross-modality attention is proposed to fuse the frequency features with the spatial features.
arXiv Detail & Related papers (2022-10-08T03:39:36Z)
- WavSpA: Wavelet Space Attention for Boosting Transformers' Long Sequence Learning Ability [31.791279777902957]
Recent works show that learning attention in the Fourier space can improve the long sequence learning capability of Transformers.
We argue that the wavelet transform should be a better choice because it captures both position and frequency information with linear time complexity.
We propose Wavelet Space Attention (WavSpA) that facilitates attention learning in a learnable wavelet coefficient space.
arXiv Detail & Related papers (2022-10-05T02:37:59Z)
- Frequency-bin entanglement from domain-engineered down-conversion [101.18253437732933]
We present a single-pass source of discrete frequency-bin entanglement which does not use filtering or a resonant cavity.
We use a domain-engineered nonlinear crystal to generate an eight-mode frequency-bin entangled source at telecommunication wavelengths.
arXiv Detail & Related papers (2022-01-18T19:00:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.