Time-Varying Quasi-Closed-Phase Analysis for Accurate Formant Tracking
in Speech Signals
- URL: http://arxiv.org/abs/2308.16540v1
- Date: Thu, 31 Aug 2023 08:30:20 GMT
- Authors: Dhananjaya Gowda, Sudarsana Reddy Kadiri, Brad Story, Paavo Alku
- Abstract summary: We propose a new method for the accurate estimation and tracking of formants in speech signals.
TVQCP analysis combines three approaches to improve formant estimation and tracking.
The proposed TVQCP method performs better than conventional and popular formant tracking tools.
- Score: 17.69029813982043
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we propose a new method for the accurate estimation and
tracking of formants in speech signals using time-varying quasi-closed-phase
(TVQCP) analysis. Conventional formant tracking methods typically adopt a
two-stage estimate-and-track strategy wherein an initial set of formant
candidates is estimated using short-time analysis (e.g., 10–50 ms), followed
by a tracking stage based on dynamic programming or a linear state-space model.
One of the main disadvantages of these approaches is that the tracking stage,
however good it may be, cannot improve upon the formant estimation accuracy of
the first stage. The proposed TVQCP method provides single-stage formant
tracking that combines the estimation and tracking stages into one. TVQCP
analysis combines three approaches to improve formant estimation and tracking:
(1) it uses temporally weighted quasi-closed-phase analysis to derive
closed-phase estimates of the vocal tract with reduced interference from the
excitation source, (2) it increases the residual sparsity by using $L_1$
optimization, and (3) it uses time-varying linear prediction analysis over long
time windows (e.g., 100–200 ms) to impose a continuity constraint on the vocal
tract model and hence on the formant trajectories. Formant tracking experiments
with a wide variety of synthetic and natural speech signals show that the
proposed TVQCP method performs better than conventional and popular formant
tracking tools, such as Wavesurfer and Praat (based on dynamic programming),
the KARMA algorithm (based on Kalman filtering), and DeepFormants (based on
deep neural networks trained in a supervised manner). Matlab scripts for the
proposed method can be found at: https://github.com/njaygowda/ftrack
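To make item (3) and the $L_1$ idea in the abstract concrete, here is a minimal Python sketch. It is not the authors' Matlab implementation. It assumes each predictor coefficient varies as a low-order polynomial in normalized time, fits the polynomial weights by weighted least squares, and optionally approximates the $L_1$ criterion with iteratively reweighted least squares (a swapped-in solver chosen for brevity). The function names, the polynomial basis, and the uniform default weights are all assumptions made for illustration; the quasi-closed-phase weighting, which depends on glottal closure instants, is not implemented and is represented only by the hypothetical `weights` argument.

```python
import numpy as np

def tvlp_coeffs(x, order=2, basis_order=1, weights=None, n_irls=0, eps=1e-6):
    """Fit x[n] ~ sum_k a_k[n] * x[n-k], with a_k[n] = sum_i B[k-1, i] * t[n]**i.

    Returns an array of shape (order, len(x)) holding a_k[n] for every sample.
    `weights` is a hypothetical per-sample weight (uniform if None).
    `n_irls` > 0 runs that many reweighting passes to approximate an L1 fit.
    """
    x = np.asarray(x, dtype=float)
    N = len(x)
    t_all = np.arange(N) / (N - 1)            # normalized time in [0, 1]
    n = np.arange(order, N)                   # samples that have a full history
    # One column per (lag k, polynomial power i): x[n-k] * t[n]**i
    A = np.stack([x[n - k] * t_all[n] ** i
                  for k in range(1, order + 1)
                  for i in range(basis_order + 1)], axis=1)
    y = x[n]
    w0 = np.ones(len(n)) if weights is None else np.asarray(weights, dtype=float)[n]

    def solve(w):
        # Weighted least squares: scale rows by sqrt(w), then ordinary lstsq.
        sw = np.sqrt(w)
        b, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
        return b

    b = solve(w0)
    for _ in range(n_irls):
        # Weights w0 / |r| turn the squared error into approximately w0 * |r|.
        r = np.abs(y - A @ b)
        b = solve(w0 / np.maximum(r, eps))

    B = b.reshape(order, basis_order + 1)
    T = np.stack([t_all ** i for i in range(basis_order + 1)])
    return B @ T

def formants_from_lp(a, fs):
    """Resonance frequencies (Hz) of the all-pole model at one time instant.

    `a` holds predictor coefficients a_1..a_p, so the poles are the roots of
    z**p - a_1 * z**(p-1) - ... - a_p. One root per conjugate pair is kept.
    """
    poly = np.concatenate(([1.0], -np.asarray(a, dtype=float)))
    roots = np.roots(poly)
    roots = roots[roots.imag > 0]
    return np.sort(np.angle(roots) * fs / (2 * np.pi))

def synthesize(a, excitation):
    """Time-varying all-pole filter: x[n] = sum_k a[k-1, n] * x[n-k] + e[n]."""
    order, N = a.shape
    x = np.zeros(N)
    for n in range(N):
        acc = excitation[n]
        for k in range(1, min(order, n) + 1):
            acc += a[k - 1, n] * x[n - k]
        x[n] = acc
    return x
```

Because the coefficients are tied to a smooth basis over the whole window, the formant tracks read off with `formants_from_lp(a_hat[:, n], fs)` vary smoothly from sample to sample by construction, which is the continuity constraint the abstract refers to. In practice the model order, basis order, and window length would need to be chosen for speech (the values above are toy defaults).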
Related papers
- Dense Optical Tracking: Connecting the Dots [82.79642869586587]
DOT is a novel, simple and efficient method for solving the problem of point tracking in a video.
We show that DOT is significantly more accurate than current optical flow techniques, outperforms sophisticated "universal trackers" like OmniMotion, and is on par with, or better than, the best point tracking algorithms like CoTracker.
(2023-12-01T18:59:59Z)
- Diffusion Generative Flow Samplers: Improving learning signals through partial trajectory optimization [87.21285093582446]
Diffusion Generative Flow Samplers (DGFS) is a sampling-based framework where the learning process can be tractably broken down into short partial trajectory segments.
Our method takes inspiration from the theory developed for generative flow networks (GFlowNets).
(2023-10-04T09:39:05Z)
- Refining a Deep Learning-based Formant Tracker using Linear Prediction Methods [19.88212227822267]
Two refined DeepFormants trackers were compared with the original DeepFormants and with five known traditional trackers.
The results indicated that the data-driven DeepFormants trackers outperformed the conventional trackers and that the best performance was obtained by refining the formants predicted by DeepFormants using QCP-FB analysis.
(2023-08-17T15:32:32Z)
- Formant Tracking Using Quasi-Closed Phase Forward-Backward Linear Prediction Analysis and Deep Neural Networks [48.98397553726019]
Formant tracking is investigated by using trackers based on dynamic programming (DP) and deep neural nets (DNNs).
The six methods include linear prediction (LP) algorithms, weighted LP algorithms and the recently developed quasi-closed phase forward-backward (QCP-FB) method.
A novel formant tracking approach, which combines benefits of deep learning and signal processing based on QCP-FB, was proposed.
(2022-01-05T10:27:07Z)
- SoundDet: Polyphonic Sound Event Detection and Localization from Raw Waveform [48.68714598985078]
SoundDet is an end-to-end trainable and light-weight framework for polyphonic moving sound event detection and localization.
SoundDet directly consumes the raw, multichannel waveform and treats the temporal sound event as a complete "sound-object" to be detected.
A dense sound proposal event map is then constructed to handle the challenges of predicting events with large varying temporal duration.
(2021-06-13T11:43:41Z)
- Uncertainty-Aware Signal Temporal logic [21.626420725274208]
Existing temporal logic inference methods mostly neglect uncertainties in the data.
We propose two uncertainty-aware signal temporal logic (STL) inference approaches to classify the undesired behaviors and desired behaviors of a system.
(2021-05-24T21:26:57Z)
- On projection methods for functional time series forecasting [0.0]
Two nonparametric methods are presented for forecasting functional time series (FTS).
We address both one-step-ahead forecasting and dynamic updating.
The methods are applied to simulated data, daily electricity demand, and NOx emissions.
(2021-05-10T14:24:38Z)
- FlowMOT: 3D Multi-Object Tracking by Scene Flow Association [9.480272707157747]
We propose a LiDAR-based 3D MOT framework named FlowMOT, which integrates point-wise motion information with the traditional matching algorithm.
Our approach outperforms recent end-to-end methods and achieves competitive performance with the state-of-the-art filter-based method.
(2020-12-14T14:03:48Z)
- Deep Shells: Unsupervised Shape Correspondence with Optimal Transport [52.646396621449]
We propose a novel unsupervised learning approach to 3D shape correspondence.
We show that the proposed method significantly improves over the state-of-the-art on multiple datasets.
(2020-10-28T22:24:07Z)
- Learning to Optimize Non-Rigid Tracking [54.94145312763044]
We employ learnable optimizations to improve robustness and speed up solver convergence.
First, we upgrade the tracking objective by integrating an alignment data term on deep features which are learned end-to-end through a CNN.
Second, we bridge the gap between the preconditioning technique and learning method by introducing a ConditionNet which is trained to generate a preconditioner.
(2020-03-27T04:40:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.