Time-Varying Quasi-Closed-Phase Analysis for Accurate Formant Tracking
in Speech Signals
- URL: http://arxiv.org/abs/2308.16540v1
- Date: Thu, 31 Aug 2023 08:30:20 GMT
- Title: Time-Varying Quasi-Closed-Phase Analysis for Accurate Formant Tracking
in Speech Signals
- Authors: Dhananjaya Gowda, Sudarsana Reddy Kadiri, Brad Story, Paavo Alku
- Abstract summary: We propose a new method for the accurate estimation and tracking of formants in speech signals.
TVQCP analysis combines three approaches to improve formant estimation and tracking.
The proposed TVQCP method performs better than conventional and popular formant tracking tools.
- Score: 17.69029813982043
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we propose a new method for the accurate estimation and
tracking of formants in speech signals using time-varying quasi-closed-phase
(TVQCP) analysis. Conventional formant tracking methods typically adopt a
two-stage estimate-and-track strategy wherein an initial set of formant
candidates is estimated using short-time analysis (e.g., 10--50 ms), followed
by a tracking stage based on dynamic programming or a linear state-space model.
One of the main disadvantages of these approaches is that the tracking stage,
however good it may be, cannot improve upon the formant estimation accuracy of
the first stage. The proposed TVQCP method provides a single-stage formant
tracking that combines the estimation and tracking stages into one. TVQCP
analysis combines three approaches to improve formant estimation and tracking:
(1) it uses temporally weighted quasi-closed-phase analysis to derive
closed-phase estimates of the vocal tract with reduced interference from the
excitation source, (2) it increases the residual sparsity by using $L_1$
optimization, and (3) it uses time-varying linear prediction analysis over long
time windows (e.g., 100--200 ms) to impose a continuity constraint on the vocal
tract model and hence on the formant trajectories. Formant tracking experiments
with a wide variety of synthetic and natural speech signals show that the
proposed TVQCP method performs better than conventional and popular formant
tracking tools, such as Wavesurfer and Praat (based on dynamic programming),
the KARMA algorithm (based on Kalman filtering), and DeepFormants (based on
deep neural networks trained in a supervised manner). Matlab scripts for the
proposed method can be found at: https://github.com/njaygowda/ftrack
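The third ingredient above, time-varying linear prediction over a long window, can be illustrated with a minimal Python sketch. This is not the authors' Matlab implementation: the polynomial basis, the function names `tvlp_coeffs` and `formant_track`, and the use of iteratively reweighted least squares (IRLS) as a stand-in for $L_1$ optimization are all assumptions made for illustration. The quasi-closed-phase weighting function is not reproduced; the optional `weights` argument only marks where such a temporal weight would enter.

```python
import numpy as np

def tvlp_coeffs(x, order=2, n_basis=3, weights=None, n_irls=0, eps=1e-4):
    """Sketch of time-varying LP: x[n] ~ sum_k a_k(n) x[n-k], where each
    a_k(n) = sum_i B[k-1, i] * (n/N)**i (polynomial basis, an assumption).
    n_irls=0 gives a weighted L2 fit; n_irls>0 approximates an L1 fit via IRLS."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    n = np.arange(order, N)
    t = n / N  # normalised time in [0, 1)
    basis = np.stack([t**i for i in range(n_basis)], axis=1)          # (M, n_basis)
    past = np.stack([x[n - k] for k in range(1, order + 1)], axis=1)  # (M, order)
    # Each regressor is a past sample multiplied by one basis function.
    A = (past[:, :, None] * basis[:, None, :]).reshape(len(n), order * n_basis)
    y = x[n]
    # Hypothetical temporal weights (e.g. a QCP-style weight) would be passed here.
    w = np.ones(len(n)) if weights is None else np.asarray(weights, dtype=float)[n]
    w_eff = w.copy()
    for _ in range(n_irls + 1):
        sw = np.sqrt(w_eff)
        b, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
        resid = y - A @ b
        # IRLS reweighting: sum(w * r^2 / |r|) approximates sum(w * |r|).
        w_eff = w / np.maximum(np.abs(resid), eps)
    return b.reshape(order, n_basis)

def formant_track(B, n_samples, fs, times):
    """Evaluate the time-varying predictor at the given sample indices and
    convert its upper-half-plane roots to frequencies in Hz."""
    order, n_basis = B.shape
    track = []
    for n in times:
        a = B @ ((n / n_samples) ** np.arange(n_basis))
        roots = np.roots(np.concatenate(([1.0], -a)))
        roots = roots[roots.imag > 0]
        track.append(np.sort(np.angle(roots) * fs / (2 * np.pi)))
    return track

# Usage: a single resonance gliding from 500 Hz to 1500 Hz over 200 ms.
fs, N = 8000, 1600
f_inst = 500 + 1000 * np.arange(N) / N
x = np.cos(2 * np.pi * np.cumsum(f_inst) / fs)
B = tvlp_coeffs(x, order=2, n_basis=3, n_irls=3)
print(formant_track(B, N, fs, [N // 4, N // 2, 3 * N // 4]))
```

Because the predictor coefficients are smooth functions of time across the whole window, the resulting frequency track is continuous by construction, which is the continuity constraint the abstract refers to. A real speech signal would need a higher prediction order and a properly designed weighting function.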
Related papers
- Inference-Time Alignment in Diffusion Models with Reward-Guided Generation: Tutorial and Review [59.856222854472605]
This tutorial provides an in-depth guide on inference-time guidance and alignment methods for optimizing downstream reward functions in diffusion models.
Practical applications in fields such as biology often require sample generation that maximizes specific metrics.
We discuss (1) fine-tuning methods combined with inference-time techniques, (2) inference-time algorithms based on search algorithms such as Monte Carlo tree search, and (3) connections between inference-time algorithms in language models and diffusion models.
arXiv Detail & Related papers (2025-01-16T17:37:35Z)
- ProTracker: Probabilistic Integration for Robust and Accurate Point Tracking [41.889032460337226]
ProTracker is a novel framework for robust and accurate long-term dense tracking of arbitrary points in videos.
Our code and model will be publicly available upon publication.
arXiv Detail & Related papers (2025-01-06T18:55:52Z)
- Dense Optical Tracking: Connecting the Dots [82.79642869586587]
DOT is a novel, simple and efficient method for solving the problem of point tracking in a video.
We show that DOT is significantly more accurate than current optical flow techniques, outperforms sophisticated "universal trackers" like OmniMotion, and is on par with, or better than, the best point tracking algorithms like CoTracker.
arXiv Detail & Related papers (2023-12-01T18:59:59Z)
- Diffusion Generative Flow Samplers: Improving learning signals through partial trajectory optimization [87.21285093582446]
Diffusion Generative Flow Samplers (DGFS) is a sampling-based framework where the learning process can be tractably broken down into short partial trajectory segments.
Our method takes inspiration from the theory developed for generative flow networks (GFlowNets).
arXiv Detail & Related papers (2023-10-04T09:39:05Z)
- Refining a Deep Learning-based Formant Tracker using Linear Prediction Methods [19.88212227822267]
Two refined DeepFormants trackers were compared with the original DeepFormants and with five known traditional trackers.
The results indicated that the data-driven DeepFormants trackers outperformed the conventional trackers and that the best performance was obtained by refining the formants predicted by DeepFormants using QCP-FB analysis.
arXiv Detail & Related papers (2023-08-17T15:32:32Z)
- Formant Tracking Using Quasi-Closed Phase Forward-Backward Linear Prediction Analysis and Deep Neural Networks [48.98397553726019]
Formant tracking is investigated by using trackers based on dynamic programming (DP) and deep neural nets (DNNs).
The six methods include linear prediction (LP) algorithms, weighted LP algorithms and the recently developed quasi-closed phase forward-backward (QCP-FB) method.
A novel formant tracking approach, which combines benefits of deep learning and signal processing based on QCP-FB, was proposed.
arXiv Detail & Related papers (2022-01-05T10:27:07Z)
- Uncertainty-Aware Signal Temporal logic [21.626420725274208]
Existing temporal logic inference methods mostly neglect uncertainties in the data.
We propose two uncertainty-aware signal temporal logic (STL) inference approaches to classify the undesired behaviors and desired behaviors of a system.
arXiv Detail & Related papers (2021-05-24T21:26:57Z)
- On projection methods for functional time series forecasting [0.0]
Two nonparametric methods are presented for forecasting functional time series (FTS).
We address both one-step-ahead forecasting and dynamic updating.
The methods are applied to simulated data, daily electricity demand, and NOx emissions.
arXiv Detail & Related papers (2021-05-10T14:24:38Z)
- Deep Shells: Unsupervised Shape Correspondence with Optimal Transport [52.646396621449]
We propose a novel unsupervised learning approach to 3D shape correspondence.
We show that the proposed method significantly improves over the state-of-the-art on multiple datasets.
arXiv Detail & Related papers (2020-10-28T22:24:07Z)
- Learning to Optimize Non-Rigid Tracking [54.94145312763044]
We employ learnable optimizations to improve robustness and speed up solver convergence.
First, we upgrade the tracking objective by integrating an alignment data term on deep features which are learned end-to-end through CNN.
Second, we bridge the gap between the preconditioning technique and learning method by introducing a ConditionNet which is trained to generate a preconditioner.
arXiv Detail & Related papers (2020-03-27T04:40:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.