Refining a Deep Learning-based Formant Tracker using Linear Prediction Methods
- URL: http://arxiv.org/abs/2308.09051v1
- Date: Thu, 17 Aug 2023 15:32:32 GMT
- Title: Refining a Deep Learning-based Formant Tracker using Linear Prediction Methods
- Authors: Paavo Alku, Sudarsana Reddy Kadiri, Dhananjaya Gowda
- Abstract summary: Two refined DeepFormants trackers were compared with the original DeepFormants and with five known traditional trackers.
The results indicated that the data-driven DeepFormants trackers outperformed the conventional trackers and that the best performance was obtained by refining the formants predicted by DeepFormants using QCP-FB analysis.
- Score: 19.88212227822267
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this study, formant tracking is investigated by refining the formants
tracked by an existing data-driven tracker, DeepFormants, using the formants
estimated in a model-driven manner by linear prediction (LP)-based methods. As
LP-based formant estimation methods, conventional covariance analysis (LP-COV)
and the recently proposed quasi-closed phase forward-backward (QCP-FB) analysis
are used. In the proposed refinement approach, the contours of the three lowest
formants are first predicted by the data-driven DeepFormants tracker, and the
predicted formants are replaced frame-wise with local spectral peaks shown by
the model-driven LP-based methods. The refinement procedure can be plugged into
the DeepFormants tracker with no need for any new data learning. Two refined
DeepFormants trackers were compared with the original DeepFormants and with
five known traditional trackers using the popular vocal tract resonance (VTR)
corpus. The results indicated that the data-driven DeepFormants trackers
outperformed the conventional trackers and that the best performance was
obtained by refining the formants predicted by DeepFormants using QCP-FB
analysis. In addition, by tracking formants using VTR speech that was corrupted
by additive noise, the study showed that the refined DeepFormants trackers were
more resilient to noise than the reference trackers. In general, these results
suggest that LP-based model-driven approaches, which have traditionally been
used in formant estimation, can be combined with a modern data-driven tracker
easily with no further training to improve the tracker's performance.
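The frame-wise refinement described in the abstract can be illustrated with a minimal sketch. It assumes the data-driven tracker has already produced three formant estimates per frame, and it replaces each one with the nearest candidate frequency obtained from an LP model of the same frame. Note the assumptions: the LP step below uses the plain autocorrelation method as a stand-in (it is not LP-COV or QCP-FB), candidates are taken from LP polynomial root angles, and the function names `lp_coefficients`, `lp_peak_frequencies`, and `refine_frame` are hypothetical and not taken from the paper or from DeepFormants.

```python
import numpy as np

def lp_coefficients(frame, order):
    """Autocorrelation-method LP; a stand-in for LP-COV / QCP-FB (assumption)."""
    w = frame * np.hamming(len(frame))
    # Autocorrelation at lags 0..order.
    r = np.correlate(w, w, mode="full")[len(w) - 1 : len(w) + order]
    # Solve the Toeplitz normal equations R a = r for the predictor coefficients.
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1 : order + 1])
    return np.concatenate(([1.0], -a))  # A(z) = 1 - sum_k a_k z^-k

def lp_peak_frequencies(frame, fs, order=10):
    """Candidate formant frequencies (Hz) from the angles of the LP roots."""
    roots = np.roots(lp_coefficients(frame, order))
    roots = roots[np.imag(roots) > 0]  # keep one root of each conjugate pair
    return np.sort(np.angle(roots) * fs / (2 * np.pi))

def refine_frame(predicted, candidates):
    """Replace each predicted formant with the nearest LP candidate."""
    predicted = np.asarray(predicted, dtype=float)
    candidates = np.asarray(candidates, dtype=float)
    if candidates.size == 0:
        return predicted  # no candidates available; keep the prediction
    nearest = np.abs(candidates[None, :] - predicted[:, None]).argmin(axis=1)
    return candidates[nearest]

# Hypothetical usage with made-up numbers for one frame:
refined = refine_frame([500.0, 1500.0, 2500.0], [480.0, 1600.0, 2450.0, 3500.0])
```

Because the refinement only post-processes the tracker's output, a step of this kind needs no retraining of the data-driven model, which matches the plug-in property claimed in the abstract.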
Related papers
- Standing on the Shoulders of Giants: Reprogramming Visual-Language Model for General Deepfake Detection [16.21235742118949]
We propose a novel approach that repurposes well-trained Vision-Language Models (VLMs) for general deepfake detection.
Motivated by the model reprogramming paradigm, which manipulates model predictions via data perturbations, our method can reprogram a pretrained VLM.
It achieves superior performance with fewer trainable parameters, making it a promising approach for real-world applications.
arXiv Detail & Related papers (2024-09-04T12:46:30Z)
- Time-Varying Quasi-Closed-Phase Analysis for Accurate Formant Tracking in Speech Signals [17.69029813982043]
We propose a new method for the accurate estimation and tracking of formants in speech signals.
TVQCP analysis combines three approaches to improve formant estimation and tracking.
The proposed TVQCP method performs better than conventional and popular formant tracking tools.
arXiv Detail & Related papers (2023-08-31T08:30:20Z)
- Convolutional Neural Networks for the classification of glitches in gravitational-wave data streams [52.77024349608834]
We classify transient noise signals (i.e., glitches) and gravitational waves in data from the Advanced LIGO detectors.
We use models with a supervised learning approach, both trained from scratch using the Gravity Spy dataset.
We also explore a self-supervised approach, pre-training models with automatically generated pseudo-labels.
arXiv Detail & Related papers (2023-03-24T11:12:37Z)
- Site-specific Deep Learning Path Loss Models based on the Method of Moments [7.894490919875104]
This paper describes deep learning models applied to the problem of predicting EM wave propagation over rural terrain.
A surface integral equation formulation is used to generate synthetic training data, which comprises path loss computed over randomly generated 1D terrain profiles.
The models show excellent agreement when applied to test profiles generated using the same statistical process used to create the training data, and very good accuracy when applied to real-life problems.
arXiv Detail & Related papers (2023-02-02T12:29:38Z)
- A Provably Efficient Model-Free Posterior Sampling Method for Episodic Reinforcement Learning [50.910152564914405]
Existing posterior sampling methods for reinforcement learning are either model-based or lack worst-case theoretical guarantees beyond linear MDPs.
This paper proposes a new model-free formulation of posterior sampling that applies to more general episodic reinforcement learning problems with theoretical guarantees.
arXiv Detail & Related papers (2022-08-23T12:21:01Z)
- Transforming Model Prediction for Tracking [109.08417327309937]
Transformers capture global relations with little inductive bias, allowing them to learn the prediction of more powerful target models.
We train the proposed tracker end-to-end and validate its performance by conducting comprehensive experiments on multiple tracking datasets.
Our tracker sets a new state of the art on three benchmarks, achieving an AUC of 68.5% on the challenging LaSOT dataset.
arXiv Detail & Related papers (2022-03-21T17:59:40Z)
- Formant Tracking Using Quasi-Closed Phase Forward-Backward Linear Prediction Analysis and Deep Neural Networks [48.98397553726019]
Formant tracking is investigated by using trackers based on dynamic programming (DP) and deep neural nets (DNNs).
The six methods include linear prediction (LP) algorithms, weighted LP algorithms and the recently developed quasi-closed phase forward-backward (QCP-FB) method.
A novel formant tracking approach, which combines benefits of deep learning and signal processing based on QCP-FB, was proposed.
arXiv Detail & Related papers (2022-01-05T10:27:07Z)
- Imputation-Free Learning from Incomplete Observations [73.15386629370111]
We introduce the importance-guided stochastic gradient descent (IGSGD) method to train models to perform inference from inputs containing missing values without imputation.
We employ reinforcement learning (RL) to adjust the gradients used to train the models via back-propagation.
Our imputation-free predictions outperform the traditional two-step imputation-based predictions using state-of-the-art imputation methods.
arXiv Detail & Related papers (2021-07-05T12:44:39Z)
- Multi-object Tracking via End-to-end Tracklet Searching and Ranking [11.46601533985954]
We propose a novel method for optimizing tracklet consistency by introducing an online, end-to-end tracklet search training process.
With a sequence model as the appearance encoder of tracklets, our tracker achieves a remarkable performance gain over the conventional tracklet association baseline.
Our methods have also achieved state-of-the-art results on the MOT15~17 challenge benchmarks using public detection and online settings.
arXiv Detail & Related papers (2020-03-04T18:46:01Z)
- Uncertainty Estimation Using a Single Deep Deterministic Neural Network [66.26231423824089]
We propose a method for training a deterministic deep model that can find and reject out-of-distribution data points at test time with a single forward pass.
We scale training in these with a novel loss function and centroid updating scheme and match the accuracy of softmax models.
arXiv Detail & Related papers (2020-03-04T12:27:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.