Complex Cepstrum-based Decomposition of Speech for Glottal Source
Estimation
- URL: http://arxiv.org/abs/1912.12602v1
- Date: Sun, 29 Dec 2019 07:58:18 GMT
- Title: Complex Cepstrum-based Decomposition of Speech for Glottal Source
Estimation
- Authors: Thomas Drugman, Baris Bozkurt, Thierry Dutoit
- Abstract summary: We show that complex cepstrum can be effectively used for glottal flow estimation.
Based on exactly the same principles presented for ZZT decomposition, windowing should be applied.
- Score: 11.481208551940998
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Homomorphic analysis is a well-known method for the separation of
non-linearly combined signals. More particularly, the use of complex cepstrum
for source-tract deconvolution has been discussed in various articles. However,
there exists no study which proposes a glottal flow estimation methodology
based on cepstrum and reports effective results. In this paper, we show that
complex cepstrum can be effectively used for glottal flow estimation by
separating the causal and anticausal components of a windowed speech signal as
done by the Zeros of the Z-Transform (ZZT) decomposition. Based on exactly the
same principles presented for ZZT decomposition, windowing should be applied
such that the windowed speech signals exhibit mixed-phase characteristics which
conform to the speech production model, in which the anticausal component is mainly due
to the glottal flow open phase. The advantage of the complex cepstrum-based
approach compared to the ZZT decomposition is its much higher speed.
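The causal/anticausal separation described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `complex_cepstrum` and `causal_anticausal_decomposition` and the FFT size are hypothetical choices made for illustration. The sketch assumes the frame has already been windowed appropriately by the caller, that its spectrum has no exact zeros on the unit circle, and that its DC value is positive. Negative quefrencies of the complex cepstrum are attributed to the anticausal (maximum-phase) component and positive quefrencies to the causal (minimum-phase) component.

```python
import numpy as np


def complex_cepstrum(x, nfft=1024):
    """Complex cepstrum of a frame (sketch; assumes a spectrum without zeros)."""
    spectrum = np.fft.fft(x, nfft)
    log_magnitude = np.log(np.abs(spectrum))
    phase = np.unwrap(np.angle(spectrum))
    # Remove the linear-phase term so the unwrapped phase is periodic.
    # As a consequence, components are recovered only up to a time shift.
    half = nfft // 2
    delay = int(np.round(phase[half] / np.pi))
    phase = phase - np.pi * delay * np.arange(nfft) / half
    return np.fft.ifft(log_magnitude + 1j * phase).real


def causal_anticausal_decomposition(x, nfft=1024):
    """Split a frame into anticausal and causal parts in the cepstral domain."""
    ceps = complex_cepstrum(x, nfft)
    half = nfft // 2
    anticausal_ceps = np.zeros(nfft)
    causal_ceps = np.zeros(nfft)
    # Negative quefrencies are stored in the upper half of the FFT buffer.
    anticausal_ceps[half + 1:] = ceps[half + 1:]
    causal_ceps[1:half] = ceps[1:half]
    # Assumption: the zero-quefrency term (log gain) is shared equally.
    anticausal_ceps[0] = causal_ceps[0] = 0.5 * ceps[0]
    # Invert each cepstrum: FFT -> exp -> inverse FFT.
    anticausal = np.fft.ifft(np.exp(np.fft.fft(anticausal_ceps))).real
    causal = np.fft.ifft(np.exp(np.fft.fft(causal_ceps))).real
    return anticausal, causal


# Toy mixed-phase signal: a minimum-phase factor convolved with a
# maximum-phase factor (zero at z = -2, outside the unit circle).
x = np.convolve([1.0, 0.5], [0.5, 1.0])
anticausal, causal = causal_anticausal_decomposition(x)
```

For the toy signal, the causal output holds the minimum-phase factor at the start of the buffer, while the anticausal output holds its energy at index 0 and at the end of the buffer, which corresponds to negative time indices. Real speech frames would additionally need the windowing discussed in the abstract, which this sketch leaves to the caller.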
Related papers
- An approach to robust ICP initialization [77.45039118761837]
We propose an approach to initialize the Iterative Closest Point (ICP) algorithm to match unlabelled point clouds related by rigid transformations.
We derive bounds on the robustness of our approach to noise and numerical experiments confirm our theoretical findings.
arXiv Detail & Related papers (2022-12-10T16:27:25Z)
- Discretization and Re-synthesis: an alternative method to solve the
Cocktail Party Problem [65.25725367771075]
This study demonstrates, for the first time, that the synthesis-based approach can also perform well on this problem.
Specifically, we propose a novel speech separation/enhancement model based on the recognition of discrete symbols.
By utilizing the synthesis model with the input of discrete symbols, after the prediction of discrete symbol sequence, each target speech could be re-synthesized.
arXiv Detail & Related papers (2021-12-17T08:35:40Z)
- Lattice partition recovery with dyadic CART [79.96359947166592]
We study piece-wise constant signals corrupted by additive Gaussian noise over a $d$-dimensional lattice.
Data of this form naturally arise in a host of applications, and the tasks of signal detection or testing, de-noising and estimation have been studied extensively in the statistical and signal processing literature.
In this paper we consider instead the problem of partition recovery, i.e., of estimating the partition of the lattice induced by the constancy regions of the unknown signal.
We prove that a DCART-based procedure consistently estimates the underlying partition at a rate of order $sigma2 k*
arXiv Detail & Related papers (2021-05-27T23:41:01Z)
- Glottal source estimation robustness: A comparison of sensitivity of
voice source estimation techniques [11.97036509133719]
This paper addresses the problem of estimating the voice source directly from speech waveforms.
A novel principle based on Anticausality Dominated Regions (ACDR) is used to estimate the glottal open phase.
arXiv Detail & Related papers (2020-05-24T08:13:47Z)
- Chirp Complex Cepstrum-based Decomposition for Asynchronous Glottal
Analysis [13.563526970105988]
This paper proposes an extension of the complex cepstrum-based decomposition by incorporating a chirp analysis.
The resulting method is shown to give a reliable estimation of the glottal flow wherever the window is located.
arXiv Detail & Related papers (2020-05-10T17:33:48Z)
- Residual-driven Fuzzy C-Means Clustering for Image Segmentation [152.609322951917]
We elaborate on residual-driven Fuzzy C-Means (FCM) for image segmentation.
Built on this framework, we present a weighted $\ell_2$-norm fidelity term by weighting mixed noise distribution.
The results demonstrate the superior effectiveness and efficiency of the proposed algorithm over existing FCM-related algorithms.
arXiv Detail & Related papers (2020-04-15T15:46:09Z)
- Simultaneous Denoising and Dereverberation Using Deep Embedding Features [64.58693911070228]
We propose a joint training method for simultaneous speech denoising and dereverberation using deep embedding features.
At the denoising stage, the DC network is leveraged to extract noise-free deep embedding features.
At the dereverberation stage, instead of using the unsupervised K-means clustering algorithm, another neural network is utilized to estimate the anechoic speech.
arXiv Detail & Related papers (2020-04-06T06:34:01Z)
- Residual-Sparse Fuzzy $C$-Means Clustering Incorporating Morphological
Reconstruction and Wavelet frames [146.63177174491082]
Fuzzy $C$-Means (FCM) algorithm incorporates a morphological reconstruction operation and a tight wavelet frame transform.
We present an improved FCM algorithm by imposing an $\ell_0$ regularization term on the residual between the feature set and its ideal value.
Experimental results reported for synthetic, medical, and color images show that the proposed algorithm is effective and efficient, and outperforms other algorithms.
arXiv Detail & Related papers (2020-02-14T10:00:03Z)
- Theory inspired deep network for instantaneous-frequency extraction and
signal components recovery from discrete blind-source data [1.6758573326215689]
This paper is concerned with the inverse problem of recovering the unknown signal components, along with extraction of their frequencies.
None of the existing decomposition methods and algorithms is capable of solving this inverse problem.
We propose a synthesis of a deep neural network, based directly on a discrete sample set, that may be non-uniformly sampled, of the blind-source signal.
arXiv Detail & Related papers (2020-01-31T18:54:00Z)
- Causal-Anticausal Decomposition of Speech using Complex Cepstrum for
Glottal Source Estimation [11.481208551940998]
We show that complex cepstrum causal-anticausal decomposition can be effectively used for glottal flow estimation.
The proposed method has the potential to be used for voice quality analysis.
arXiv Detail & Related papers (2019-12-30T08:12:03Z)
- A Deterministic plus Stochastic Model of the Residual Signal for
Improved Parametric Speech Synthesis [11.481208551940998]
We propose an adaptation of the Deterministic plus Stochastic Model (DSM) for the residual.
The proposed residual model is integrated within a HMM-based speech synthesizer.
Results show a significant improvement for both male and female voices.
arXiv Detail & Related papers (2019-12-29T07:26:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.