Chirp Complex Cepstrum-based Decomposition for Asynchronous Glottal
Analysis
- URL: http://arxiv.org/abs/2005.04724v1
- Date: Sun, 10 May 2020 17:33:48 GMT
- Title: Chirp Complex Cepstrum-based Decomposition for Asynchronous Glottal
Analysis
- Authors: Thomas Drugman, Thierry Dutoit
- Abstract summary: This paper proposes an extension of the complex cepstrum-based decomposition by incorporating a chirp analysis.
The resulting method is shown to give a reliable estimation of the glottal flow wherever the window is located.
- Score: 13.563526970105988
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It was recently shown that complex cepstrum can be effectively used for
glottal flow estimation by separating the causal and anticausal components of
speech. In order to guarantee a correct estimation, some constraints on the
window have been derived. Among these, the window has to be synchronized on a
Glottal Closure Instant. This paper proposes an extension of the complex
cepstrum-based decomposition by incorporating a chirp analysis. The resulting
method is shown to give a reliable estimation of the glottal flow wherever the
window is located. This technique is therefore well suited to integration in usual
speech processing systems, which generally operate in an asynchronous way.
Besides, its potential for automatic voice quality analysis is highlighted.
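The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`complex_cepstrum`, `split_cepstrum`), the parameters (`radius`, `nfft`) and the toy signal are assumptions made for illustration only. The sketch samples the z-transform on a circle of radius `radius` (the chirp analysis), computes the complex cepstrum, and splits it into its causal and anticausal halves. It assumes an even `nfft` and no zero lying exactly on the analysis circle.

```python
import numpy as np

def complex_cepstrum(x, radius=1.0, nfft=4096):
    """Complex cepstrum of x, with the z-transform sampled on |z| = radius."""
    x = np.asarray(x, dtype=float)
    n = np.arange(len(x), dtype=float)
    # X(radius * e^{jw}) = sum_n x[n] * radius^(-n) * e^{-jwn}, so sampling the
    # z-transform on that circle is a plain FFT of x[n] * radius^(-n).
    X = np.fft.fft(x * float(radius) ** (-n), nfft)
    if X[0].real < 0:
        X = -X  # drop an overall sign flip, which would offset the phase by pi
    log_mag = np.log(np.abs(X))  # assumes no zero lies exactly on the circle
    phase = np.unwrap(np.angle(X))
    # Remove the linear phase term (a pure delay) so the phase becomes periodic.
    center = nfft // 2  # assumes nfft is even
    ndelay = np.round(phase[center] / np.pi)
    phase = phase - np.pi * ndelay * np.arange(nfft) / center
    return np.fft.ifft(log_mag + 1j * phase).real

def split_cepstrum(ceps):
    """Split a complex cepstrum into causal (n > 0) and anticausal (n < 0) parts."""
    half = len(ceps) // 2
    causal = np.zeros_like(ceps)
    anticausal = np.zeros_like(ceps)
    causal[1:half] = ceps[1:half]    # positive quefrencies
    anticausal[half:] = ceps[half:]  # negative quefrencies (wrapped indices)
    return causal, anticausal

# Toy signal (hypothetical): one zero inside the unit circle (at a) and one
# outside (at 1/b), standing in for minimum- and maximum-phase contributions.
a, b = 0.5, 0.6
x = np.convolve([1.0, -a], [-b, 1.0])
causal, anticausal = split_cepstrum(complex_cepstrum(x))
print(causal[1], anticausal[-1])  # approximately -a and -b
```

In this toy case, raising `radius` above `1/b` moves both zeros inside the analysis circle, so the anticausal part vanishes. This shows how the choice of circle changes which components are treated as causal or anticausal.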
Related papers
- Harmonic Path Integral Diffusion [0.4527270266697462]
We present a novel approach for sampling from a continuous multivariate probability distribution, which may either be explicitly known (up to a normalization factor) or represented via empirical samples.
Our method constructs a time-dependent bridge from a delta function centered at the origin of the state space at $t=0$, transforming it into the target distribution at $t=1$.
We contrast these algorithms with other sampling methods, particularly simulated and path integral sampling, highlighting their advantages in terms of analytical control, accuracy, and computational efficiency.
arXiv Detail & Related papers (2024-09-23T16:20:21Z)
- Bayesian estimation for collisional thermometry and time-optimal holonomic quantum computation [0.0]
In the first half we investigate how the Bayesian formalism can be introduced into the problem of quantum thermometry.
In the last part of the thesis we approach the problem of non-adiabatic holonomic computation.
arXiv Detail & Related papers (2023-07-16T17:46:13Z)
- Tight integration of neural- and clustering-based diarization through deep unfolding of infinite Gaussian mixture model [84.57667267657382]
This paper introduces a trainable clustering algorithm into the integration framework.
Speaker embeddings are optimized during training such that they better fit iGMM clustering.
Experimental results show that the proposed approach outperforms the conventional approach in terms of diarization error rate.
arXiv Detail & Related papers (2022-02-14T07:45:21Z)
- Discretization and Re-synthesis: an alternative method to solve the Cocktail Party Problem [65.25725367771075]
This study demonstrates, for the first time, that the synthesis-based approach can also perform well on this problem.
Specifically, we propose a novel speech separation/enhancement model based on the recognition of discrete symbols.
Once the discrete symbol sequence has been predicted, each target speech signal can be re-synthesized by feeding those symbols to the synthesis model.
arXiv Detail & Related papers (2021-12-17T08:35:40Z)
- Visualizing Classifier Adjacency Relations: A Case Study in Speaker Verification and Voice Anti-Spoofing [72.4445825335561]
We propose a simple method to derive a 2D representation from detection scores produced by an arbitrary set of binary classifiers.
Based upon rank correlations, our method facilitates a visual comparison of classifiers with arbitrary scores.
While the approach is fully versatile and can be applied to any detection task, we demonstrate the method using scores produced by automatic speaker verification and voice anti-spoofing systems.
arXiv Detail & Related papers (2021-06-11T13:03:33Z)
- Multi-Discriminator Sobolev Defense-GAN Against Adversarial Attacks for End-to-End Speech Systems [78.5097679815944]
This paper introduces a defense approach against end-to-end adversarial attacks developed for cutting-edge speech-to-text systems.
First, we represent speech signals with 2D spectrograms using the short-time Fourier transform.
Second, we iteratively find a safe vector using a spectrogram subspace projection operation.
Third, we synthesize a spectrogram with such a safe vector using a novel GAN architecture trained with a Sobolev integral probability metric.
arXiv Detail & Related papers (2021-03-15T01:11:13Z)
- Any-to-Many Voice Conversion with Location-Relative Sequence-to-Sequence Modeling [61.351967629600594]
This paper proposes an any-to-many location-relative, sequence-to-sequence (seq2seq), non-parallel voice conversion approach.
In this approach, we combine a bottle-neck feature extractor (BNE) with a seq2seq synthesis module.
Objective and subjective evaluations show that the proposed any-to-many approach has superior voice conversion performance in terms of both naturalness and speaker similarity.
arXiv Detail & Related papers (2020-09-06T13:01:06Z)
- Temporal-Spatial Neural Filter: Direction Informed End-to-End Multi-channel Target Speech Separation [66.46123655365113]
Target speech separation refers to extracting the target speaker's speech from mixed signals.
Two main challenges are the complex acoustic environment and the real-time processing requirement.
We propose a temporal-spatial neural filter, which directly estimates the target speech waveform from a multi-speaker mixture.
arXiv Detail & Related papers (2020-01-02T11:12:50Z)
- Causal-Anticausal Decomposition of Speech using Complex Cepstrum for Glottal Source Estimation [11.481208551940998]
We show that complex cepstrum causal-anticausal decomposition can be effectively used for glottal flow estimation.
The proposed method has the potential to be used for voice quality analysis.
arXiv Detail & Related papers (2019-12-30T08:12:03Z)
- Glottal Source Processing: from Analysis to Applications [35.80742217666323]
Glottal analysis from speech recordings requires specific and more complex processing operations.
This review gives a general overview of techniques which have been designed for glottal source processing.
arXiv Detail & Related papers (2019-12-29T08:13:58Z)
- Complex Cepstrum-based Decomposition of Speech for Glottal Source Estimation [11.481208551940998]
We show that complex cepstrum can be effectively used for glottal flow estimation.
Based on exactly the same principles presented for ZZT decomposition, windowing should be applied.
arXiv Detail & Related papers (2019-12-29T07:58:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.