A Theory of Unsupervised Speech Recognition
- URL: http://arxiv.org/abs/2306.07926v1
- Date: Fri, 9 Jun 2023 08:12:27 GMT
- Title: A Theory of Unsupervised Speech Recognition
- Authors: Liming Wang, Mark Hasegawa-Johnson and Chang D. Yoo
- Abstract summary: Unsupervised speech recognition (ASR-U) is the problem of learning automatic speech recognition systems from unpaired speech-only and text-only corpora.
We propose a general theoretical framework to study the properties of ASR-U systems based on random matrix theory and the theory of neural tangent kernels.
- Score: 60.12287608968879
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Unsupervised speech recognition (ASR-U) is the problem of learning automatic
speech recognition (ASR) systems from unpaired speech-only and text-only
corpora. While various algorithms exist to solve this problem, a theoretical
framework for studying their properties and addressing such issues
as sensitivity to hyperparameters and training instability is missing. In this paper, we
propose a general theoretical framework to study the properties of ASR-U
systems based on random matrix theory and the theory of neural tangent kernels.
Such a framework allows us to prove various learnability conditions and sample
complexity bounds of ASR-U. Extensive ASR-U experiments on synthetic languages
with three classes of transition graphs provide strong empirical evidence for
our theory (code available at cactuswiththoughts/UnsupASRTheory.git).
Related papers
- Towards Unsupervised Speech Recognition Without Pronunciation Models [57.222729245842054]
Most languages lack sufficient paired speech and text data to effectively train automatic speech recognition systems.
We propose the removal of reliance on a phoneme lexicon to develop unsupervised ASR systems.
We experimentally demonstrate that an unsupervised speech recognizer can emerge from joint speech-to-speech and text-to-text masked token-infilling.
arXiv Detail & Related papers (2024-06-12T16:30:58Z)
- Architecture of a Cortex Inspired Hierarchical Event Recaller [0.0]
This paper proposes a new approach to Machine Learning (ML) that focuses on unsupervised continuous context-dependent learning of complex patterns.
A synthetic structure capable of identifying and predicting complex temporal series will be defined and experimentally tested.
As a proof of concept, the proposed system is shown to be able to learn, identify and predict a remarkably complex temporal series such as human speech, with no prior knowledge.
arXiv Detail & Related papers (2024-05-03T09:36:16Z)
- HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models [81.56455625624041]
We introduce the first open-source benchmark to utilize external large language models (LLMs) for ASR error correction.
The proposed benchmark contains a novel dataset, HyPoradise (HP), encompassing more than 334,000 pairs of N-best hypotheses.
With a reasonable prompt, LLMs can use their generative capability to correct even those tokens that are missing from the N-best list.
arXiv Detail & Related papers (2023-09-27T14:44:10Z)
- Learnability with PAC Semantics for Multi-agent Beliefs [38.88111785113001]
The tension between deduction and induction is perhaps the most fundamental issue in areas such as philosophy, cognition and artificial intelligence.
Valiant recognised that the challenge of learning should be integrated with deduction.
Although weaker than classical entailment, it allows for a powerful model-theoretic framework for answering queries.
arXiv Detail & Related papers (2023-06-08T18:22:46Z)
- Networked Communication for Decentralised Agents in Mean-Field Games [59.01527054553122]
We introduce networked communication to the mean-field game framework.
We prove that our architecture has sample guarantees bounded between those of the centralised- and independent-learning cases.
arXiv Detail & Related papers (2023-06-05T10:45:39Z)
- SAT-Based PAC Learning of Description Logic Concepts [18.851061569487616]
We propose bounded fitting as a scheme for learning description logic concepts in the presence of ontologies.
We present the system SPELL, which implements bounded fitting for the description logic $\mathcal{ELH}^r$ based on a SAT solver, and compare its performance to a state-of-the-art learner.
arXiv Detail & Related papers (2023-05-15T10:20:31Z)
- A Parameterized Theory of PAC Learning [19.686465068713467]
Probably Approximately Correct (i.e., PAC) learning is a core concept of sample complexity theory.
We develop a theory of parameterized PAC learning which allows us to shed new light on several recent PAC learning results that incorporated elements of parameterized complexity.
arXiv Detail & Related papers (2023-04-27T09:39:30Z)
- Non-Axiomatic Term Logic: A Computational Theory of Cognitive Symbolic Reasoning [3.344997561878685]
Non-Axiomatic Term Logic (NATL) is a theoretical computational framework of humanlike symbolic reasoning in artificial intelligence.
NATL unites a discrete syntactic system inspired by Aristotle's term logic and a continuous semantic system based on the modern idea of distributed representations.
arXiv Detail & Related papers (2022-10-12T15:31:35Z)
- A Free Lunch from the Noise: Provable and Practical Exploration for Representation Learning [55.048010996144036]
We show that under some noise assumption, we can obtain the linear spectral feature of its corresponding Markov transition operator in closed-form for free.
We propose Spectral Dynamics Embedding (SPEDE), which breaks the trade-off and completes optimistic exploration for representation learning by exploiting the structure of the noise.
arXiv Detail & Related papers (2021-11-22T19:24:57Z)
- Nonlinear ISA with Auxiliary Variables for Learning Speech Representations [51.9516685516144]
We introduce a theoretical framework for nonlinear Independent Subspace Analysis (ISA) in the presence of auxiliary variables.
We propose an algorithm that learns unsupervised speech representations whose subspaces are independent.
arXiv Detail & Related papers (2020-07-25T14:53:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.