Learning Hidden Markov Models Using Conditional Samples
- URL: http://arxiv.org/abs/2302.14753v2
- Date: Sat, 24 Feb 2024 17:17:11 GMT
- Title: Learning Hidden Markov Models Using Conditional Samples
- Authors: Sham M. Kakade, Akshay Krishnamurthy, Gaurav Mahajan, Cyril Zhang
- Abstract summary: This paper is concerned with the computational complexity of learning the Hidden Markov Model (HMM).
In this paper, we consider an interactive access model, in which the algorithm can query for samples from the conditional distributions of the HMMs.
Specifically, we obtain efficient algorithms for learning HMMs in settings where we have query access to the exact conditional probabilities.
- Score: 72.20944611510198
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper is concerned with the computational complexity of learning the
Hidden Markov Model (HMM). Although HMMs are some of the most widely used tools
in sequential and time series modeling, they are cryptographically hard to
learn in the standard setting where one has access to i.i.d. samples of
observation sequences. In this paper, we depart from this setup and consider an
interactive access model, in which the algorithm can query for samples from the
conditional distributions of the HMMs. We show that interactive access to the
HMM enables computationally efficient learning algorithms, thereby bypassing
cryptographic hardness. Specifically, we obtain efficient algorithms for
learning HMMs in two settings:
(a) An easier setting where we have query access to the exact conditional
probabilities. Here our algorithm runs in polynomial time and makes
polynomially many queries to approximate any HMM in total variation distance.
(b) A harder setting where we can only obtain samples from the conditional
distributions. Here the performance of the algorithm depends on a new
parameter, called the fidelity of the HMM. We show that this captures
cryptographically hard instances and previously known positive results.
We also show that these results extend to a broader class of distributions
with latent low rank structure. Our algorithms can be viewed as generalizations
and robustifications of Angluin's $L^*$ algorithm for learning deterministic
finite automata from membership queries.
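The exact-conditional-probability oracle of setting (a) can be simulated for a known HMM with the standard forward recursion, which makes the access model concrete. The sketch below is illustrative (the function name, matrix conventions, and example are not from the paper):

```python
import numpy as np

def conditional_next_obs(T, O, pi, prefix):
    """Exact conditional distribution P(x_{t+1} | x_1..x_t) for a known HMM.

    T: (n, n) transitions, T[i, j] = P(h' = j | h = i)
    O: (n, m) emissions,   O[i, x] = P(x | h = i)
    pi: (n,) initial hidden-state distribution
    prefix: observed symbols x_1..x_t (list of indices); assumed to have
            nonzero probability under the model.
    """
    belief = pi.copy()                # distribution of the current hidden state
    for x in prefix:
        belief = belief * O[:, x]     # condition on the observed emission x
        belief = belief @ T           # advance the chain one step
        belief /= belief.sum()        # renormalize the posterior
    return belief @ O                 # (m,) distribution over the next symbol
```

Each call costs O(t n^2) time for a prefix of length t, so a learner making polynomially many queries stays polynomial overall.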
Related papers
- LLMs as Probabilistic Minimally Adequate Teachers for DFA Learning [11.037017229299607]
The emergence of intelligence in large language models (LLMs) has inspired investigations into their integration into automata learning.
This paper introduces the probabilistic Minimally Adequate Teacher (pMAT) formulation.
We develop techniques to improve answer accuracy and ensure the correctness of the learned automata.
arXiv Detail & Related papers (2024-08-06T07:12:09Z)
- An Efficient Algorithm for Clustered Multi-Task Compressive Sensing [60.70532293880842]
Clustered multi-task compressive sensing is a hierarchical model that solves multiple compressive sensing tasks.
The existing inference algorithm for this model is computationally expensive and does not scale well in high dimensions.
We propose a new algorithm that substantially accelerates model inference by avoiding the need to explicitly compute these covariance matrices.
arXiv Detail & Related papers (2023-09-30T15:57:14Z)
- Cramer Type Distances for Learning Gaussian Mixture Models by Gradient Descent [0.0]
As of today, few known algorithms can fit or learn Gaussian mixture models (GMMs) by gradient descent.
We propose a distance function called the Sliced Cramér 2-distance for learning general multivariate GMMs.
These features are especially useful for distributional reinforcement learning and Deep Q Networks.
arXiv Detail & Related papers (2023-07-13T13:43:02Z)
- Quick Adaptive Ternary Segmentation: An Efficient Decoding Procedure For Hidden Markov Models [70.26374282390401]
Decoding the original signal (i.e., the hidden chain) from the noisy observations is one of the main goals in nearly all HMM-based data analyses.
We present Quick Adaptive Ternary Segmentation (QATS), a divide-and-conquer procedure that decodes the hidden sequence in polylogarithmic computational complexity.
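For context, the classical decoder for this problem is the Viterbi algorithm, whose dynamic program runs in time linear in the sequence length; QATS's polylogarithmic complexity is measured against that baseline. A minimal log-space sketch of Viterbi (illustrative, not QATS):

```python
import numpy as np

def viterbi(T, O, pi, obs):
    """Most likely hidden state sequence for observations `obs`."""
    logT, logO, logpi = np.log(T), np.log(O), np.log(pi)
    dp = logpi + logO[:, obs[0]]       # dp[i]: best log-prob of a path ending in i
    back = []
    for x in obs[1:]:
        scores = dp[:, None] + logT    # scores[i, j]: come from i, move to j
        back.append(scores.argmax(axis=0))
        dp = scores.max(axis=0) + logO[:, x]
    path = [int(dp.argmax())]          # trace the best path backwards
    for bp in reversed(back):
        path.append(int(bp[path[-1]]))
    return path[::-1]
```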
arXiv Detail & Related papers (2023-05-29T19:37:48Z)
- Unified Functional Hashing in Automatic Machine Learning [58.77232199682271]
We show that large efficiency gains can be obtained by employing a fast unified functional hash.
Our hash is "functional" in that it identifies equivalent candidates even if they were represented or coded differently.
We show dramatic improvements on multiple AutoML domains, including neural architecture search and algorithm discovery.
arXiv Detail & Related papers (2023-02-10T18:50:37Z)
- The Automatic Quasi-clique Merger algorithm (AQCM) [0.0]
The Automatic Quasi-clique Merger algorithm is a new algorithm adapted from early work published under the name QCM.
We present the general idea of a quasi-clique agglomerative approach, provide the full details of the mathematical steps of the AQCM algorithm, and explain some of the motivation behind the new methodology.
arXiv Detail & Related papers (2021-03-06T20:01:59Z)
- A New Algorithm for Hidden Markov Models Learning Problem [0.0]
This research focuses on algorithms and approaches for learning Hidden Markov Models (HMMs).
An HMM is a statistical Markov model in which the system being modeled is assumed to be a Markov process.
One of the essential characteristics of HMMs is their learning capabilities.
arXiv Detail & Related papers (2021-02-14T09:33:00Z)
- Adaptive Sampling for Best Policy Identification in Markov Decision Processes [79.4957965474334]
We investigate the problem of best-policy identification in discounted Markov Decision Processes (MDPs) when the learner has access to a generative model.
The advantages of state-of-the-art algorithms are discussed and illustrated.
arXiv Detail & Related papers (2020-09-28T15:22:24Z)
- Learning Gaussian Graphical Models via Multiplicative Weights [54.252053139374205]
We adapt an algorithm of Klivans and Meka based on the method of multiplicative weight updates.
The algorithm enjoys a sample complexity bound that is qualitatively similar to others in the literature.
It has a low runtime $O(mp^2)$ in the case of $m$ samples and $p$ nodes, and can trivially be implemented in an online manner.
arXiv Detail & Related papers (2020-02-20T10:50:58Z)
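The method of multiplicative weight updates that this entry refers to maintains a weight per expert (or coordinate) and exponentially downweights those that incur loss. A generic sketch of the textbook update rule (not the Klivans-Meka estimator itself; `eta` and the loss range are illustrative assumptions):

```python
import numpy as np

def multiplicative_weights(losses, eta=0.1):
    """Run the multiplicative-weights update over k experts.

    losses: (rounds, k) array of per-expert losses in [0, 1].
    Returns the final normalized weight vector over experts.
    """
    k = losses.shape[1]
    w = np.ones(k)                    # start with uniform weights
    for loss in losses:
        w *= np.exp(-eta * loss)      # penalize experts in proportion to loss
        w /= w.sum()                  # keep weights a probability distribution
    return w
```

After enough rounds, the weight mass concentrates on the experts with the smallest cumulative loss, which is what drives the sample- and runtime-efficiency of such schemes.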
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.