Related papers: Tensor Networks for Probabilistic Sequence Modeling

Tensor Networks for Probabilistic Sequence Modeling

URL: http://arxiv.org/abs/2003.01039v4
Date: Fri, 23 Apr 2021 16:52:06 GMT
Title: Tensor Networks for Probabilistic Sequence Modeling
Authors: Jacob Miller, Guillaume Rabusseau, John Terilla
Abstract summary: We use a uniform matrix product state (u-MPS) model for probabilistic modeling of sequence data. We then introduce a novel generative algorithm giving trained u-MPS the ability to efficiently sample from a wide variety of conditional distributions. Experiments on sequence modeling with synthetic and real text data show u-MPS outperforming a variety of baselines.
Score: 7.846449972735859
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Tensor networks are a powerful modeling framework developed for computational many-body physics, which have only recently been applied within machine learning. In this work we utilize a uniform matrix product state (u-MPS) model for probabilistic modeling of sequence data. We first show that u-MPS enable sequence-level parallelism, with length-n sequences able to be evaluated in depth O(log n). We then introduce a novel generative algorithm giving trained u-MPS the ability to efficiently sample from a wide variety of conditional distributions, each one defined by a regular expression. Special cases of this algorithm correspond to autoregressive and fill-in-the-blank sampling, but more complex regular expressions permit the generation of richly structured data in a manner that has no direct analogue in neural generative models. Experiments on sequence modeling with synthetic and real text data show u-MPS outperforming a variety of baselines and effectively generalizing their predictions in the presence of limited data.

Related papers

Sequential-Parallel Duality in Prefix Scannable Models [68.39855814099997]
Recent developments have given rise to various models, such as Gated Linear Attention (GLA) and Mamba.<n>This raises a natural question: can we characterize the full class of neural sequence models that support near-constant-time parallel evaluation and linear-time, constant-space sequential inference?
arXiv Detail & Related papers (2025-06-12T17:32:02Z)
Estimating Network Models using Neural Networks [0.0]
We propose a neural network approach that trains on a single, large set of parameter-simulation pairs to learn the mapping from parameters to average network statistics. Once trained, this map can be inverted, yielding a fast and parallelizable estimation method.
arXiv Detail & Related papers (2025-02-03T20:41:06Z)
Online Variational Sequential Monte Carlo [49.97673761305336]
We build upon the variational sequential Monte Carlo (VSMC) method, which provides computationally efficient and accurate model parameter estimation and Bayesian latent-state inference. Online VSMC is capable of performing efficiently, entirely on-the-fly, both parameter estimation and particle proposal adaptation.
arXiv Detail & Related papers (2023-12-19T21:45:38Z)
Quantum Generative Modeling of Sequential Data with Trainable Token Embedding [0.0]
A quantum-inspired generative model known as the Born machines have shown great advancements in learning classical and quantum data. We generalize the embedding method into trainable quantum measurement operators that can be simultaneously honed with MPS. Our study indicated that combined with trainable embedding, Born machines can exhibit better performance and learn deeper correlations from the dataset.
arXiv Detail & Related papers (2023-11-08T22:56:37Z)
Approximate Message Passing for the Matrix Tensor Product Model [8.206394018475708]
We propose and analyze an approximate message passing (AMP) algorithm for the matrix tensor product model. Building upon an convergence theorem for non-separable functions, we prove a state evolution for non-separable functions. We leverage this state evolution result to provide necessary and sufficient conditions for recovery of the signal of interest.
arXiv Detail & Related papers (2023-06-27T16:03:56Z)
SequenceMatch: Imitation Learning for Autoregressive Sequence Modelling with Backtracking [60.109453252858806]
A maximum-likelihood (MLE) objective does not match a downstream use-case of autoregressively generating high-quality sequences. We formulate sequence generation as an imitation learning (IL) problem. This allows us to minimize a variety of divergences between the distribution of sequences generated by an autoregressive model and sequences from a dataset. Our resulting method, SequenceMatch, can be implemented without adversarial training or architectural changes.
arXiv Detail & Related papers (2023-06-08T17:59:58Z)
Low-Rank Constraints for Fast Inference in Structured Models [110.38427965904266]
This work demonstrates a simple approach to reduce the computational and memory complexity of a large class of structured models. Experiments with neural parameterized structured models for language modeling, polyphonic music modeling, unsupervised grammar induction, and video modeling show that our approach matches the accuracy of standard models at large state spaces.
arXiv Detail & Related papers (2022-01-08T00:47:50Z)
Generative Adversarial Network for Probabilistic Forecast of Random Dynamical System [19.742888499307178]
We present a deep learning model for data-driven simulations of random dynamical systems without a distributional assumption. We propose a regularization strategy for a generative adversarial network based on consistency conditions for the sequential inference problems. The behavior of the proposed model is studied by using three processes with complex noise structures.
arXiv Detail & Related papers (2021-11-04T19:50:56Z)
Sampling from Arbitrary Functions via PSD Models [55.41644538483948]
We take a two-step approach by first modeling the probability distribution and then sampling from that model. We show that these models can approximate a large class of densities concisely using few evaluations, and present a simple algorithm to effectively sample from these models.
arXiv Detail & Related papers (2021-10-20T12:25:22Z)
Deep Probabilistic Time Series Forecasting using Augmented Recurrent Input for Dynamic Systems [12.319812075685956]
We combine the advances in both deep generative models and state space model (SSM) to come up with a novel, data-driven deep probabilistic sequence model. Specially, we follow the popular encoder-decoder generative structure to build the recurrent neural networks (RNN) assisted variational sequence model. In order to alleviate the issue of inconsistency between training and predicting, we (i) propose using a hybrid output as input at next time step, which brings training and predicting into alignment.
arXiv Detail & Related papers (2021-06-03T23:41:11Z)
Deep Neural Dynamic Bayesian Networks applied to EEG sleep spindles modeling [0.0]
We propose a generative model for single-channel EEG that incorporates the constraints experts actively enforce during visual scoring. We derive algorithms for exact, tractable inference as a special case of Generalized Expectation Maximization. We validate the model on three public datasets and provide support that more complex models are able to surpass state-of-the-art detectors.
arXiv Detail & Related papers (2020-10-16T21:48:29Z)
Goal-directed Generation of Discrete Structures with Conditional Generative Models [85.51463588099556]
We introduce a novel approach to directly optimize a reinforcement learning objective, maximizing an expected reward. We test our methodology on two tasks: generating molecules with user-defined properties and identifying short python expressions which evaluate to a given target value.
arXiv Detail & Related papers (2020-10-05T20:03:13Z)
Graph Gamma Process Generalized Linear Dynamical Systems [60.467040479276704]
We introduce graph gamma process (GGP) linear dynamical systems to model real multivariate time series. For temporal pattern discovery, the latent representation under the model is used to decompose the time series into a parsimonious set of multivariate sub-sequences. We use the generated random graph, whose number of nonzero-degree nodes is finite, to define both the sparsity pattern and dimension of the latent state transition matrix.
arXiv Detail & Related papers (2020-07-25T04:16:34Z)

This list is automatically generated from the titles and abstracts of the papers in this site.