Adaptive Correlated Monte Carlo for Contextual Categorical Sequence
Generation
- URL: http://arxiv.org/abs/1912.13151v2
- Date: Wed, 17 Jun 2020 16:32:42 GMT
- Title: Adaptive Correlated Monte Carlo for Contextual Categorical Sequence
Generation
- Authors: Xinjie Fan, Yizhe Zhang, Zhendong Wang, Mingyuan Zhou
- Abstract summary: We adapt a policy gradient estimator to contextual generation of categorical sequences; the estimator evaluates a set of correlated Monte Carlo (MC) rollouts for variance control.
We also demonstrate the use of correlated MC rollouts for binary-tree softmax models, which reduce the high generation cost in large vocabulary scenarios.
- Score: 77.7420231319632
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sequence generation models are commonly refined with reinforcement learning
over user-defined metrics. However, high gradient variance hinders the
practical use of this method. To stabilize training, we adapt a policy
gradient estimator to contextual generation of categorical sequences; the
estimator evaluates a set of correlated Monte Carlo (MC) rollouts for variance control.
Due to the correlation, the number of unique rollouts is random and adaptive to
model uncertainty; those rollouts naturally become baselines for each other,
and hence are combined to effectively reduce gradient variance. We also
demonstrate the use of correlated MC rollouts for binary-tree softmax models,
which reduce the high generation cost in large vocabulary scenarios by
decomposing each categorical action into a sequence of binary actions. We
evaluate our methods on both neural program synthesis and image captioning. The
proposed methods yield lower gradient variance and consistent improvement over
related baselines.
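The variance-reduction mechanism described in the abstract, where rollouts act as baselines for each other, can be illustrated with a leave-one-out estimator. This is a minimal sketch, not the paper's exact correlated-rollout construction: the function name `loo_policy_gradient` and its inputs (per-rollout score-function gradients and rewards) are hypothetical.

```python
import numpy as np

def loo_policy_gradient(log_prob_grads, rewards):
    """Score-function gradient with a leave-one-out baseline.

    Each rollout's reward is centred by the mean reward of the *other*
    rollouts, so the rollouts serve as baselines for one another while
    the estimator remains unbiased.
    """
    rewards = np.asarray(rewards, dtype=float)
    k = len(rewards)
    total = rewards.sum()
    grad = np.zeros_like(np.asarray(log_prob_grads[0], dtype=float))
    for g, r in zip(log_prob_grads, rewards):
        baseline = (total - r) / (k - 1)  # mean reward of the other rollouts
        grad += (r - baseline) * np.asarray(g, dtype=float)
    return grad / k
```

Note that when all rollouts receive the same reward (for instance, when a confident model produces only one unique rollout), every centred reward is zero and the gradient estimate vanishes, matching the adaptive behaviour the abstract describes.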
Related papers
- Obtaining Explainable Classification Models using Distributionally
Robust Optimization [12.511155426574563]
We study generalized linear models constructed using sets of feature value rules.
An inherent trade-off exists between rule set sparsity and its prediction accuracy.
We propose a new formulation to learn an ensemble of rule sets that simultaneously addresses these competing factors.
arXiv Detail & Related papers (2023-11-03T15:45:34Z) - SequenceMatch: Imitation Learning for Autoregressive Sequence Modelling with Backtracking [60.109453252858806]
A maximum-likelihood (MLE) objective does not match a downstream use-case of autoregressively generating high-quality sequences.
We formulate sequence generation as an imitation learning (IL) problem.
This allows us to minimize a variety of divergences between the distribution of sequences generated by an autoregressive model and sequences from a dataset.
Our resulting method, SequenceMatch, can be implemented without adversarial training or architectural changes.
arXiv Detail & Related papers (2023-06-08T17:59:58Z) - Compound Batch Normalization for Long-tailed Image Classification [77.42829178064807]
We propose a compound batch normalization method based on a Gaussian mixture.
It can model the feature space more comprehensively and reduce the dominance of head classes.
The proposed method outperforms existing methods on long-tailed image classification.
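As a rough illustration of the idea only (a hypothetical sketch, not the authors' implementation): instead of normalizing every sample with one global mean and standard deviation, each sample is normalized by the mixture component it is closest to, so tail-class features are not washed out by head-class statistics.

```python
import numpy as np

def mixture_normalize(x, means, stds, eps=1e-5):
    """Normalize each row of x by its nearest mixture component.

    x: (n, d) features; means, stds: (k, d) per-component statistics.
    A full method would use soft responsibilities from a fitted GMM;
    hard nearest-mean assignment keeps this sketch short.
    """
    sq_dist = ((x[:, None, :] - means[None, :, :]) ** 2).sum(axis=-1)  # (n, k)
    idx = sq_dist.argmin(axis=1)  # nearest component per sample
    return (x - means[idx]) / (stds[idx] + eps)
```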
arXiv Detail & Related papers (2022-12-02T07:31:39Z) - Mutual Exclusivity Training and Primitive Augmentation to Induce
Compositionality [84.94877848357896]
Recent datasets expose the lack of systematic generalization ability in standard sequence-to-sequence models.
We analyze this behavior of seq2seq models and identify two contributing factors: a lack of mutual exclusivity bias and the tendency to memorize whole examples.
We show substantial empirical improvements using standard sequence-to-sequence models on two widely-used compositionality datasets.
arXiv Detail & Related papers (2022-11-28T17:36:41Z) - Two-level monotonic multistage recommender systems [5.983189537988243]
A two-level monotonic property characterizes a monotonic chain of events for personalized prediction.
A regularized cost function learns user-specific behaviors at different stages.
The algorithm is based on blockwise coordinate descent.
arXiv Detail & Related papers (2021-10-06T08:50:32Z) - Autoregressive Score Matching [113.4502004812927]
We propose autoregressive conditional score models (AR-CSM), where we parameterize the joint distribution in terms of the derivatives of univariate log-conditionals (scores).
For AR-CSM models, the divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training.
We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
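For intuition, the per-dimension scores of an autoregressive model with Gaussian conditionals have a closed form. The helper below, with a hypothetical `predict_mean` callback standing in for a learned network, is a toy sketch of the quantity AR-CSM parameterizes, not the paper's training objective:

```python
import numpy as np

def ar_conditional_scores(x, predict_mean, sigma=1.0):
    """Scores d/dx_i log p(x_i | x_<i) for Gaussian conditionals
    p(x_i | x_<i) = N(x_i; predict_mean(x[:i]), sigma^2)."""
    return np.array([-(x[i] - predict_mean(x[:i])) / sigma**2
                     for i in range(len(x))])
```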
arXiv Detail & Related papers (2020-10-24T07:01:24Z) - Gaussian Process Models with Low-Rank Correlation Matrices for Both
Continuous and Categorical Inputs [0.0]
We introduce a method that uses low-rank approximations of cross-correlation matrices in mixed continuous and categorical Gaussian Process models.
Low-Rank Correlation (LRC) offers the ability to flexibly adapt the number of parameters to the problem at hand by choosing an appropriate rank of the approximation.
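The construction can be sketched as follows; the factor shape and the identity offset are illustrative assumptions, showing only how a rank-r factor with q·r parameters yields a valid q×q correlation matrix:

```python
import numpy as np

def low_rank_correlation(L):
    """Correlation matrix from a (q, r) low-rank factor L.

    L @ L.T + I is positive definite; rescaling by its diagonal
    yields unit diagonal, i.e. a valid correlation matrix with
    q*r free parameters instead of q*(q-1)/2.
    """
    C = L @ L.T + np.eye(L.shape[0])
    d = np.sqrt(np.diag(C))
    return C / np.outer(d, d)
```

Choosing a smaller rank r trades expressiveness for fewer parameters, which is the flexibility the LRC summary refers to.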
arXiv Detail & Related papers (2020-10-06T09:38:35Z) - Fitting Laplacian Regularized Stratified Gaussian Models [0.0]
We consider the problem of jointly estimating multiple related zero-mean Gaussian distributions from data.
We propose a distributed method that scales to large problems, and illustrate the efficacy of the method with examples in finance, radar signal processing, and weather forecasting.
arXiv Detail & Related papers (2020-05-04T18:00:59Z) - Pseudo-Convolutional Policy Gradient for Sequence-to-Sequence
Lip-Reading [96.48553941812366]
Lip-reading aims to infer the speech content from the lip movement sequence.
Traditional learning process of seq2seq models suffers from two problems.
We propose a novel pseudo-convolutional policy gradient (PCPG) based method to address these two problems.
arXiv Detail & Related papers (2020-03-09T09:12:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.