Rescuing neural spike train models from bad MLE
- URL: http://arxiv.org/abs/2010.12362v1
- Date: Fri, 23 Oct 2020 12:46:12 GMT
- Title: Rescuing neural spike train models from bad MLE
- Authors: Diego M. Arribas, Yuan Zhao and Il Memming Park
- Abstract summary: We propose to directly minimize the divergence between neural recorded and model generated spike trains using spike train kernels.
We show that we can control the trade-off between different features which is critical for dealing with model-mismatch.
- Score: 9.655669944739127
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The standard approach to fitting an autoregressive spike train model is to
maximize the likelihood for one-step prediction. This maximum likelihood
estimation (MLE) often leads to models that perform poorly when generating
samples recursively for more than one time step. Moreover, the generated spike
trains can fail to capture important features of the data and even show
diverging firing rates. To alleviate this, we propose to directly minimize the
divergence between neural recorded and model generated spike trains using spike
train kernels. We develop a method that stochastically optimizes the maximum
mean discrepancy induced by the kernel. Experiments performed on both real and
synthetic neural data validate the proposed approach, showing that it leads to
well-behaving models. Using different combinations of spike train kernels, we
show that we can control the trade-off between different features which is
critical for dealing with model-mismatch.
Related papers
- Truncated Consistency Models [57.50243901368328]
Training consistency models requires learning to map all intermediate points along PF ODE trajectories to their corresponding endpoints.
We empirically find that this training paradigm limits the one-step generation performance of consistency models.
We propose a new parameterization of the consistency function and a two-stage training procedure that prevents the truncated-time training from collapsing to a trivial solution.
arXiv Detail & Related papers (2024-10-18T22:38:08Z) - Towards Theoretical Understandings of Self-Consuming Generative Models [56.84592466204185]
This paper tackles the emerging challenge of training generative models within a self-consuming loop.
We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models.
We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed data training on error propagation.
arXiv Detail & Related papers (2024-02-19T02:08:09Z) - Diffusion-Model-Assisted Supervised Learning of Generative Models for
Density Estimation [10.793646707711442]
We present a framework for training generative models for density estimation.
We use the score-based diffusion model to generate labeled data.
Once the labeled data are generated, we can train a simple fully connected neural network to learn the generative model in the supervised manner.
arXiv Detail & Related papers (2023-10-22T23:56:19Z) - SequenceMatch: Imitation Learning for Autoregressive Sequence Modelling with Backtracking [60.109453252858806]
A maximum-likelihood (MLE) objective does not match a downstream use-case of autoregressively generating high-quality sequences.
We formulate sequence generation as an imitation learning (IL) problem.
This allows us to minimize a variety of divergences between the distribution of sequences generated by an autoregressive model and sequences from a dataset.
Our resulting method, SequenceMatch, can be implemented without adversarial training or architectural changes.
arXiv Detail & Related papers (2023-06-08T17:59:58Z) - Minimizing Trajectory Curvature of ODE-based Generative Models [45.89620603363946]
Recent generative models, such as diffusion models, rectified flows, and flow matching, define a generative process as a time reversal of a fixed forward process.
We present an efficient method of training the forward process to minimize the curvature of generative trajectories without any ODE/SDE simulation.
arXiv Detail & Related papers (2023-01-27T21:52:03Z) - Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z) - Model-based micro-data reinforcement learning: what are the crucial
model properties and which model to choose? [0.2836066255205732]
We contribute to micro-data model-based reinforcement learning (MBRL) by rigorously comparing popular generative models.
We find that on an environment that requires multimodal posterior predictives, mixture density nets outperform all other models by a large margin.
We also found that deterministic models are on par, in fact they consistently (although non-significantly) outperform their probabilistic counterparts.
arXiv Detail & Related papers (2021-07-24T11:38:25Z) - Firearm Detection via Convolutional Neural Networks: Comparing a
Semantic Segmentation Model Against End-to-End Solutions [68.8204255655161]
Threat detection of weapons and aggressive behavior from live video can be used for rapid detection and prevention of potentially deadly incidents.
One way for achieving this is through the use of artificial intelligence and, in particular, machine learning for image analysis.
We compare a traditional monolithic end-to-end deep learning model and a previously proposed model based on an ensemble of simpler neural networks detecting fire-weapons via semantic segmentation.
arXiv Detail & Related papers (2020-12-17T15:19:29Z) - A Bayesian Perspective on Training Speed and Model Selection [51.15664724311443]
We show that a measure of a model's training speed can be used to estimate its marginal likelihood.
We verify our results in model selection tasks for linear models and for the infinite-width limit of deep neural networks.
Our results suggest a promising new direction towards explaining why neural networks trained with gradient descent are biased towards functions that generalize well.
arXiv Detail & Related papers (2020-10-27T17:56:14Z) - Point process models for sequence detection in high-dimensional neural
spike trains [29.073129195368235]
We develop a point process model that characterizes fine-scale sequences at the level of individual spikes.
This ultra-sparse representation of sequence events opens new possibilities for spike train modeling.
arXiv Detail & Related papers (2020-10-10T02:21:44Z) - Maximum Entropy Model Rollouts: Fast Model Based Policy Optimization
without Compounding Errors [10.906666680425754]
We propose a Dyna-style model-based reinforcement learning algorithm, which we called Maximum Entropy Model Rollouts (MEMR)
To eliminate the compounding errors, we only use our model to generate single-step rollouts.
arXiv Detail & Related papers (2020-06-08T21:38:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.