A Differentiable Partially Observable Generalized Linear Model with
Forward-Backward Message Passing
- URL: http://arxiv.org/abs/2402.01263v2
- Date: Wed, 7 Feb 2024 18:44:41 GMT
- Title: A Differentiable Partially Observable Generalized Linear Model with
Forward-Backward Message Passing
- Authors: Chengrui Li, Weihan Li, Yule Wang, and Anqi Wu
- Abstract summary: We propose a new differentiable POGLM, which enables the pathwise gradient estimator, better than the score function gradient estimator used in existing works.
Our new method yields more interpretable parameters, underscoring its significance in neuroscience.
- Score: 2.600709013150986
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The partially observable generalized linear model (POGLM) is a powerful tool
for understanding neural connectivity under the assumption of existing hidden
neurons. With spike trains recorded only from visible neurons, existing works
use variational inference (VI) to learn the POGLM while facing the difficulty
of learning this latent variable model. There are two main issues: (1) the sampled
Poisson hidden spike count hinders the use of the pathwise gradient estimator
in VI; and (2) the existing design of the variational model is neither
expressive nor time-efficient, which further limits performance. For (1),
we propose a new differentiable POGLM that enables the pathwise gradient
estimator, which performs better than the score-function gradient estimator
used in existing works. For (2), we propose a forward-backward message-passing
sampling scheme for the variational model. Comprehensive experiments show that
our differentiable POGLMs with forward-backward message passing achieve better
performance on one synthetic and two real-world datasets. Furthermore,
our new method yields more interpretable parameters, underscoring its
significance in neuroscience.
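To make concrete the distinction the abstract draws between the two estimators, the following minimal PyTorch sketch contrasts the score-function (REINFORCE) gradient, which only needs samples and their log-probabilities, with the pathwise (reparameterization) gradient, which needs a differentiable sampling path of the kind the proposed differentiable POGLM provides for hidden spikes. The Gaussian variational distribution and the toy objective f below are illustrative assumptions, not the paper's actual model.

    # Hedged sketch (not the paper's construction): estimate
    # grad_theta E_{q_theta(z)}[f(z)] in two ways, with a Gaussian q and a toy f
    # standing in for the POGLM's variational posterior and likelihood term.
    import torch

    def f(z):
        # Toy objective standing in for the model term inside the ELBO.
        return (z - 2.0) ** 2

    mu = torch.tensor(0.0, requires_grad=True)
    log_sigma = torch.tensor(0.0, requires_grad=True)

    # Score-function (REINFORCE) estimator: only needs log q(z) at sampled z,
    # so it works even when sampling is not differentiable (e.g., discrete
    # Poisson spike counts), but it typically has high variance.
    q = torch.distributions.Normal(mu, log_sigma.exp())
    z = q.sample((10000,))                      # no gradient flows through z
    surrogate = (q.log_prob(z) * f(z).detach()).mean()
    grad_score = torch.autograd.grad(surrogate, (mu, log_sigma))

    # Pathwise (reparameterization) estimator: requires a differentiable
    # sampling path z = g(eps; theta), which a differentiable POGLM restores
    # for the hidden spike counts.
    q = torch.distributions.Normal(mu, log_sigma.exp())
    z = q.rsample((10000,))                     # gradient flows through z
    grad_path = torch.autograd.grad(f(z).mean(), (mu, log_sigma))

    print("score-function grad:", [g.item() for g in grad_score])
    print("pathwise grad:      ", [g.item() for g in grad_path])

Both estimators target the same gradient; the pathwise version is usually the lower-variance choice, which is why making the hidden Poisson spike counts differentiable, as the abstract describes, is what unlocks it.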
Related papers
- A Multi-step Loss Function for Robust Learning of the Dynamics in Model-based Reinforcement Learning [10.940666275830052]
In model-based reinforcement learning, most algorithms rely on simulating trajectories from one-step models of the dynamics learned on data.
We tackle the resulting compounding of one-step prediction errors by using a multi-step objective to train one-step models.
We find that this new loss is particularly useful when the data is noisy, which is often the case in real-life environments.
arXiv Detail & Related papers (2024-02-05T16:13:00Z)
- Theoretical Characterization of the Generalization Performance of Overfitted Meta-Learning [70.52689048213398]
This paper studies the performance of overfitted meta-learning under a linear regression model with Gaussian features.
We find new and interesting properties that do not exist in single-task linear regression.
Our analysis suggests that benign overfitting is more significant and easier to observe when the noise and the diversity/fluctuation of the ground truth of each training task are large.
arXiv Detail & Related papers (2023-04-09T20:36:13Z)
- MACE: An Efficient Model-Agnostic Framework for Counterfactual Explanation [132.77005365032468]
We propose a novel framework of Model-Agnostic Counterfactual Explanation (MACE).
In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity.
Experiments on public datasets validate its effectiveness, achieving better validity, sparsity, and proximity.
arXiv Detail & Related papers (2022-05-31T04:57:06Z)
- Bayesian Active Learning for Discrete Latent Variable Models [19.852463786440122]
Active learning seeks to reduce the amount of data required to fit the parameters of a model.
Latent variable models play a vital role in neuroscience, psychology, and a variety of other engineering and scientific disciplines.
arXiv Detail & Related papers (2022-02-27T19:07:12Z)
- Towards Open-World Feature Extrapolation: An Inductive Graph Learning Approach [80.8446673089281]
We propose a new learning paradigm with graph representation and learning.
Our framework contains two modules: 1) a backbone network (e.g., a feedforward neural net) as a lower model that takes features as input and outputs predicted labels; 2) a graph neural network as an upper model that learns to extrapolate embeddings for new features via message passing over a feature-data graph built from observed data.
arXiv Detail & Related papers (2021-10-09T09:02:45Z)
- MoEfication: Conditional Computation of Transformer Models for Efficient Inference [66.56994436947441]
Transformer-based pre-trained language models can achieve superior performance on most NLP tasks due to large parameter capacity, but also lead to huge computation cost.
We explore accelerating large-model inference via conditional computation based on the sparse activation phenomenon.
We propose to transform a large model into its mixture-of-experts (MoE) version with equal model size, namely MoEfication.
arXiv Detail & Related papers (2021-10-05T02:14:38Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
- Gaussian Function On Response Surface Estimation [12.35564140065216]
We propose a new framework for interpreting black-box machine learning models (both features and samples) via a metamodeling technique.
The metamodel can be estimated from data generated via a trained complex model by running the computer experiment on samples of data in the region of interest.
arXiv Detail & Related papers (2021-01-04T04:47:00Z)
- Recoding latent sentence representations -- Dynamic gradient-based activation modification in RNNs [0.0]
In RNNs, encoding information in a suboptimal way can impact the quality of representations based on later elements in the sequence.
I propose an augmentation to standard RNNs in the form of a gradient-based correction mechanism.
I conduct different experiments in the context of language modeling, where the impact of using such a mechanism is examined in detail.
arXiv Detail & Related papers (2021-01-03T17:54:17Z)
- A Bayesian Perspective on Training Speed and Model Selection [51.15664724311443]
We show that a measure of a model's training speed can be used to estimate its marginal likelihood.
We verify our results in model selection tasks for linear models and for the infinite-width limit of deep neural networks.
Our results suggest a promising new direction towards explaining why neural networks trained with gradient descent are biased towards functions that generalize well.
arXiv Detail & Related papers (2020-10-27T17:56:14Z)
- Learning Variational Data Assimilation Models and Solvers [34.22350850350653]
We introduce end-to-end neural network architectures for data assimilation.
A key feature of the proposed end-to-end learning architecture is that we may train the NN models using both supervised and unsupervised strategies.
arXiv Detail & Related papers (2020-07-25T14:28:48Z)