When predict can also explain: few-shot prediction to select better neural latents
- URL: http://arxiv.org/abs/2405.14425v2
- Date: Mon, 10 Jun 2024 11:30:28 GMT
- Title: When predict can also explain: few-shot prediction to select better neural latents
- Authors: Kabir Dabholkar, Omri Barak
- Abstract summary: We present a novel prediction metric designed to yield latent variables that more accurately reflect the ground truth.
In the absence of ground truth, we suggest a proxy measure to quantify extraneous dynamics.
- Score: 3.6218162133579703
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Latent variable models serve as powerful tools to infer underlying dynamics from observed neural activity. However, due to the absence of ground truth data, prediction benchmarks are often employed as proxies. In this study, we reveal the limitations of the widely-used 'co-smoothing' prediction framework and propose an improved few-shot prediction approach that encourages more accurate latent dynamics. Utilizing a student-teacher setup with Hidden Markov Models, we demonstrate that the high co-smoothing model space can encompass models with arbitrary extraneous dynamics within their latent representations. To address this, we introduce a secondary metric -- a few-shot version of co-smoothing. This involves performing regression from the latent variables to held-out channels in the data using fewer trials. Our results indicate that among models with near-optimal co-smoothing, those with extraneous dynamics underperform in the few-shot co-smoothing compared to 'minimal' models devoid of such dynamics. We also provide analytical insights into the origin of this phenomenon. We further validate our findings on real neural data using two state-of-the-art methods: LFADS and STNDT. In the absence of ground truth, we suggest a proxy measure to quantify extraneous dynamics. By cross-decoding the latent variables of all model pairs with high co-smoothing, we identify models with minimal extraneous dynamics. We find a correlation between few-shot co-smoothing performance and this new measure. In summary, we present a novel prediction metric designed to yield latent variables that more accurately reflect the ground truth, offering a significant improvement for latent dynamics inference.
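The few-shot co-smoothing metric described in the abstract can be sketched as a ridge regression from a model's latent variables to held-out channels, fit on only a few trials and scored on the rest. This is a minimal illustration with hypothetical array shapes, not the authors' implementation:

```python
import numpy as np

def few_shot_cosmoothing(latents, heldout, k_trials, ridge=1e-2):
    """Sketch of a few-shot co-smoothing score (hypothetical shapes):
    fit a ridge regression from latents (trials x time x dim) to held-out
    channels (trials x time x channels) using only k_trials trials, then
    report R^2 on the remaining trials."""
    n_trials, n_time, n_lat = latents.shape
    Z = latents[:k_trials].reshape(-1, n_lat)            # few-shot fit set
    Y = heldout[:k_trials].reshape(-1, heldout.shape[2])
    # ridge solution W = (Z^T Z + lambda I)^(-1) Z^T Y
    W = np.linalg.solve(Z.T @ Z + ridge * np.eye(n_lat), Z.T @ Y)
    Z_test = latents[k_trials:].reshape(-1, n_lat)       # evaluation set
    Y_test = heldout[k_trials:].reshape(-1, heldout.shape[2])
    pred = Z_test @ W
    ss_res = ((Y_test - pred) ** 2).sum()
    ss_tot = ((Y_test - Y_test.mean(axis=0)) ** 2).sum()
    return 1.0 - ss_res / ss_tot
```

The intuition from the abstract: latents carrying extraneous dynamics waste regression capacity, so with few fitting trials their held-out score drops relative to minimal models.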
Related papers
- Dynamic Post-Hoc Neural Ensemblers [55.15643209328513]
In this study, we explore employing neural networks as ensemble methods.
Motivated by the risk of learning low-diversity ensembles, we propose regularizing the model by randomly dropping base model predictions.
We demonstrate that this approach lower-bounds the diversity within the ensemble, reducing overfitting and improving generalization capabilities.
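The prediction-dropout idea can be illustrated with a minimal sketch (names and shapes are hypothetical; the paper trains a neural ensembler with this dropout applied during training, whereas this only shows the masking-and-aggregation step):

```python
import numpy as np

def dropout_ensemble(base_preds, drop_prob=0.3, rng=None):
    """Randomly drop base-model predictions before aggregating.
    base_preds: array (n_models, n_samples) of base predictions."""
    rng = np.random.default_rng() if rng is None else rng
    keep = rng.random(base_preds.shape[0]) > drop_prob
    if not keep.any():                      # always keep at least one model
        keep[rng.integers(base_preds.shape[0])] = True
    return base_preds[keep].mean(axis=0)    # average surviving predictions
```

With `drop_prob=0` this reduces to a plain averaging ensemble; raising it forces the ensembler not to rely on any single base model.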
arXiv Detail & Related papers (2024-10-06T15:25:39Z)
- Deep Learning for Koopman Operator Estimation in Idealized Atmospheric Dynamics [2.2489531925874013]
Deep learning is revolutionizing weather forecasting, with new data-driven models achieving accuracy on par with operational physical models for medium-term predictions.
These models often lack interpretability, making their underlying dynamics difficult to understand and explain.
This paper proposes methodologies to estimate the Koopman operator, providing a linear representation of complex nonlinear dynamics to enhance the transparency of data-driven models.
arXiv Detail & Related papers (2024-09-10T13:56:54Z)
- CogDPM: Diffusion Probabilistic Models via Cognitive Predictive Coding [62.075029712357]
This work introduces the Cognitive Diffusion Probabilistic Models (CogDPM)
CogDPM features a precision estimation method based on the hierarchical sampling capabilities of diffusion models and weights the guidance with precision weights estimated from an inherent property of diffusion models.
We apply CogDPM to real-world prediction tasks using the United Kingdom precipitation and surface wind datasets.
arXiv Detail & Related papers (2024-05-03T15:54:50Z)
- Predictive Churn with the Set of Good Models [64.05949860750235]
We study the effect of conflicting predictions over the set of near-optimal machine learning models.
We present theoretical results on the expected churn between models within the Rashomon set.
We show how our approach can be used to better anticipate, reduce, and avoid churn in consumer-facing applications.
arXiv Detail & Related papers (2024-02-12T16:15:25Z)
- Disentangled Neural Relational Inference for Interpretable Motion Prediction [38.40799770648501]
We develop a variational auto-encoder framework that integrates graph-based representations and time-sequence models.
Our model infers dynamic interaction graphs augmented with interpretable edge features that characterize the interactions.
We validate our approach through extensive experiments on both simulated and real-world datasets.
arXiv Detail & Related papers (2024-01-07T22:49:24Z)
- Efficient Dynamics Modeling in Interactive Environments with Koopman Theory [22.7309724944471]
We show how to efficiently parallelize the sequential problem of long-range prediction using convolution.
We also show that this model can be easily incorporated into dynamics modeling for model-based planning and model-free RL.
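The convolutional view of long-range prediction can be written out for a linear (Koopman-style) system: the rollout x_{t+1} = A x_t + B u_t unrolls to x_T = A^T x_0 + Σ_k A^{T-1-k} B u_k, a convolution of the inputs with the kernel {A^j B}, so every horizon can be formed independently rather than sequentially. This is our illustration of that identity, not the paper's code:

```python
import numpy as np

def linear_rollout_convolution(A, B, u, x0):
    """Roll out x_{t+1} = A x_t + B u_t for all t via the convolution form
    x_t = A^t x0 + sum_k A^{t-1-k} B u_k (inputs u: array (T, n_inputs))."""
    T = u.shape[0]
    kernels = [B]                         # precompute A^j B for j = 0..T-1
    for _ in range(T - 1):
        kernels.append(A @ kernels[-1])
    xs = []
    for t in range(1, T + 1):             # each horizon is independent
        x = np.linalg.matrix_power(A, t) @ x0
        for k in range(t):
            x = x + kernels[t - 1 - k] @ u[k]
        xs.append(x)
    return np.stack(xs)
```

Because each horizon depends only on x0 and the input history, the inner loop over t can be batched or parallelized, which is the efficiency the abstract alludes to.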
arXiv Detail & Related papers (2023-06-20T23:38:24Z)
- Bias in Pruned Vision Models: In-Depth Analysis and Countermeasures [93.17009514112702]
Pruning, setting a significant subset of the parameters of a neural network to zero, is one of the most popular methods of model compression.
Despite existing evidence for this phenomenon, the relationship between neural network pruning and induced bias is not well-understood.
arXiv Detail & Related papers (2023-04-25T07:42:06Z)
- End-to-End Learning of Hybrid Inverse Dynamics Models for Precise and Compliant Impedance Control [16.88250694156719]
We present a novel hybrid model formulation that enables us to identify fully physically consistent inertial parameters of a rigid body dynamics model.
We compare our approach against state-of-the-art inverse dynamics models on a 7-degree-of-freedom manipulator.
arXiv Detail & Related papers (2022-05-27T07:39:28Z)
- EINNs: Epidemiologically-Informed Neural Networks [75.34199997857341]
We introduce a new class of physics-informed neural networks, EINNs, crafted for epidemic forecasting.
We investigate how to leverage both the theoretical flexibility provided by mechanistic models and the data-driven expressivity afforded by AI models.
arXiv Detail & Related papers (2022-02-21T18:59:03Z)
- MixKD: Towards Efficient Distillation of Large-scale Language Models [129.73786264834894]
We propose MixKD, a data-agnostic distillation framework, to endow the resulting model with stronger generalization ability.
We prove from a theoretical perspective that under reasonable conditions MixKD gives rise to a smaller gap between the generalization error and the empirical error.
Experiments under a limited-data setting and ablation studies further demonstrate the advantages of the proposed approach.
arXiv Detail & Related papers (2020-11-01T18:47:51Z)
- Prediction-Centric Learning of Independent Cascade Dynamics from Partial Observations [13.680949377743392]
We address the problem of learning a spreading model such that the predictions generated from this model are accurate.
We introduce a computationally efficient algorithm, based on a scalable dynamic message-passing approach.
We show that tractable inference from the learned model generates a better prediction of marginal probabilities compared to the original model.
arXiv Detail & Related papers (2020-07-13T17:58:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.