Inference via Interpolation: Contrastive Representations Provably Enable Planning and Inference
- URL: http://arxiv.org/abs/2403.04082v3
- Date: Wed, 30 Oct 2024 21:52:16 GMT
- Title: Inference via Interpolation: Contrastive Representations Provably Enable Planning and Inference
- Authors: Benjamin Eysenbach, Vivek Myers, Ruslan Salakhutdinov, Sergey Levine
- Abstract summary: Given time series data, how can we answer questions like "what will happen in the future?" and "how did we get here?"
We show how these questions can have compact, closed-form solutions in terms of learned representations.
- Abstract: Given time series data, how can we answer questions like "what will happen in the future?" and "how did we get here?" These sorts of probabilistic inference questions are challenging when observations are high-dimensional. In this paper, we show how these questions can have compact, closed-form solutions in terms of learned representations. The key idea is to apply a variant of contrastive learning to time series data. Prior work already shows that the representations learned by contrastive learning encode a probability ratio. By extending prior work to show that the marginal distribution over representations is Gaussian, we can then prove that the joint distribution of representations is also Gaussian. Taken together, these results show that representations learned via temporal contrastive learning follow a Gauss-Markov chain, a graphical model where inference (e.g., prediction, planning) over representations corresponds to inverting a low-dimensional matrix. In one special case, inferring intermediate representations is equivalent to interpolating between the learned representations. We validate our theory using numerical simulations on tasks with up to 46 dimensions.
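A minimal numerical sketch of the abstract's interpolation claim, assuming a scalar stationary Gauss-Markov chain with unit marginal variance (the paper works with learned representation vectors; the correlation value `a = 0.9` here is purely illustrative):

```python
import numpy as np

# For a stationary Gauss-Markov chain z_{t+1} = a * z_t + noise with unit
# marginal variance, cov(z_i, z_j) = a^{|i-j|}. Inferring an intermediate
# representation given the endpoints is ordinary Gaussian conditioning,
# and the posterior mean is a linear interpolation of the endpoints.
a = 0.9                      # chain correlation (hypothetical value)
T = 3                        # time steps: z_0, z_1, z_2
idx = np.arange(T)
cov = a ** np.abs(idx[:, None] - idx[None, :])   # covariance matrix

# Condition z_1 on (z_0, z_2): E[z_1 | z_0, z_2] = S_12 @ inv(S_22) @ [z_0, z_2]
obs = [0, 2]
S_12 = cov[1, obs]                    # cross-covariance of z_1 with endpoints
S_22 = cov[np.ix_(obs, obs)]          # covariance of the endpoints
weights = S_12 @ np.linalg.inv(S_22)  # interpolation weights on z_0 and z_2
print(weights)   # both weights equal a / (1 + a**2)
```

Inference thus reduces to inverting a small (here 2x2) matrix, matching the abstract's claim that planning over representations corresponds to low-dimensional linear algebra.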
Related papers
- The "Law" of the Unconscious Contrastive Learner: Probabilistic Alignment of Unpaired Modalities [23.188014611990152]
This paper builds on work on the geometry and probabilistic interpretation of contrastive representations.
We show how these representations can support many of the same inferences as probabilistic graphical models.
Our analysis suggests two new ways of using contrastive representations: in settings with pre-trained contrastive models, and for handling language ambiguity in reinforcement learning.
arXiv Detail & Related papers (2025-01-20T08:10:15Z)
- Disentangled Representation Learning with the Gromov-Monge Gap [65.73194652234848]
Learning disentangled representations from unlabelled data is a fundamental challenge in machine learning.
We introduce a novel approach to disentangled representation learning based on quadratic optimal transport.
We demonstrate the effectiveness of our approach for quantifying disentanglement across four standard benchmarks.
arXiv Detail & Related papers (2024-07-10T16:51:32Z)
- von Mises Quasi-Processes for Bayesian Circular Regression [57.88921637944379]
We explore a family of expressive and interpretable distributions over circle-valued random functions.
The resulting probability model has connections with continuous spin models in statistical physics.
For posterior inference, we introduce a new Stratonovich-like augmentation that lends itself to fast Markov Chain Monte Carlo sampling.
arXiv Detail & Related papers (2024-06-19T01:57:21Z)
- Explaining Probabilistic Models with Distributional Values [12.26389108393613]
Research indicates that game-theoretic explanations may mislead or be hard to interpret.
We argue that often there is a critical mismatch between what one wishes to explain and what current methods such as SHAP explain.
This paper addresses this gap for probabilistic models by generalising cooperative games and value operators.
arXiv Detail & Related papers (2024-02-15T13:50:00Z)
- Contrastive Difference Predictive Coding [79.74052624853303]
We introduce a temporal difference version of contrastive predictive coding that stitches together pieces of different time series data to decrease the amount of data required to learn predictions of future events.
We apply this representation learning method to derive an off-policy algorithm for goal-conditioned RL.
arXiv Detail & Related papers (2023-10-31T03:16:32Z)
- PAVI: Plate-Amortized Variational Inference [55.975832957404556]
Inference is challenging for large population studies where millions of measurements are performed over a cohort of hundreds of subjects.
This large cardinality renders off-the-shelf Variational Inference (VI) computationally impractical.
In this work, we design structured VI families that efficiently tackle large population studies.
arXiv Detail & Related papers (2023-08-30T13:22:20Z)
- Deducing neighborhoods of classes from a fitted model [68.8204255655161]
This article presents a new kind of interpretable machine learning method.
It can help to understand the partitioning of the feature space into predicted classes in a classification model using quantile shifts.
Basically, real data points (or specific points of interest) are used and the changes of the prediction after slightly raising or decreasing specific features are observed.
arXiv Detail & Related papers (2020-09-11T16:35:53Z)
- Probabilistic Future Prediction for Video Scene Understanding [11.236856606065514]
We present a novel deep learning architecture for probabilistic future prediction from video.
We predict the future semantics and motion of complex real-world urban scenes and use this representation to control an autonomous vehicle.
arXiv Detail & Related papers (2020-03-13T17:48:21Z)
- Elements of Sequential Monte Carlo [21.1067925312595]
A core problem in statistics and machine learning is to compute probability distributions and expectations.
The key challenge is to approximate these intractable expectations.
Sequential Monte Carlo (SMC) is a random-sampling-based class of methods for approximate inference.
arXiv Detail & Related papers (2019-03-12T09:28:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.