Related papers: Kullback-Leibler Barycentre of Stochastic Processes

Kullback-Leibler Barycentre of Stochastic Processes

URL: http://arxiv.org/abs/2407.04860v1
Date: Fri, 5 Jul 2024 20:45:27 GMT
Title: Kullback-Leibler Barycentre of Stochastic Processes
Authors: Sebastian Jaimungal, Silvana M. Pesenti,
Abstract summary: We consider the problem where an agent aims to combine the views and insights of different experts' models. We show existence and uniqueness of the barycentre model and proof an explicit representation of the Radon-Nikodym derivative. Two deep learning algorithms are proposed to find the optimal drift of the combined model.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: We consider the problem where an agent aims to combine the views and insights of different experts' models. Specifically, each expert proposes a diffusion process over a finite time horizon. The agent then combines the experts' models by minimising the weighted Kullback-Leibler divergence to each of the experts' models. We show existence and uniqueness of the barycentre model and proof an explicit representation of the Radon-Nikodym derivative relative to the average drift model. We further allow the agent to include their own constraints, which results in an optimal model that can be seen as a distortion of the experts' barycentre model to incorporate the agent's constraints. Two deep learning algorithms are proposed to find the optimal drift of the combined model, allowing for efficient simulations. The first algorithm aims at learning the optimal drift by matching the change of measure, whereas the second algorithm leverages the notion of elicitability to directly estimate the value function. The paper concludes with a extended application to combine implied volatility smiles models that were estimated on different datasets.

Related papers

Total Uncertainty Quantification in Inverse PDE Solutions Obtained with Reduced-Order Deep Learning Surrogate Models [50.90868087591973]
We propose an approximate Bayesian method for quantifying the total uncertainty in inverse PDE solutions obtained with machine learning surrogate models. We test the proposed framework by comparing it with the iterative ensemble smoother and deep ensembling methods for a non-linear diffusion equation.
arXiv Detail & Related papers (2024-08-20T19:06:02Z)
SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models [85.67096251281191]
We present an innovative approach to model fusion called zero-shot Sparse MIxture of Low-rank Experts (SMILE) construction. SMILE allows for the upscaling of source models into an MoE model without extra data or further training. We conduct extensive experiments across diverse scenarios, such as image classification and text generation tasks, using full fine-tuning and LoRA fine-tuning.
arXiv Detail & Related papers (2024-08-19T17:32:15Z)
Operator-informed score matching for Markov diffusion models [8.153690483716481]
This paper argues that Markov diffusion models enjoy an advantage over other types of diffusion model, as their associated operators can be exploited to improve the training process. We propose operator-informed score matching, a variance reduction technique that is straightforward to implement in both low- and high-dimensional diffusion modeling.
arXiv Detail & Related papers (2024-06-13T13:07:52Z)
MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation [80.47072100963017]
We introduce a novel and low-compute algorithm, Model Merging with Amortized Pareto Front (MAP) MAP efficiently identifies a set of scaling coefficients for merging multiple models, reflecting the trade-offs involved. We also introduce Bayesian MAP for scenarios with a relatively low number of tasks and Nested MAP for situations with a high number of tasks, further reducing the computational cost of evaluation.
arXiv Detail & Related papers (2024-06-11T17:55:25Z)
Dirichlet process mixture models for non-stationary data streams [0.0]
We propose a variational inference algorithm for Dirichlet process mixture models. Our proposal deals with the concept drift by including an exponential forgetting over the prior global parameters. Our algorithm allows to adapt the learned model to the concept drifts automatically.
arXiv Detail & Related papers (2022-10-13T09:57:07Z)
Deciding What to Model: Value-Equivalent Sampling for Reinforcement Learning [21.931580762349096]
We introduce an algorithm that computes an approximately-value-equivalent, lossy compression of the environment which an agent may feasibly target in lieu of the true model. We prove an information-theoretic, Bayesian regret bound for our algorithm that holds for any finite-horizon, episodic sequential decision-making problem.
arXiv Detail & Related papers (2022-06-04T23:36:38Z)
Distributional Depth-Based Estimation of Object Articulation Models [21.046351215949525]
We propose a method that efficiently learns distributions over articulation model parameters directly from depth images. Our core contributions include a novel representation for distributions over rigid body transformations. We introduce a novel deep learning based approach, DUST-net, that performs category-independent articulation model estimation.
arXiv Detail & Related papers (2021-08-12T17:44:51Z)
Loss function based second-order Jensen inequality and its application to particle variational inference [112.58907653042317]
Particle variational inference (PVI) uses an ensemble of models as an empirical approximation for the posterior distribution. PVI iteratively updates each model with a repulsion force to ensure the diversity of the optimized models. We derive a novel generalization error bound and show that it can be reduced by enhancing the diversity of models.
arXiv Detail & Related papers (2021-06-09T12:13:51Z)
A Twin Neural Model for Uplift [59.38563723706796]
Uplift is a particular case of conditional treatment effect modeling. We propose a new loss function defined by leveraging a connection with the Bayesian interpretation of the relative risk. We show our proposed method is competitive with the state-of-the-art in simulation setting and on real data from large scale randomized experiments.
arXiv Detail & Related papers (2021-05-11T16:02:39Z)
A bandit-learning approach to multifidelity approximation [7.960229223744695]
Multifidelity approximation is an important technique in scientific computation and simulation. We introduce a bandit-learning approach for leveraging data of varying fidelities to achieve precise estimates.
arXiv Detail & Related papers (2021-03-29T05:29:35Z)
Modeling the Second Player in Distributionally Robust Optimization [90.25995710696425]
We argue for the use of neural generative models to characterize the worst-case distribution. This approach poses a number of implementation and optimization challenges. We find that the proposed approach yields models that are more robust than comparable baselines.
arXiv Detail & Related papers (2021-03-18T14:26:26Z)
BODAME: Bilevel Optimization for Defense Against Model Extraction [10.877450596327407]
We consider an adversarial setting to prevent model extraction under the assumption that will make best guess on the service provider's attacker. We formulate a surrogate model using the predictions of the true model. We give a tractable transformation and an algorithm for more complicated models that are learned by using gradient descent-based algorithms.
arXiv Detail & Related papers (2021-03-11T17:08:31Z)
Model Fusion with Kullback--Leibler Divergence [58.20269014662046]
We propose a method to fuse posterior distributions learned from heterogeneous datasets. Our algorithm relies on a mean field assumption for both the fused model and the individual dataset posteriors.
arXiv Detail & Related papers (2020-07-13T03:27:45Z)

This list is automatically generated from the titles and abstracts of the papers in this site.