$\alpha$-divergence Improves the Entropy Production Estimation via
Machine Learning
- URL: http://arxiv.org/abs/2303.02901v2
- Date: Fri, 19 Jan 2024 14:53:51 GMT
- Title: $\alpha$-divergence Improves the Entropy Production Estimation via
Machine Learning
- Authors: Euijoon Kwon, Yongjoo Baek
- Abstract summary: We show that there exists a host of loss functions, namely those implementing a variational representation of the $\alpha$-divergence.
By fixing $\alpha$ to a value between $-1$ and $0$, the $\alpha$-NEEP exhibits a much more robust performance against strong nonequilibrium driving or slow dynamics.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent years have seen a surge of interest in the algorithmic estimation of
stochastic entropy production (EP) from trajectory data via machine learning. A
crucial element of such algorithms is the identification of a loss function
whose minimization guarantees the accurate EP estimation. In this study, we
show that there exists a host of loss functions, namely those implementing a
variational representation of the $\alpha$-divergence, which can be used for
the EP estimation. By fixing $\alpha$ to a value between $-1$ and $0$, the
$\alpha$-NEEP (Neural Estimator for Entropy Production) exhibits a much more
robust performance against strong nonequilibrium driving or slow dynamics,
which adversely affects the existing method based on the Kullback-Leibler
divergence ($\alpha = 0$). In particular, the choice of $\alpha = -0.5$ tends
to yield the optimal results. To corroborate our findings, we present an
exactly solvable simplification of the EP estimation problem, whose loss
function landscape and stochastic properties give deeper intuition into the
robustness of the $\alpha$-NEEP.
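To make the idea concrete, below is a minimal PyTorch sketch of an $\alpha$-divergence variational objective of the kind the abstract describes: an antisymmetric network $h(s, s') = g(s, s') - g(s', s)$ is trained on observed transitions so that, at the optimum of the variational bound, $h$ approximates the per-transition entropy production. The generator convention $f_\alpha(u) = (u^{\alpha+1} - u - \alpha(u-1))/(\alpha(\alpha+1))$ and the names `EPNet` / `alpha_neep_loss` are our own illustrative choices and need not match the paper's exact loss; the limit $\alpha \to 0$ reduces to the KL-based NEEP objective.

```python
# Hedged sketch of an alpha-divergence variational objective for EP estimation.
# Not the authors' reference implementation; conventions may differ.
import torch
import torch.nn as nn

class EPNet(nn.Module):
    """Antisymmetric estimator h(s, s') = g(s, s') - g(s', s)."""
    def __init__(self, dim, hidden=128):
        super().__init__()
        self.g = nn.Sequential(
            nn.Linear(2 * dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, s, s_next):
        fwd = self.g(torch.cat([s, s_next], dim=-1))
        bwd = self.g(torch.cat([s_next, s], dim=-1))
        return (fwd - bwd).squeeze(-1)   # estimated EP of each transition

def alpha_neep_loss(h, alpha=-0.5):
    """Negative variational lower bound on the alpha-divergence between the
    forward and time-reversed path measures; antisymmetry of h lets both
    expectations be taken over the observed (forward) transitions.
    alpha -> 0 recovers the KL-based objective E[h - exp(-h)] + 1."""
    term_p = (torch.exp(alpha * h) - 1.0) / alpha
    term_q = (torch.exp(-(alpha + 1.0) * h) - 1.0) / (alpha + 1.0)
    return -(term_p.mean() - term_q.mean())

# usage (illustrative): h = EPNet(dim)(s, s_next); alpha_neep_loss(h).backward()
```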
Related papers
- Bounds on $L_p$ Errors in Density Ratio Estimation via $f$-Divergence Loss Functions [0.0]
Density ratio estimation (DRE) is a fundamental machine learning technique for identifying relationships between two probability distributions.
$f$-divergence loss functions, derived from variational representations of $f$-divergence, are commonly employed in DRE to achieve state-of-the-art results.
This study presents a novel perspective on DRE using $f$-divergence loss functions by deriving the upper and lower bounds on $L_p$ errors.
arXiv Detail & Related papers (2024-10-02T13:05:09Z)
- Robust deep learning from weakly dependent data [0.0]
This paper considers robust deep learning from weakly dependent observations, with unbounded loss function and unbounded input/output.
We derive a relationship between these bounds and $r$, and when the data have moments of any order (that is, $r=\infty$), the convergence rate is close to some well-known results.
arXiv Detail & Related papers (2024-05-08T14:25:40Z)
- $\alpha$-Divergence Loss Function for Neural Density Ratio Estimation [0.0]
Density ratio estimation (DRE) is a fundamental machine learning technique for capturing relationships between two probability distributions.
Existing methods face optimization challenges, such as overfitting due to lower-unbounded loss functions, biased mini-batch gradients, vanishing training loss gradients, and high sample requirements for Kullback-Leibler (KL) divergence loss functions.
We propose a novel loss function for DRE, the $\alpha$-divergence loss function ($\alpha$-Div), which is concise but offers stable and effective optimization for DRE.
arXiv Detail & Related papers (2024-02-03T05:33:01Z)
- Online non-parametric likelihood-ratio estimation by Pearson-divergence functional minimization [55.98760097296213]
We introduce a new framework for online non-parametric LRE (OLRE) for the setting where pairs of i.i.d. observations $(x_t \sim p, x'_t \sim q)$ are observed over time.
We provide theoretical guarantees for the performance of the OLRE method along with empirical validation in synthetic experiments.
arXiv Detail & Related papers (2023-11-03T13:20:11Z)
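For context on the Pearson-divergence objective named in the entry above, here is a hedged sketch (not the OLRE estimator of that paper): the density ratio $r \approx p/q$ minimizes the least-squares functional $J(r) = \tfrac{1}{2}E_q[r^2] - E_p[r]$, shown below with a random-feature linear model and plain SGD on streaming pairs. The feature map, toy distributions, and step sizes are illustrative assumptions.

```python
# Standard Pearson (chi^2) divergence least-squares ratio fitting, online SGD.
import numpy as np

rng = np.random.default_rng(0)

def features(x, centers, bandwidth=1.0):
    """Gaussian (RBF) features phi(x); the ratio model is r(x) = theta @ phi(x)."""
    d2 = np.sum((centers - x) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * bandwidth ** 2))

centers = rng.normal(size=(50, 2))       # random feature centers (assumption)
theta = np.zeros(len(centers))
lam, lr = 1e-3, 5e-2                     # ridge penalty and step size

for t in range(5000):
    x_p = rng.normal(loc=0.5, size=2)    # x_t  ~ p  (toy example)
    x_q = rng.normal(loc=0.0, size=2)    # x'_t ~ q  (toy example)
    phi_p, phi_q = features(x_p, centers), features(x_q, centers)
    # Stochastic gradient of 1/2 E_q[r^2] - E_p[r] + lam/2 ||theta||^2
    grad = phi_q * (theta @ phi_q) - phi_p + lam * theta
    theta -= lr * grad

# r(x) = theta @ features(x, centers) now approximates p(x)/q(x)
```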
- Equation Discovery with Bayesian Spike-and-Slab Priors and Efficient Kernels [57.46832672991433]
We propose a novel equation discovery method based on Kernel learning and BAyesian Spike-and-Slab priors (KBASS).
We use kernel regression to estimate the target function, which is flexible, expressive, and more robust to data sparsity and noise.
We develop an expectation-propagation expectation-maximization algorithm for efficient posterior inference and function estimation.
arXiv Detail & Related papers (2023-10-09T03:55:09Z)
- Robust computation of optimal transport by $\beta$-potential regularization [79.24513412588745]
Optimal transport (OT) has become a widely used tool in the machine learning field to measure the discrepancy between probability distributions.
We propose regularizing OT with the $\beta$-potential term associated with the so-called $\beta$-divergence.
We experimentally demonstrate that the transport matrix computed with our algorithm helps estimate a probability distribution robustly even in the presence of outliers.
arXiv Detail & Related papers (2022-12-26T18:37:28Z)
- Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes [80.89852729380425]
We propose the first computationally efficient algorithm that achieves the nearly minimax optimal regret $\tilde{O}(d\sqrt{H^3K})$.
Our work provides a complete answer to optimal RL with linear MDPs, and the developed algorithm and theoretical tools may be of independent interest.
arXiv Detail & Related papers (2022-12-12T18:58:59Z)
- Retire: Robust Expectile Regression in High Dimensions [3.9391041278203978]
Penalized quantile and expectile regression methods offer useful tools to detect heteroscedasticity in high-dimensional data.
We propose and study (penalized) robust expectile regression (retire).
We show that the proposed procedure can be efficiently solved by a semismooth Newton coordinate descent algorithm.
arXiv Detail & Related papers (2022-12-11T18:03:12Z)
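As a reminder of what expectile regression optimizes, here is a small sketch (our simplification, not the retire estimator of the entry above): the asymmetric squared loss $\ell_\tau(r) = |\tau - \mathbf{1}\{r<0\}|\, r^2$, minimized here by proximal gradient descent with an $\ell_1$ penalty, whereas the paper robustifies the loss and solves it with a semismooth Newton coordinate descent algorithm.

```python
# Plain expectile regression with an l1 penalty (ISTA-style updates); a toy
# stand-in for the penalized, robustified estimator studied in the paper.
import numpy as np

def expectile_loss(resid, tau=0.8):
    w = np.where(resid < 0, 1.0 - tau, tau)      # asymmetric weights
    return np.mean(w * resid ** 2)

def fit_expectile(X, y, tau=0.8, lam=0.05, lr=0.1, n_iter=2000):
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        resid = y - X @ beta
        w = np.where(resid < 0, 1.0 - tau, tau)
        grad = -2.0 * X.T @ (w * resid) / n      # gradient of the expectile loss
        beta = beta - lr * grad
        beta = np.sign(beta) * np.maximum(np.abs(beta) - lr * lam, 0.0)  # soft-threshold
    return beta
```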
- On the Pitfalls of Heteroscedastic Uncertainty Estimation with Probabilistic Neural Networks [23.502721524477444]
We present a synthetic example illustrating how this approach can lead to very poor but stable estimates.
We identify the culprit to be the log-likelihood loss, along with certain conditions that exacerbate the issue.
We present an alternative formulation, termed $\beta$-NLL, in which each data point's contribution to the loss is weighted by the $\beta$-exponentiated variance estimate.
arXiv Detail & Related papers (2022-03-17T08:46:17Z)
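A short sketch of the $\beta$-NLL weighting described in the entry above (our rendering of the stated mechanism, not the authors' code): each point's Gaussian negative log-likelihood is multiplied by its own predicted variance raised to the power $\beta$, with the weight detached from the gradient.

```python
# beta-NLL: variance-weighted Gaussian negative log-likelihood.
import torch

def beta_nll_loss(mean, var, target, beta=0.5):
    """mean, var, target: tensors of the same shape; var must be positive.
    beta = 0 recovers the standard Gaussian NLL; beta = 1 makes the gradient
    with respect to the mean match ordinary least squares."""
    nll = 0.5 * (torch.log(var) + (target - mean) ** 2 / var)
    weight = var.detach() ** beta          # beta-exponentiated, gradient-stopped
    return (weight * nll).mean()
```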
- Momentum Accelerates the Convergence of Stochastic AUPRC Maximization [80.8226518642952]
We study optimization of areas under precision-recall curves (AUPRC), which is widely used for imbalanced tasks.
We develop novel momentum methods with a better iteration complexity of $O(1/\epsilon^4)$ for finding an $\epsilon$-stationary solution.
We also design a novel family of adaptive methods with the same complexity of $O(1/\epsilon^4)$, which enjoy faster convergence in practice.
arXiv Detail & Related papers (2021-07-02T16:21:52Z)
- Instance-optimality in optimal value estimation: Adaptivity via variance-reduced Q-learning [99.34907092347733]
We analyze the problem of estimating optimal $Q$-value functions for a discounted Markov decision process with discrete states and actions.
Using a local minimax framework, we show that this functional arises in lower bounds on the accuracy of any estimation procedure.
In the other direction, we establish the sharpness of our lower bounds, up to factors logarithmic in the state and action spaces, by analyzing a variance-reduced version of $Q$-learning.
arXiv Detail & Related papers (2021-06-28T00:38:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.