A Reverse Jensen Inequality Result with Application to Mutual
Information Estimation
- URL: http://arxiv.org/abs/2111.06676v1
- Date: Fri, 12 Nov 2021 11:54:17 GMT
- Title: A Reverse Jensen Inequality Result with Application to Mutual
Information Estimation
- Authors: Gerhard Wunder, Benedikt Groß, Rick Fritschek, Rafael F. Schaefer
- Abstract summary: In a probabilistic setting, the Jensen inequality describes the relationship between a convex function and the expected value.
We show that under minimal constraints and with a proper scaling, the Jensen inequality can be reversed.
- Score: 27.35611916229265
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Jensen inequality is a widely used tool in a multitude of
fields, such as information theory and machine learning. It can also be
used to derive other standard inequalities, such as the inequality of
arithmetic and geometric means or the Hölder inequality. In a probabilistic
setting, the Jensen inequality describes the relationship between a convex
function and the expected value. In this work, we look at the probabilistic
setting from the reverse direction of the inequality. We show that, under
minimal constraints and with a proper scaling, the Jensen inequality can be
reversed. We believe that the resulting tool can be helpful for many
applications, and we provide a variational estimation of mutual information
where the reverse inequality leads to a new estimator with superior
training behavior compared to current estimators.
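For reference, the probabilistic Jensen inequality the abstract refers to, together with a schematic of what a reverse, scaled version asserts. The second display is only a sketch of the direction of the result; the paper's exact constants and constraints are not reproduced here.

```latex
% Probabilistic Jensen inequality: for a convex function f and an
% integrable random variable X,
\[
  f\big(\mathbb{E}[X]\big) \;\le\; \mathbb{E}\big[f(X)\big].
\]
% Schematic of a reverse direction under scaling (illustrative form only):
\[
  \mathbb{E}\big[f(X)\big] \;\le\; c_f \, f\big(\mathbb{E}[X]\big) + \varepsilon_f ,
\]
% with c_f and \varepsilon_f depending on f and mild conditions on X.
```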
Related papers
- What Hides behind Unfairness? Exploring Dynamics Fairness in Reinforcement Learning [52.51430732904994]
In reinforcement learning problems, agents must consider long-term fairness while maximizing returns.
Recent works have proposed many different types of fairness notions, but how unfairness arises in RL problems remains unclear.
We introduce a novel notion called dynamics fairness, which explicitly captures the inequality stemming from environmental dynamics.
arXiv Detail & Related papers (2024-04-16T22:47:59Z)
- Certification of multi-qubit quantum systems with temporal inequalities [0.0]
We propose temporal inequalities derived from non-contextuality inequalities for multi-qubit systems.
We demonstrate that the new inequalities can be maximally violated via a sequential measurement scenario.
We are able to certify multi-qubit graph states and the measurements performed on them.
arXiv Detail & Related papers (2024-04-03T13:08:11Z)
- The Representation Jensen-Shannon Divergence [0.0]
Quantifying the difference between probability distributions is crucial in machine learning.
This work proposes the representation Jensen-Shannon divergence (RJSD), a novel measure inspired by the traditional Jensen-Shannon divergence.
Our results demonstrate RJSD's superiority in two-sample testing, distribution shift detection, and unsupervised domain adaptation.
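As background, a minimal Python sketch of the classical Jensen-Shannon divergence that RJSD is inspired by. The representation-space construction from the paper is not reproduced here, and the function names are illustrative.

```python
# Classical Jensen-Shannon divergence between two discrete distributions.
import numpy as np

def kl_divergence(p: np.ndarray, q: np.ndarray) -> float:
    """KL(p || q); assumes q > 0 wherever p > 0."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def jensen_shannon_divergence(p: np.ndarray, q: np.ndarray) -> float:
    """JSD(p, q) = 0.5 * KL(p || m) + 0.5 * KL(q || m), m = (p + q) / 2."""
    m = 0.5 * (p + q)
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

p = np.array([0.5, 0.3, 0.2])
q = np.array([0.2, 0.3, 0.5])
print(jensen_shannon_divergence(p, q))  # symmetric, bounded by log(2)
```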
arXiv Detail & Related papers (2023-05-25T19:44:36Z)
- Similarity, Compression and Local Steps: Three Pillars of Efficient Communications for Distributed Variational Inequalities [91.12425544503395]
Variational inequalities are used in various applications ranging from equilibrium search to adversarial learning.
Most distributed approaches share a bottleneck: the cost of communication.
The three main techniques to reduce the total number of communication rounds and the cost of one such round are the similarity of local functions, compression of transmitted information, and local updates.
The methods presented in this paper have the best theoretical guarantees of communication complexity and are significantly ahead of other methods for distributed variational inequalities.
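To make the compression technique concrete, a minimal sketch of top-k sparsification, one generic (biased) compressor of the kind such methods use before transmitting a message; it is not the specific operator analyzed in the paper.

```python
# Top-k sparsification: transmit only the k largest-magnitude entries.
import numpy as np

def top_k_compress(vector: np.ndarray, k: int) -> np.ndarray:
    """Keep the k largest-magnitude entries, zero the rest."""
    out = np.zeros_like(vector)
    if k <= 0:
        return out
    idx = np.argpartition(np.abs(vector), -k)[-k:]
    out[idx] = vector[idx]
    return out

g = np.random.randn(10)
print(top_k_compress(g, 3))  # only 3 values (plus indices) need sending
```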
arXiv Detail & Related papers (2023-02-15T12:11:27Z)
- Information Processing Equalities and the Information-Risk Bridge [10.451984251615512]
We introduce two new classes of measures of information for statistical experiments.
We derive a simple geometrical relationship between measures of information and the Bayes risk of a statistical decision problem.
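For context, the standard definition of the Bayes risk that such information-risk relationships are built on; the paper's particular information measures and the geometric bridge itself are not reproduced here.

```latex
% Bayes risk of a statistical decision problem: prior pi, loss L,
% decision rule delta acting on the observation X ~ P_theta.
\[
  r(\pi) \;=\; \inf_{\delta}\;
  \mathbb{E}_{\theta \sim \pi}\,
  \mathbb{E}_{X \sim P_\theta}
  \big[ L\big(\theta, \delta(X)\big) \big].
\]
```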
arXiv Detail & Related papers (2022-07-25T08:54:36Z)
- Uncertainty Modeling for Out-of-Distribution Generalization [56.957731893992495]
We argue that the feature statistics can be properly manipulated to improve the generalization ability of deep learning models.
Common methods often consider the feature statistics as deterministic values measured from the learned features.
We improve the network generalization ability by modeling the uncertainty of domain shifts with synthesized feature statistics during training.
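A minimal sketch of the idea of treating feature statistics as uncertain rather than deterministic: perturb per-channel mean and standard deviation with sampled noise during training, then re-standardize. The variance estimates and placement here are illustrative assumptions, not the paper's exact recipe.

```python
# Synthesize feature statistics by sampling around the observed mean/std.
import torch

def perturb_feature_statistics(x: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """x: (batch, channels, height, width) feature map."""
    mu = x.mean(dim=(2, 3), keepdim=True)          # per-sample channel mean
    sigma = x.std(dim=(2, 3), keepdim=True) + eps  # per-sample channel std
    # Uncertainty of the statistics, estimated across the batch (assumption).
    mu_scale = mu.std(dim=0, keepdim=True)
    sigma_scale = sigma.std(dim=0, keepdim=True)
    # Sample perturbed statistics and re-standardize with them.
    new_mu = mu + torch.randn_like(mu) * mu_scale
    new_sigma = sigma + torch.randn_like(sigma) * sigma_scale
    return (x - mu) / sigma * new_sigma + new_mu

x = torch.randn(8, 16, 4, 4)
print(perturb_feature_statistics(x).shape)  # torch.Size([8, 16, 4, 4])
```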
arXiv Detail & Related papers (2022-02-08T16:09:12Z)
- Loss function based second-order Jensen inequality and its application to particle variational inference [112.58907653042317]
Particle variational inference (PVI) uses an ensemble of models as an empirical approximation for the posterior distribution.
PVI iteratively updates each model with a repulsion force to ensure the diversity of the optimized models.
We derive a novel generalization error bound and show that it can be reduced by enhancing the diversity of models.
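As a sketch of the repulsion idea in particle variational inference, an SVGD-style update in which a kernel-gradient term pushes particles apart while a score term attracts them to the target; this is a generic illustration, not the paper's algorithm or its second-order Jensen analysis.

```python
# SVGD-style particle update: attraction via grad log p, repulsion via kernel.
import numpy as np

def rbf_kernel(x: np.ndarray, h: float = 1.0):
    """Pairwise RBF kernel values and gradients w.r.t. the first argument."""
    diff = x[:, None, :] - x[None, :, :]                    # (n, n, d)
    k = np.exp(-np.sum(diff**2, axis=-1) / (2 * h**2))      # (n, n)
    grad_k = -diff / h**2 * k[:, :, None]                   # (n, n, d)
    return k, grad_k

def svgd_step(x: np.ndarray, grad_log_p, step: float = 0.1) -> np.ndarray:
    n = x.shape[0]
    k, grad_k = rbf_kernel(x)
    phi = (k @ grad_log_p(x) + grad_k.sum(axis=0)) / n      # driving force
    return x + step * phi

# Target: standard Gaussian, so grad log p(x) = -x.
particles = np.random.randn(50, 2) * 3
for _ in range(200):
    particles = svgd_step(particles, lambda x: -x)
print(particles.mean(axis=0), particles.std(axis=0))  # roughly 0 mean, 1 std
```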
arXiv Detail & Related papers (2021-06-09T12:13:51Z)
- Scalable Personalised Item Ranking through Parametric Density Estimation [53.44830012414444]
Learning from implicit feedback is challenging because of the difficult nature of the one-class problem.
Most conventional methods use a pairwise ranking approach and negative samplers to cope with the one-class problem.
We propose a learning-to-rank approach, which achieves convergence speed comparable to the pointwise counterpart.
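For contrast, a minimal sketch of the conventional pairwise (BPR-style) objective with negative sampling that the summary mentions; the paper's density-estimation-based approach itself is not reproduced here.

```python
# Pairwise ranking loss: -log sigmoid(s_pos - s_neg) over sampled pairs.
import numpy as np

def bpr_loss(score_pos: np.ndarray, score_neg: np.ndarray) -> float:
    """Average -log sigmoid(s_pos - s_neg); log1p(exp(-d)) is the stable form."""
    diff = score_pos - score_neg
    return float(np.mean(np.log1p(np.exp(-diff))))

pos = np.array([2.0, 1.5, 0.3])  # scores of observed items
neg = np.array([0.5, 1.0, 0.7])  # scores of negatively sampled items
print(bpr_loss(pos, neg))
```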
arXiv Detail & Related papers (2021-05-11T03:38:16Z)
- New-Type Hoeffding's Inequalities and Application in Tail Bounds [17.714164324169037]
We present a new type of Hoeffding inequality in which the high-order moments of the random variables are taken into account.
It yields considerable improvements in tail-bound evaluation compared with the known results.
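For reference, the classical Hoeffding inequality that such higher-moment variants refine; the paper's new bounds are not reproduced here.

```latex
% Classical Hoeffding inequality: for independent X_1, ..., X_n with
% X_i in [a_i, b_i] almost surely and any t > 0,
\[
  \Pr\!\Big( \sum_{i=1}^{n} \big(X_i - \mathbb{E}[X_i]\big) \ge t \Big)
  \;\le\;
  \exp\!\left( \frac{-2 t^2}{\sum_{i=1}^{n} (b_i - a_i)^2} \right).
\]
```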
arXiv Detail & Related papers (2021-01-02T03:19:11Z)
- Fractional norms and quasinorms do not help to overcome the curse of dimensionality [62.997667081978825]
It is often claimed that using the Manhattan distance and even fractional quasinorms lp (p < 1) can help to overcome the curse of dimensionality in classification problems.
A systematic comparison shows that the difference of the performance of kNN based on lp for p=2, 1, and 0.5 is statistically insignificant.
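A minimal sketch of the comparison's ingredients: the l_p (quasi)norm distance for p = 2, 1, and 0.5, with a tiny brute-force kNN classifier. This illustrates the setup, not the paper's experimental protocol.

```python
# l_p distances (a quasinorm for 0 < p < 1) and a brute-force kNN vote.
import numpy as np

def lp_distance(a: np.ndarray, b: np.ndarray, p: float) -> float:
    """l_p norm for p >= 1; a quasinorm (no triangle inequality) for 0 < p < 1."""
    return float(np.sum(np.abs(a - b) ** p) ** (1.0 / p))

def knn_predict(x: np.ndarray, train_x: np.ndarray, train_y: np.ndarray,
                k: int, p: float) -> int:
    dists = np.array([lp_distance(x, t, p) for t in train_x])
    nearest = train_y[np.argsort(dists)[:k]]
    return int(np.bincount(nearest).argmax())  # majority vote

rng = np.random.default_rng(0)
train_x = rng.normal(size=(100, 20))
train_y = (train_x[:, 0] > 0).astype(int)      # toy labels
query = rng.normal(size=20)
for p in (2.0, 1.0, 0.5):
    print(p, knn_predict(query, train_x, train_y, k=5, p=p))
```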
arXiv Detail & Related papers (2020-04-29T14:30:12Z)
- Concentration inequality using unconfirmed knowledge [2.538209532048867]
We give a concentration inequality based on the premise that random variables take values within a particular region.
Our inequality yields tighter bounds than other well-known concentration inequalities in this setting.
arXiv Detail & Related papers (2020-02-11T13:02:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.