A Reverse Jensen Inequality Result with Application to Mutual
Information Estimation
- URL: http://arxiv.org/abs/2111.06676v1
- Date: Fri, 12 Nov 2021 11:54:17 GMT
- Title: A Reverse Jensen Inequality Result with Application to Mutual
Information Estimation
- Authors: Gerhard Wunder, Benedikt Groß, Rick Fritschek, Rafael F. Schaefer
- Abstract summary: In a probabilistic setting, the Jensen inequality describes the relationship between a convex function and the expected value.
We show that under minimal constraints and with a proper scaling, the Jensen inequality can be reversed.
- Score: 27.35611916229265
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Jensen inequality is a widely used tool in a multitude of
fields, such as information theory and machine learning. It can also be
used to derive other standard inequalities, such as the inequality of
arithmetic and geometric means or the Hölder inequality. In a probabilistic
setting, the Jensen inequality describes the relationship between a convex
function and the expected value. In this work, we look at the probabilistic
setting from the reverse direction of the inequality. We show that, under
minimal constraints and with a proper scaling, the Jensen inequality can be
reversed. We believe that the resulting tool can be helpful for many
applications, and we provide a variational estimation of mutual information
where the reverse inequality leads to a new estimator with superior
training behavior compared to current estimators.
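For reference, the probabilistic Jensen inequality the abstract refers to, together with a schematic of what a reverse, scaled version asserts. The second display is only a sketch of the direction of the result; the paper's exact constants and constraints are not reproduced here.

```latex
% Probabilistic Jensen inequality: for a convex function f and an
% integrable random variable X,
\[
  f\big(\mathbb{E}[X]\big) \;\le\; \mathbb{E}\big[f(X)\big].
\]
% Schematic of a reverse direction under scaling (illustrative form only):
\[
  \mathbb{E}\big[f(X)\big] \;\le\; c_f \, f\big(\mathbb{E}[X]\big) + \varepsilon_f ,
\]
% with c_f and \varepsilon_f depending on f and mild conditions on X.
```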
Related papers
- What Hides behind Unfairness? Exploring Dynamics Fairness in Reinforcement Learning [52.51430732904994]
In reinforcement learning problems, agents must consider long-term fairness while maximizing returns.
Recent works have proposed many different types of fairness notions, but how unfairness arises in RL problems remains unclear.
We introduce a novel notion called dynamics fairness, which explicitly captures the inequality stemming from environmental dynamics.
arXiv Detail & Related papers (2024-04-16T22:47:59Z)
- Certification of multi-qubit quantum systems with temporal inequalities [0.0]
We propose temporal inequalities derived from non-contextuality inequalities for multi-qubit systems.
We demonstrate that the new inequalities can be maximally violated via a sequential measurement scenario.
We are able to certify multi-qubit graph states and the measurements performed on them.
arXiv Detail & Related papers (2024-04-03T13:08:11Z)
- The Representation Jensen-Shannon Divergence [0.0]
Quantifying the difference between probability distributions is crucial in machine learning.
This work proposes the representation Jensen-Shannon divergence (RJSD), a novel measure inspired by the traditional Jensen-Shannon divergence.
Our results demonstrate RJSD's superiority in two-sample testing, distribution shift detection, and unsupervised domain adaptation.
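As background, a minimal Python sketch of the classical Jensen-Shannon divergence that RJSD is inspired by. The representation-space construction from the paper is not reproduced here, and the function names are illustrative.

```python
# Classical Jensen-Shannon divergence between two discrete distributions.
import numpy as np

def kl_divergence(p: np.ndarray, q: np.ndarray) -> float:
    """KL(p || q); assumes q > 0 wherever p > 0."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def jensen_shannon_divergence(p: np.ndarray, q: np.ndarray) -> float:
    """JSD(p, q) = 0.5 * KL(p || m) + 0.5 * KL(q || m), m = (p + q) / 2."""
    m = 0.5 * (p + q)
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

p = np.array([0.5, 0.3, 0.2])
q = np.array([0.2, 0.3, 0.5])
print(jensen_shannon_divergence(p, q))  # symmetric, bounded by log(2)
```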
arXiv Detail & Related papers (2023-05-25T19:44:36Z)
- Similarity, Compression and Local Steps: Three Pillars of Efficient Communications for Distributed Variational Inequalities [91.12425544503395]
Variational inequalities are used in various applications ranging from equilibrium search to adversarial learning.
Most distributed approaches share a bottleneck: the cost of communication.
The three main techniques to reduce the total number of communication rounds and the cost of one such round are the similarity of local functions, compression of transmitted information, and local updates.
The methods presented in this paper have the best theoretical guarantees of communication complexity and are significantly ahead of other methods for distributed variational inequalities.
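To make the compression technique concrete, a minimal sketch of top-k sparsification, one generic (biased) compressor of the kind such methods use before transmitting a message; it is not the specific operator analyzed in the paper.

```python
# Top-k sparsification: transmit only the k largest-magnitude entries.
import numpy as np

def top_k_compress(vector: np.ndarray, k: int) -> np.ndarray:
    """Keep the k largest-magnitude entries, zero the rest."""
    out = np.zeros_like(vector)
    if k <= 0:
        return out
    idx = np.argpartition(np.abs(vector), -k)[-k:]
    out[idx] = vector[idx]
    return out

g = np.random.randn(10)
print(top_k_compress(g, 3))  # only 3 values (plus indices) need sending
```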
arXiv Detail & Related papers (2023-02-15T12:11:27Z)
- Information Processing Equalities and the Information-Risk Bridge [10.451984251615512]
We introduce two new classes of measures of information for statistical experiments.
We derive a simple geometrical relationship between measures of information and the Bayes risk of a statistical decision problem.
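For context, the standard definition of the Bayes risk that such information-risk relationships are built on; the paper's particular information measures and the geometric bridge itself are not reproduced here.

```latex
% Bayes risk of a statistical decision problem: prior pi, loss L,
% decision rule delta acting on the observation X ~ P_theta.
\[
  r(\pi) \;=\; \inf_{\delta}\;
  \mathbb{E}_{\theta \sim \pi}\,
  \mathbb{E}_{X \sim P_\theta}
  \big[ L\big(\theta, \delta(X)\big) \big].
\]
```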
arXiv Detail & Related papers (2022-07-25T08:54:36Z)
- Uncertainty Modeling for Out-of-Distribution Generalization [56.957731893992495]
We argue that the feature statistics can be properly manipulated to improve the generalization ability of deep learning models.
Common methods often consider the feature statistics as deterministic values measured from the learned features.
We improve the network generalization ability by modeling the uncertainty of domain shifts with synthesized feature statistics during training.
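A minimal sketch of the idea of treating feature statistics as uncertain rather than deterministic: perturb per-channel mean and standard deviation with sampled noise during training, then re-standardize. The variance estimates and placement here are illustrative assumptions, not the paper's exact recipe.

```python
# Synthesize feature statistics by sampling around the observed mean/std.
import torch

def perturb_feature_statistics(x: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """x: (batch, channels, height, width) feature map."""
    mu = x.mean(dim=(2, 3), keepdim=True)          # per-sample channel mean
    sigma = x.std(dim=(2, 3), keepdim=True) + eps  # per-sample channel std
    # Uncertainty of the statistics, estimated across the batch (assumption).
    mu_scale = mu.std(dim=0, keepdim=True)
    sigma_scale = sigma.std(dim=0, keepdim=True)
    # Sample perturbed statistics and re-standardize with them.
    new_mu = mu + torch.randn_like(mu) * mu_scale
    new_sigma = sigma + torch.randn_like(sigma) * sigma_scale
    return (x - mu) / sigma * new_sigma + new_mu

x = torch.randn(8, 16, 4, 4)
print(perturb_feature_statistics(x).shape)  # torch.Size([8, 16, 4, 4])
```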
arXiv Detail & Related papers (2022-02-08T16:09:12Z)
- Loss function based second-order Jensen inequality and its application to particle variational inference [112.58907653042317]
Particle variational inference (PVI) uses an ensemble of models as an empirical approximation for the posterior distribution.
PVI iteratively updates each model with a repulsion force to ensure the diversity of the optimized models.
We derive a novel generalization error bound and show that it can be reduced by enhancing the diversity of models.
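As a sketch of the repulsion idea in particle variational inference, an SVGD-style update in which a kernel-gradient term pushes particles apart while a score term attracts them to the target; this is a generic illustration, not the paper's algorithm or its second-order Jensen analysis.

```python
# SVGD-style particle update: attraction via grad log p, repulsion via kernel.
import numpy as np

def rbf_kernel(x: np.ndarray, h: float = 1.0):
    """Pairwise RBF kernel values and gradients w.r.t. the first argument."""
    diff = x[:, None, :] - x[None, :, :]                    # (n, n, d)
    k = np.exp(-np.sum(diff**2, axis=-1) / (2 * h**2))      # (n, n)
    grad_k = -diff / h**2 * k[:, :, None]                   # (n, n, d)
    return k, grad_k

def svgd_step(x: np.ndarray, grad_log_p, step: float = 0.1) -> np.ndarray:
    n = x.shape[0]
    k, grad_k = rbf_kernel(x)
    phi = (k @ grad_log_p(x) + grad_k.sum(axis=0)) / n      # driving force
    return x + step * phi

# Target: standard Gaussian, so grad log p(x) = -x.
particles = np.random.randn(50, 2) * 3
for _ in range(200):
    particles = svgd_step(particles, lambda x: -x)
print(particles.mean(axis=0), particles.std(axis=0))  # roughly 0 mean, 1 std
```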
arXiv Detail & Related papers (2021-06-09T12:13:51Z)
- Scalable Personalised Item Ranking through Parametric Density Estimation [53.44830012414444]
Learning from implicit feedback is challenging because of the difficult nature of the one-class problem.
Most conventional methods use a pairwise ranking approach and negative samplers to cope with the one-class problem.
We propose a learning-to-rank approach, which achieves convergence speed comparable to the pointwise counterpart.
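For contrast, a minimal sketch of the conventional pairwise (BPR-style) objective with negative sampling that the summary mentions; the paper's density-estimation-based approach itself is not reproduced here.

```python
# Pairwise ranking loss: -log sigmoid(s_pos - s_neg) over sampled pairs.
import numpy as np

def bpr_loss(score_pos: np.ndarray, score_neg: np.ndarray) -> float:
    """Average -log sigmoid(s_pos - s_neg); log1p(exp(-d)) is the stable form."""
    diff = score_pos - score_neg
    return float(np.mean(np.log1p(np.exp(-diff))))

pos = np.array([2.0, 1.5, 0.3])  # scores of observed items
neg = np.array([0.5, 1.0, 0.7])  # scores of negatively sampled items
print(bpr_loss(pos, neg))
```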
arXiv Detail & Related papers (2021-05-11T03:38:16Z)
- New-Type Hoeffding's Inequalities and Application in Tail Bounds [17.714164324169037]
We present a new type of Hoeffding inequality in which the high-order moments of the random variables are taken into account.
It yields considerable improvements in tail-bound evaluation compared with the known results.
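For reference, the classical Hoeffding inequality that such higher-moment variants refine; the paper's new bounds are not reproduced here.

```latex
% Classical Hoeffding inequality: for independent X_1, ..., X_n with
% X_i in [a_i, b_i] almost surely and any t > 0,
\[
  \Pr\!\Big( \sum_{i=1}^{n} \big(X_i - \mathbb{E}[X_i]\big) \ge t \Big)
  \;\le\;
  \exp\!\left( \frac{-2 t^2}{\sum_{i=1}^{n} (b_i - a_i)^2} \right).
\]
```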
arXiv Detail & Related papers (2021-01-02T03:19:11Z)
- Fractional norms and quasinorms do not help to overcome the curse of dimensionality [62.997667081978825]
It is often claimed that using the Manhattan distance and even fractional quasinorms lp (p < 1) can help to overcome the curse of dimensionality in classification problems.
A systematic comparison shows that the difference of the performance of kNN based on lp for p=2, 1, and 0.5 is statistically insignificant.
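A minimal sketch of the comparison's ingredients: the l_p (quasi)norm distance for p = 2, 1, and 0.5, with a tiny brute-force kNN classifier. This illustrates the setup, not the paper's experimental protocol.

```python
# l_p distances (a quasinorm for 0 < p < 1) and a brute-force kNN vote.
import numpy as np

def lp_distance(a: np.ndarray, b: np.ndarray, p: float) -> float:
    """l_p norm for p >= 1; a quasinorm (no triangle inequality) for 0 < p < 1."""
    return float(np.sum(np.abs(a - b) ** p) ** (1.0 / p))

def knn_predict(x: np.ndarray, train_x: np.ndarray, train_y: np.ndarray,
                k: int, p: float) -> int:
    dists = np.array([lp_distance(x, t, p) for t in train_x])
    nearest = train_y[np.argsort(dists)[:k]]
    return int(np.bincount(nearest).argmax())  # majority vote

rng = np.random.default_rng(0)
train_x = rng.normal(size=(100, 20))
train_y = (train_x[:, 0] > 0).astype(int)      # toy labels
query = rng.normal(size=20)
for p in (2.0, 1.0, 0.5):
    print(p, knn_predict(query, train_x, train_y, k=5, p=p))
```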
arXiv Detail & Related papers (2020-04-29T14:30:12Z)
- Concentration inequality using unconfirmed knowledge [2.538209532048867]
We give a concentration inequality based on the premise that random variables take values within a particular region.
Our inequality yields tighter bounds than other well-known concentration inequalities in this setting.
arXiv Detail & Related papers (2020-02-11T13:02:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.