The Principle of Uncertain Maximum Entropy
- URL: http://arxiv.org/abs/2305.09868v2
- Date: Mon, 19 Jun 2023 19:46:32 GMT
- Title: The Principle of Uncertain Maximum Entropy
- Authors: Kenneth Bogert, Matthew Kothe
- Abstract summary: The principle of maximum entropy has contributed to advancements in various domains such as Statistical Mechanics, Machine Learning, and Ecology.
Here we show the Principle of Uncertain Maximum Entropy as a method that encodes all available information in spite of arbitrarily noisy observations.
We utilize the output of a black-box machine learning model as input into an uncertain maximum entropy model, resulting in a novel approach for scenarios where the observation function is unavailable.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: The principle of maximum entropy, as introduced by Jaynes in information
theory, has contributed to advancements in various domains such as Statistical
Mechanics, Machine Learning, and Ecology. Its resultant solutions have served
as a catalyst, facilitating researchers in mapping their empirical observations
to the acquisition of unbiased models, whilst deepening the understanding of
complex systems and phenomena. However, when we consider situations in which
the model elements are not directly observable, such as when noise or ocular
occlusion is present, possibilities arise for which standard maximum entropy
approaches may fail, as they are unable to match feature constraints. Here we
show the Principle of Uncertain Maximum Entropy as a method that both encodes
all available information in spite of arbitrarily noisy observations and
surpasses the accuracy of some ad-hoc methods. Additionally, we utilize the
output of a black-box machine learning model as input into an uncertain maximum
entropy model, resulting in a novel approach for scenarios where the
observation function is unavailable. Previous remedies either relaxed feature
constraints when accounting for observation error, given well-characterized
errors such as zero-mean Gaussian, or chose to simply select the most likely
model element given an observation. We anticipate our principle finding broad
applications in diverse fields due to generalizing the traditional maximum
entropy method with the ability to utilize uncertain observations.
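As a concrete illustration of how such a model could be estimated, the sketch below is a minimal reading of the abstract and of the expectation-maximization based solution mentioned in the related IRL paper below, not the authors' reference implementation. It assumes a finite set of model elements, a log-linear parameterization Pr(x) proportional to exp(theta . phi(x)), and a feature constraint that matches model feature expectations to posterior-weighted expectations over the noisy observations; the black-box variant is read here as simply substituting a classifier's per-observation class probabilities for the Bayes posterior. All function names and toy numbers are illustrative.

```python
# Hedged sketch of an EM-style solver for an uncertain maximum entropy model.
# Assumptions (illustrative, not from the paper): a finite set of model
# elements x, a log-linear model Pr(x) proportional to exp(theta . phi(x)),
# and noisy observations o summarized either by a known observation function
# Pr(o|x) or by the class probabilities of a black-box classifier.
import numpy as np


def maxent_fit(phi, target, lr=0.5, iters=2000):
    """Standard MaxEnt step: find theta so that E_theta[phi] matches target.

    phi    : (n_x, n_k) feature matrix, one row per model element x
    target : (n_k,) feature expectations the fitted model must match
    """
    theta = np.zeros(phi.shape[1])
    for _ in range(iters):
        logits = phi @ theta
        p = np.exp(logits - logits.max())
        p /= p.sum()                      # current Pr_theta(x)
        theta += lr * (target - p @ phi)  # gradient ascent on the dual
    logits = phi @ theta
    p = np.exp(logits - logits.max())
    return theta, p / p.sum()


def uncertain_maxent(phi, obs_counts, posterior_fn, em_iters=50):
    """EM-style fixed point for the uncertain MaxEnt feature constraints.

    obs_counts   : (n_o,) empirical counts of each distinct observation o
    posterior_fn : callable(prior) -> (n_o, n_x) matrix Pr(x | o) under prior
    """
    emp_o = obs_counts / obs_counts.sum()      # empirical Pr~(o)
    prior = np.full(phi.shape[0], 1.0 / phi.shape[0])
    for _ in range(em_iters):
        post = posterior_fn(prior)             # E-step: Pr(x | o)
        target = emp_o @ (post @ phi)          # sum_o Pr~(o) sum_x Pr(x|o) phi(x)
        _, prior = maxent_fit(phi, target)     # M-step: ordinary MaxEnt
    return prior


if __name__ == "__main__":
    # Toy setting: 3 model elements, 2 features, 2 distinct observations.
    phi = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    obs_counts = np.array([60.0, 40.0])

    # Case 1: a known observation function Pr(o|x) gives a Bayes posterior.
    obs_fn = np.array([[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]])  # rows x, cols o

    def bayes_posterior(prior):
        joint = obs_fn.T * prior                              # Pr(o|x) Pr(x)
        return joint / joint.sum(axis=1, keepdims=True)

    print(uncertain_maxent(phi, obs_counts, bayes_posterior))

    # Case 2: observation function unavailable; reuse a black-box classifier's
    # per-observation class probabilities in place of the Bayes posterior.
    clf_probs = np.array([[0.8, 0.1, 0.1], [0.15, 0.7, 0.15]])
    print(uncertain_maxent(phi, obs_counts, lambda prior: clf_probs))
```

Because the constraint's right-hand side depends on the model itself, each iteration solves an ordinary maximum entropy problem against a re-estimated feature target; with a fixed black-box posterior the fixed point is reached after a single iteration.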
Related papers
- Asymptotic quantification of entanglement with a single copy [8.056359341994941]
This paper introduces a new way of benchmarking the protocol of entanglement distillation (purification).
Instead of measuring its yield, we focus on the best error achievable.
We show this solution to be given by the reverse relative entropy of entanglement, a single-letter quantity that can be evaluated using only a single copy of a quantum state.
arXiv Detail & Related papers (2024-08-13T17:57:59Z) - A Unified Theory of Stochastic Proximal Point Methods without Smoothness [52.30944052987393]
Proximal point methods have attracted considerable interest owing to their numerical stability and robustness against imperfect tuning.
This paper presents a comprehensive analysis of a broad range of variations of the stochastic proximal point method (SPPM).
arXiv Detail & Related papers (2024-05-24T21:09:19Z) - ODE Discovery for Longitudinal Heterogeneous Treatment Effects Inference [69.24516189971929]
In this paper, we introduce a new type of solution in the longitudinal setting: a closed-form ordinary differential equation (ODE).
While we still rely on continuous optimization to learn an ODE, the resulting inference machine is no longer a neural network.
arXiv Detail & Related papers (2024-03-16T02:07:45Z) - The Principle of Minimum Pressure Gradient: An Alternative Basis for Physics-Informed Learning of Incompressible Fluid Mechanics [0.0]
The proposed approach uses the principle of minimum pressure gradient combined with the continuity constraint to train a neural network and predict the flow field in incompressible fluids.
We show that it reduces the computational time per training epoch when compared to the conventional approach.
arXiv Detail & Related papers (2024-01-15T06:12:22Z) - A Primal-Dual Approach to Solving Variational Inequalities with General Constraints [54.62996442406718]
Yang et al. (2023) recently showed how to use first-order gradient methods to solve general variational inequalities.
We prove the convergence of this method and show that the gap function of its last iterate decreases at a rate of $O(\frac{1}{\sqrt{K}})$ when the operator is $L$-Lipschitz and monotone.
arXiv Detail & Related papers (2022-10-27T17:59:09Z) - IRL with Partial Observations using the Principle of Uncertain Maximum Entropy [8.296684637620553]
We introduce the principle of uncertain maximum entropy and present an expectation-maximization based solution.
We experimentally demonstrate the improved robustness to noisy data offered by our technique in a maximum causal entropy inverse reinforcement learning domain.
arXiv Detail & Related papers (2022-08-15T03:22:46Z) - Notes on Generalizing the Maximum Entropy Principle to Uncertain Data [0.0]
We generalize the principle of maximum entropy for computing a distribution with the least amount of information possible.
We show that our technique generalizes the principle of maximum entropy and latent maximum entropy.
We discuss a generally applicable regularization technique for adding error terms to feature expectation constraints in the event of limited data.
arXiv Detail & Related papers (2021-09-09T19:43:28Z) - Deep learning: a statistical viewpoint [120.94133818355645]
Deep learning has revealed some major surprises from a theoretical perspective.
In particular, simple gradient methods easily find near-optimal solutions to non-convex optimization problems.
We conjecture that specific principles underlie these phenomena.
arXiv Detail & Related papers (2021-03-16T16:26:36Z) - Optimal oracle inequalities for solving projected fixed-point equations [53.31620399640334]
We study methods that use a collection of random observations to compute approximate solutions by searching over a known low-dimensional subspace of the Hilbert space.
We show how our results precisely characterize the error of a class of temporal difference learning methods for the policy evaluation problem with linear function approximation.
arXiv Detail & Related papers (2020-12-09T20:19:32Z) - Density Fixing: Simple yet Effective Regularization Method based on the Class Prior [2.3859169601259347]
We propose a framework of regularization methods, called density-fixing, that can be used commonly for supervised and semi-supervised learning.
Our proposed regularization method improves the generalization performance by forcing the model to approximate the class's prior distribution or the frequency of occurrence.
arXiv Detail & Related papers (2020-07-08T04:58:22Z) - Learning the Truth From Only One Side of the Story [58.65439277460011]
We focus on generalized linear models and show that without adjusting for this sampling bias, the model may converge suboptimally or even fail to converge to the optimal solution.
We propose an adaptive approach that comes with theoretical guarantees and show that it outperforms several existing methods empirically.
arXiv Detail & Related papers (2020-06-08T18:20:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.