The Principle of Uncertain Maximum Entropy
- URL: http://arxiv.org/abs/2305.09868v2
- Date: Mon, 19 Jun 2023 19:46:32 GMT
- Title: The Principle of Uncertain Maximum Entropy
- Authors: Kenneth Bogert, Matthew Kothe
- Abstract summary: The principle of maximum entropy has contributed to advancements in various domains such as Statistical Mechanics, Machine Learning, and Ecology.
Here we present the Principle of Uncertain Maximum Entropy as a method that encodes all available information in spite of arbitrarily noisy observations.
We utilize the output of a black-box machine learning model as input into an uncertain maximum entropy model, resulting in a novel approach for scenarios where the observation function is unavailable.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: The principle of maximum entropy, as introduced by Jaynes in information
theory, has contributed to advancements in various domains such as Statistical
Mechanics, Machine Learning, and Ecology. Its resultant solutions have served
as a catalyst, facilitating researchers in mapping their empirical observations
to the acquisition of unbiased models, whilst deepening the understanding of
complex systems and phenomena. However, when we consider situations in which
the model elements are not directly observable, such as when noise or ocular
occlusion is present, possibilities arise for which standard maximum entropy
approaches may fail, as they are unable to match feature constraints. Here we
present the Principle of Uncertain Maximum Entropy, a method that both encodes
all available information in spite of arbitrarily noisy observations and
surpasses the accuracy of some ad-hoc methods. Additionally, we utilize the
output of a black-box machine learning model as input into an uncertain maximum
entropy model, resulting in a novel approach for scenarios where the
observation function is unavailable. Previous remedies either relaxed feature
constraints when accounting for observation error, given well-characterized
errors such as zero-mean Gaussian, or chose to simply select the most likely
model element given an observation. We anticipate that our principle will find
broad applications in diverse fields because it generalizes the traditional
maximum entropy method to utilize uncertain observations.
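
To make the abstract's idea of matching feature constraints under noisy observations concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes a small discrete set of model elements, a known observation model `obs_model[x, w] = P(w | x)` with every observation having nonzero probability, and an expectation-maximization style loop. The function names `fit_maxent` and `fit_uncertain_maxent`, the step counts, and the toy numbers are all hypothetical choices made for illustration.

```python
import numpy as np

def gibbs(phi, lam):
    """Return P(x) proportional to exp(phi[x] . lam), computed stably."""
    logits = phi @ lam
    p = np.exp(logits - logits.max())
    return p / p.sum()

def fit_maxent(phi, target, steps=500, lr=0.5):
    """Standard maximum entropy: match E_P[phi] to `target` by dual ascent."""
    lam = np.zeros(phi.shape[1])
    for _ in range(steps):
        p = gibbs(phi, lam)
        # Gradient of the concave dual: target minus model feature expectations.
        lam += lr * (target - p @ phi)
    return gibbs(phi, lam)

def fit_uncertain_maxent(phi, obs_model, obs_freq, iters=60):
    """Assumed EM-style loop for noisy observations (illustrative only).

    obs_model[x, w] = P(w | x); obs_freq[w] = empirical frequency of w.
    """
    p = np.full(phi.shape[0], 1.0 / phi.shape[0])  # start from uniform
    for _ in range(iters):
        # E-step: posterior P(x | w) under the current model P(x).
        joint = obs_model * p[:, None]
        post = joint / joint.sum(axis=0, keepdims=True)
        # Expected features given the observations actually seen.
        target = (post @ obs_freq) @ phi
        # M-step: ordinary maximum entropy fit to those expectations.
        p = fit_maxent(phi, target)
    return p

# Toy problem: three model elements, one-hot features, symmetric noise.
phi = np.eye(3)
obs_model = np.array([[0.8, 0.1, 0.1],
                      [0.1, 0.8, 0.1],
                      [0.1, 0.1, 0.8]])
true_p = np.array([0.6, 0.3, 0.1])
obs_freq = true_p @ obs_model  # what a noisy sensor would report on average

p_hat = fit_uncertain_maxent(phi, obs_model, obs_freq)
print("observed frequencies:", np.round(obs_freq, 3))
print("recovered model     :", np.round(p_hat, 3))
```

In this toy setting, treating the observed frequencies as if they were the model elements themselves would bias the estimate toward uniform, while the loop above uses the observation model to recover the underlying distribution. When the observation model is unknown, the abstract suggests feeding in the output of a black-box machine learning model instead; that variant is not sketched here.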
Related papers
- Asymptotic quantification of entanglement with a single copy [8.056359341994941]
This paper introduces a new way of benchmarking the protocol of entanglement distillation (purification).
Instead of measuring its yield, we focus on the best error achievable.
We show this solution to be given by the reverse relative entropy of entanglement, a single-letter quantity that can be evaluated using only a single copy of a quantum state.
arXiv Detail & Related papers (2024-08-13T17:57:59Z)
- On Maximum Entropy Linear Feature Inversion [7.1795069620810805]
We revisit the classical problem of inverting dimension-reducing linear mappings using the maximum entropy criterion.
We propose a new unified approach that not only specializes to the existing approaches, but offers solutions to new cases.
arXiv Detail & Related papers (2024-07-19T09:52:18Z)
- A Unified Theory of Stochastic Proximal Point Methods without Smoothness [52.30944052987393]
Proximal point methods have attracted considerable interest owing to their numerical stability and robustness against imperfect tuning.
This paper presents a comprehensive analysis of a broad range of variations of the stochastic proximal point method (SPPM).
arXiv Detail & Related papers (2024-05-24T21:09:19Z)
- ODE Discovery for Longitudinal Heterogeneous Treatment Effects Inference [69.24516189971929]
In this paper, we introduce a new type of solution in the longitudinal setting: a closed-form ordinary differential equation (ODE).
While we still rely on continuous optimization to learn an ODE, the resulting inference machine is no longer a neural network.
arXiv Detail & Related papers (2024-03-16T02:07:45Z)
- The Principle of Minimum Pressure Gradient: An Alternative Basis for Physics-Informed Learning of Incompressible Fluid Mechanics [0.0]
The proposed approach uses the principle of minimum pressure gradient combined with the continuity constraint to train a neural network and predict the flow field in incompressible fluids.
We show that it reduces the computational time per training epoch when compared to the conventional approach.
arXiv Detail & Related papers (2024-01-15T06:12:22Z)
- Instance-Dependent Generalization Bounds via Optimal Transport [51.71650746285469]
Existing generalization bounds fail to explain crucial factors that drive the generalization of modern neural networks.
We derive instance-dependent generalization bounds that depend on the local Lipschitz regularity of the learned prediction function in the data space.
We empirically analyze our generalization bounds for neural networks, showing that the bound values are meaningful and capture the effect of popular regularization methods during training.
arXiv Detail & Related papers (2022-11-02T16:39:42Z)
- A Primal-Dual Approach to Solving Variational Inequalities with General Constraints [54.62996442406718]
Yang et al. (2023) recently showed how to use first-order gradient methods to solve general variational inequalities.
We prove the convergence of this method and show that the gap function of the last iterate of the method decreases at a rate of $O(\frac{1}{\sqrt{K}})$ when the operator is $L$-Lipschitz and monotone.
arXiv Detail & Related papers (2022-10-27T17:59:09Z)
- On the Importance of Gradient Norm in PAC-Bayesian Bounds [92.82627080794491]
We propose a new generalization bound that exploits the contractivity of the log-Sobolev inequalities.
We empirically analyze the effect of this new loss-gradient norm term on different neural architectures.
arXiv Detail & Related papers (2022-10-12T12:49:20Z)
- IRL with Partial Observations using the Principle of Uncertain Maximum Entropy [8.296684637620553]
We introduce the principle of uncertain maximum entropy and present an expectation-maximization based solution.
We experimentally demonstrate the improved robustness to noisy data offered by our technique in a maximum causal entropy inverse reinforcement learning domain.
arXiv Detail & Related papers (2022-08-15T03:22:46Z)
- Principled Knowledge Extrapolation with GANs [92.62635018136476]
We study counterfactual synthesis from a new perspective of knowledge extrapolation.
We show that an adversarial game with a closed-form discriminator can be used to address the knowledge extrapolation problem.
Our method enjoys both elegant theoretical guarantees and superior performance in many scenarios.
arXiv Detail & Related papers (2022-05-21T08:39:42Z)
- Notes on Generalizing the Maximum Entropy Principle to Uncertain Data [0.0]
We generalize the principle of maximum entropy for computing a distribution with the least amount of information possible.
We show that our technique generalizes the principle of maximum entropy and latent maximum entropy.
We discuss a generally applicable regularization technique for adding error terms to feature expectation constraints in the event of limited data.
arXiv Detail & Related papers (2021-09-09T19:43:28Z)
- Invariance Principle Meets Information Bottleneck for Out-of-Distribution Generalization [77.24152933825238]
We show that for linear classification tasks we need stronger restrictions on the distribution shifts, or otherwise OOD generalization is impossible.
We prove that a form of the information bottleneck constraint along with invariance helps address key failures when invariant features capture all the information about the label and also retains the existing success when they do not.
arXiv Detail & Related papers (2021-06-11T20:42:27Z)
- Deep learning: a statistical viewpoint [120.94133818355645]
Deep learning has revealed some major surprises from a theoretical perspective.
In particular, simple gradient methods easily find near-optimal solutions to non-convex optimization problems.
We conjecture that specific principles underlie these phenomena.
arXiv Detail & Related papers (2021-03-16T16:26:36Z)
- Dimension Free Generalization Bounds for Non Linear Metric Learning [61.193693608166114]
We provide uniform generalization bounds for two regimes -- the sparse regime, and a non-sparse regime.
We show that by relying on a different, new property of the solutions, it is still possible to provide dimension free generalization guarantees.
arXiv Detail & Related papers (2021-02-07T14:47:00Z)
- Optimal oracle inequalities for solving projected fixed-point equations [53.31620399640334]
We study methods that use a collection of random observations to compute approximate solutions by searching over a known low-dimensional subspace of the Hilbert space.
We show how our results precisely characterize the error of a class of temporal difference learning methods for the policy evaluation problem with linear function approximation.
arXiv Detail & Related papers (2020-12-09T20:19:32Z)
- Integrable Nonparametric Flows [5.9774834479750805]
We introduce a method for reconstructing an infinitesimal normalizing flow given only an infinitesimal change to a probability distribution.
This reverses the conventional task of normalizing flows.
We discuss potential applications to problems in quantum Monte Carlo and machine learning.
arXiv Detail & Related papers (2020-12-03T16:19:52Z)
- Density Fixing: Simple yet Effective Regularization Method based on the Class Prior [2.3859169601259347]
We propose a framework of regularization methods, called density-fixing, that can be used commonly for supervised and semi-supervised learning.
Our proposed regularization method improves the generalization performance by forcing the model to approximate the class's prior distribution or the frequency of occurrence.
arXiv Detail & Related papers (2020-07-08T04:58:22Z)
- Learning the Truth From Only One Side of the Story [58.65439277460011]
We focus on generalized linear models and show that without adjusting for this sampling bias, the model may converge suboptimally or even fail to converge to the optimal solution.
We propose an adaptive approach that comes with theoretical guarantees and show that it outperforms several existing methods empirically.
arXiv Detail & Related papers (2020-06-08T18:20:28Z)
- Approximation Schemes for ReLU Regression [80.33702497406632]
We consider the fundamental problem of ReLU regression.
The goal is to output the best-fitting ReLU with respect to square loss, given access to draws from some unknown distribution.
arXiv Detail & Related papers (2020-05-26T16:26:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.