Efficient first-order algorithms for large-scale, non-smooth maximum
entropy models with application to wildfire science
- URL: http://arxiv.org/abs/2403.06816v1
- Date: Mon, 11 Mar 2024 15:33:55 GMT
- Title: Efficient first-order algorithms for large-scale, non-smooth maximum
entropy models with application to wildfire science
- Authors: Gabriel P. Langlois, Jatan Buch and Jérôme Darbon
- Abstract summary: We present novel algorithms for training large-scale, non-smooth Maxent models.
Our proposed first-order algorithms leverage the Kullback-Leibler divergence to train large-scale and non-smooth Maxent models efficiently.
Our numerical results show that our algorithms outperform the state of the art by one order of magnitude.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Maximum entropy (Maxent) models are a class of statistical models that use
the maximum entropy principle to estimate probability distributions from data.
Due to the size of modern data sets, Maxent models need efficient optimization
algorithms to scale well for big data applications. State-of-the-art algorithms
for Maxent models, however, were not originally designed to handle big data
sets; these algorithms either rely on technical devices that may yield
unreliable numerical results, scale poorly, or require smoothness assumptions
that many practical Maxent models lack. In this paper, we present novel
optimization algorithms that overcome the shortcomings of state-of-the-art
algorithms for training large-scale, non-smooth Maxent models. Our proposed
first-order algorithms leverage the Kullback-Leibler divergence to train
large-scale and non-smooth Maxent models efficiently. For Maxent models with
a discrete probability distribution over $n$ elements built from samples, each
containing $m$ features, estimating the stepsize parameters and computing each
iteration of our algorithms require on the order of $O(mn)$ operations and can
be trivially parallelized. Moreover, the strong $\ell_{1}$ convexity of the
Kullback-Leibler divergence allows for larger stepsize parameters, thereby
speeding up the convergence rate of our algorithms. To illustrate the
efficiency of our novel algorithms, we consider the problem of estimating
probabilities of fire occurrences as a function of ecological features in the
Western US MTBS-Interagency wildfire data set. Our numerical results show that
our algorithms outperform the state of the art by one order of magnitude and
yield results that agree with physical models of wildfire occurrence and
previous statistical analyses of wildfire drivers.
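To make the setting concrete, below is a minimal sketch of a generic first-order, proximal-gradient solver for an $\ell_1$-regularized (hence non-smooth) Maxent model over $n$ discrete elements with $m$ features. This is not the authors' KL-divergence-based algorithm; the feature matrix, variable names, and step size are illustrative assumptions. It does, however, show the $O(mn)$ per-iteration cost discussed in the abstract.

```python
import numpy as np

def train_l1_maxent(F, feat_means, lam=0.1, step=1.0, n_iters=500):
    """Proximal-gradient (ISTA) sketch for an l1-regularized Maxent model.

    F          : (n, m) feature matrix over the n discrete elements.
    feat_means : length-m vector of empirical feature averages from the samples.
    step       : should be set from the Lipschitz constant of the smooth part.
    Each iteration costs O(mn): two matrix-vector products plus an O(m)
    soft-thresholding step, and both products parallelize trivially.
    """
    n, m = F.shape
    w = np.zeros(m)
    for _ in range(n_iters):
        # Model distribution p_w(k) proportional to exp(F[k] @ w) over n elements.
        logits = F @ w
        logits -= logits.max()                 # numerical stability
        p = np.exp(logits)
        p /= p.sum()
        # Gradient of the smooth part: model expectations minus empirical ones.
        grad = F.T @ p - feat_means
        # Soft-thresholding handles the non-smooth l1 penalty.
        w = w - step * grad
        w = np.sign(w) * np.maximum(np.abs(w) - step * lam, 0.0)
    return w
```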
Related papers
- Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems.
Such problems are encountered in medicine, physics, and machine learning.
We provide convergence guarantees for the proposed algorithm under both types of noise.
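For context, a common primitive behind such gradient-free methods is the two-point (finite-difference) gradient estimator sketched below; the cited paper's accelerated scheme and noise models are more involved, and the function and parameters here are illustrative assumptions.

```python
import numpy as np

def zo_two_point_grad(f, x, mu=1e-3, rng=None):
    """Two-point zeroth-order gradient estimate of f at x.

    Uses only two function evaluations along a random unit direction u:
    d * (f(x + mu*u) - f(x - mu*u)) / (2*mu) * u, the standard building
    block when gradients of f are unavailable.
    """
    rng = np.random.default_rng() if rng is None else rng
    u = rng.standard_normal(x.shape)
    u /= np.linalg.norm(u)
    d = x.size
    return d * (f(x + mu * u) - f(x - mu * u)) / (2.0 * mu) * u
```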
arXiv Detail & Related papers (2024-11-21T10:26:17Z)
- Probabilistic Unrolling: Scalable, Inverse-Free Maximum Likelihood Estimation for Latent Gaussian Models [69.22568644711113]
We introduce probabilistic unrolling, a method that combines Monte Carlo sampling with iterative linear solvers to circumvent matrix inversions.
Our theoretical analyses reveal that unrolling and backpropagation through the iterations of the solver can accelerate gradient estimation for maximum likelihood estimation.
In experiments on simulated and real data, we demonstrate that probabilistic unrolling learns latent Gaussian models up to an order of magnitude faster than gradient EM, with minimal losses in model performance.
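The inverse-free ingredient can be illustrated with a plain conjugate-gradient solve in place of an explicit matrix inverse; the Monte Carlo sampling and unrolled backpropagation of the cited method are not shown, and the matrix and sizes below are illustrative assumptions.

```python
import numpy as np
from scipy.sparse.linalg import cg

rng = np.random.default_rng(0)
M = rng.standard_normal((200, 200))
A = M @ M.T + 200 * np.eye(200)     # symmetric positive definite system
b = rng.standard_normal(200)

# Iteratively solve A x = b instead of forming A^{-1}; the solver's
# iterations can themselves be unrolled and differentiated through.
x, info = cg(A, b, maxiter=100)
assert info == 0, "CG did not converge within maxiter iterations"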
arXiv Detail & Related papers (2023-06-05T21:08:34Z)
- Efficient and Accurate Learning of Mixtures of Plackett-Luce Models [5.216020588360421]
Mixtures of Plackett-Luce (PL) models are an active research area of both theoretical and practical significance.
We propose an algorithm that can provide a provably accurate initial estimate and an EM algorithm that maximizes the true log-likelihood function efficiently.
arXiv Detail & Related papers (2023-02-10T16:00:40Z)
- Minimax Optimal Quantization of Linear Models: Information-Theoretic Limits and Efficient Algorithms [59.724977092582535]
We consider the problem of quantizing a linear model learned from measurements.
We derive an information-theoretic lower bound for the minimax risk under this setting.
We show that our method and upper-bounds can be extended for two-layer ReLU neural networks.
arXiv Detail & Related papers (2022-02-23T02:39:04Z)
- Improving Computational Complexity in Statistical Models with Second-Order Information [32.64382185990981]
We study the normalized gradient descent (NormGD) algorithm for solving parameter estimation in parametric statistical models.
We demonstrate that the NormGD algorithm achieves the optimal overall computational complexity $\mathcal{O}(n)$ to reach the final statistical radius.
This computational complexity is cheaper than that of the fixed step-size gradient descent algorithm.
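For orientation, a minimal sketch of a generic normalized gradient descent step is given below; the cited paper's specific step-size scaling and statistical analysis, which exploit second-order information, are not captured here, and the names are illustrative assumptions.

```python
import numpy as np

def normgd_step(theta, grad, eta=0.1, eps=1e-12):
    """One normalized gradient descent step: move along the gradient
    direction with a step length controlled by eta rather than by the
    raw gradient magnitude."""
    return theta - eta * grad / (np.linalg.norm(grad) + eps)
```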
arXiv Detail & Related papers (2022-02-09T01:32:50Z)
- Slice Sampling for General Completely Random Measures [74.24975039689893]
We present a novel Markov chain Monte Carlo algorithm for posterior inference that adaptively sets the truncation level using auxiliary slice variables.
The efficacy of the proposed algorithm is evaluated on several popular nonparametric models.
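As background, the basic stepping-out/shrinkage slice sampler for a univariate log-density is sketched below; the cited paper's auxiliary slice variables for adaptive truncation of completely random measures go well beyond this, and the target density here is an illustrative assumption.

```python
import numpy as np

def slice_sample(logp, x0, n_samples=1000, width=1.0, rng=None):
    """Basic univariate slice sampler (stepping-out and shrinkage)."""
    rng = np.random.default_rng() if rng is None else rng
    samples, x = [], x0
    for _ in range(n_samples):
        log_y = logp(x) + np.log(rng.uniform())   # slice level under the density
        left = x - width * rng.uniform()          # random initial bracket
        right = left + width
        while logp(left) > log_y:                 # step out until outside slice
            left -= width
        while logp(right) > log_y:
            right += width
        while True:                               # sample and shrink the bracket
            x_new = rng.uniform(left, right)
            if logp(x_new) > log_y:
                x = x_new
                break
            if x_new < x:
                left = x_new
            else:
                right = x_new
        samples.append(x)
    return np.array(samples)

# Example: sample from a standard normal via its log-density.
draws = slice_sample(lambda z: -0.5 * z * z, x0=0.0, n_samples=2000)
```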
arXiv Detail & Related papers (2020-06-24T17:53:53Z)
- Maximum Entropy Model Rollouts: Fast Model Based Policy Optimization without Compounding Errors [10.906666680425754]
We propose a Dyna-style model-based reinforcement learning algorithm, which we call Maximum Entropy Model Rollouts (MEMR).
To eliminate the compounding errors, we only use our model to generate single-step rollouts.
arXiv Detail & Related papers (2020-06-08T21:38:15Z)
- Breaking the Sample Size Barrier in Model-Based Reinforcement Learning with a Generative Model [50.38446482252857]
This paper is concerned with the sample efficiency of reinforcement learning, assuming access to a generative model (or simulator).
We first consider $\gamma$-discounted infinite-horizon Markov decision processes (MDPs) with state space $\mathcal{S}$ and action space $\mathcal{A}$.
We prove that a plain model-based planning algorithm suffices to achieve minimax-optimal sample complexity given any target accuracy level.
arXiv Detail & Related papers (2020-05-26T17:53:18Z)
- On a computationally-scalable sparse formulation of the multidimensional and non-stationary maximum entropy principle [0.0]
We derive a non-stationary formulation of the MaxEnt-principle and show that its solution can be approximated through a numerical maximisation of the sparse constrained optimization problem with regularization.
We show that all of the considered seven major financial benchmark time series are better described by conditionally memoryless MaxEnt-models.
This analysis also reveals a sparse network of statistically-significant temporal relations for the positive and negative latent variance changes among different markets.
arXiv Detail & Related papers (2020-05-07T05:22:46Z)
- Learning Gaussian Graphical Models via Multiplicative Weights [54.252053139374205]
We adapt an algorithm of Klivans and Meka based on the method of multiplicative weight updates.
The algorithm enjoys a sample complexity bound that is qualitatively similar to others in the literature.
It has a low runtime $O(mp^2)$ in the case of $m$ samples and $p$ nodes, and can trivially be implemented in an online manner.
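The underlying primitive is the classical multiplicative-weights update shown below; how the cited work adapts it to recover a Gaussian graphical model's structure is not reproduced here, and the variable names are illustrative assumptions.

```python
import numpy as np

def multiplicative_weights_update(weights, losses, eta=0.5):
    """One multiplicative-weights step: exponentially down-weight each expert
    by its observed loss, then renormalize to keep a probability vector."""
    w = weights * np.exp(-eta * np.asarray(losses))
    return w / w.sum()

# Example: three experts, the second incurred the largest loss this round.
w = multiplicative_weights_update(np.array([1/3, 1/3, 1/3]), [0.1, 0.9, 0.3])
```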
arXiv Detail & Related papers (2020-02-20T10:50:58Z)
- Expected Information Maximization: Using the I-Projection for Mixture Density Estimation [22.096148237257644]
Modelling highly multi-modal data is a challenging problem in machine learning.
We present a new algorithm called Expected Information Maximization (EIM) for computing the I-projection.
We show that our algorithm is much more effective in computing the I-projection than recent GAN approaches.
arXiv Detail & Related papers (2020-01-23T17:24:50Z)