Branches of a Tree: Taking Derivatives of Programs with Discrete and
Branching Randomness in High Energy Physics
- URL: http://arxiv.org/abs/2308.16680v1
- Date: Thu, 31 Aug 2023 12:32:34 GMT
- Title: Branches of a Tree: Taking Derivatives of Programs with Discrete and
Branching Randomness in High Energy Physics
- Authors: Michael Kagan and Lukas Heinrich
- Abstract summary: We discuss several possible gradient estimation strategies, including the recent Stochastic AD method, and compare them in simplified detector design experiments.
In doing so we develop, to the best of our knowledge, the first fully differentiable branching program.
- Score: 1.0587959762260988
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose to apply several gradient estimation techniques to enable the
differentiation of programs with discrete randomness in High Energy Physics.
Such programs are common in High Energy Physics due to the presence of
branching processes and clustering-based analysis. Thus differentiating such
programs can open the way for gradient based optimization in the context of
detector design optimization, simulator tuning, or data analysis and
reconstruction optimization. We discuss several possible gradient estimation
strategies, including the recent Stochastic AD method, and compare them in
simplified detector design experiments. In doing so we develop, to the best of
our knowledge, the first fully differentiable branching program.
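
As a rough illustration of the kind of gradient estimation the abstract refers to, the sketch below applies a score-function (REINFORCE) estimator to a toy program with a single Bernoulli branch whose probability depends on a parameter. The sigmoid parameterization, the per-branch observables (3.0 and 1.0), and the function names are illustrative assumptions for this sketch only, not the paper's detector model or its Stochastic AD implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def branching_program(theta, u):
    """Toy program with discrete randomness: a 'particle' branches with
    probability sigmoid(theta), and the observable depends on which branch
    is taken. (Illustrative observables, not the paper's detector simulation.)"""
    p = 1.0 / (1.0 + np.exp(-theta))           # branching probability
    branch = (u < p).astype(float)             # discrete Bernoulli decision
    return np.where(branch == 1.0, 3.0, 1.0)   # observable per branch

def score_function_gradient(theta, n_samples=100_000):
    """Score-function (REINFORCE) estimate of d E[f(x)] / d theta,
    i.e. E[ f(x) * d log p(x; theta) / d theta ]."""
    p = 1.0 / (1.0 + np.exp(-theta))
    u = rng.random(n_samples)
    f = branching_program(theta, u)
    # d log Bernoulli(branch; p=sigmoid(theta)) / d theta = branch - p
    dlogp = (u < p).astype(float) - p
    return float(np.mean(f * dlogp))

theta = 0.3
p = 1.0 / (1.0 + np.exp(-theta))
print("score-function estimate:", score_function_gradient(theta))
# analytic check: E[f] = 3p + (1-p) = 1 + 2p, so dE/dtheta = 2 p (1-p)
print("analytic gradient      :", 2.0 * p * (1.0 - p))
```

The score-function estimator is unbiased but can suffer from high variance; lower-variance alternatives such as the Stochastic AD method mentioned above are among the strategies the paper compares in its simplified detector design experiments.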
Related papers
- Stochastic Gradient Descent for Gaussian Processes Done Right [86.83678041846971]
We show that when done right -- by which we mean using specific insights from the optimisation and kernel communities -- gradient descent is highly effective.
We introduce a stochastic dual descent algorithm, explain its design in an intuitive manner and illustrate the design choices.
Our method places Gaussian process regression on par with state-of-the-art graph neural networks for molecular binding affinity prediction.
arXiv Detail & Related papers (2023-10-31T16:15:13Z) - Amortized Implicit Differentiation for Stochastic Bilevel Optimization [53.12363770169761]
We study a class of algorithms for solving bilevel optimization problems in both deterministic and stochastic settings.
We exploit a warm-start strategy to amortize the estimation of the exact gradient.
By using this framework, our analysis shows these algorithms to match the computational complexity of methods that have access to an unbiased estimate of the gradient.
arXiv Detail & Related papers (2021-11-29T15:10:09Z) - A theoretical and empirical study of new adaptive algorithms with
additional momentum steps and shifted updates for stochastic non-convex
optimization [0.0]
It is thought that adaptive optimization algorithms represent a key pillar behind the success of deep learning.
In this paper we introduce adaptive momentum techniques for different non-smooth objective problems.
arXiv Detail & Related papers (2021-10-16T09:47:57Z) - Evolutionary Algorithms in Approximate Computing: A Survey [0.0]
This paper deals with evolutionary approximation as one of the popular approximation methods.
The paper provides the first survey of evolutionary approximation (EA)-based approaches applied in the context of approximate computing.
arXiv Detail & Related papers (2021-08-16T10:17:26Z) - Fractal Structure and Generalization Properties of Stochastic
Optimization Algorithms [71.62575565990502]
We prove that the generalization error of an optimization algorithm can be bounded by the 'complexity' of the fractal structure that underlies its invariant measure.
We further specialize our results to specific problems (e.g., linear/logistic regression, one-hidden-layer neural networks) and algorithms.
arXiv Detail & Related papers (2021-06-09T08:05:36Z) - First-Order Methods for Convex Optimization [2.578242050187029]
First-order methods have the potential to provide low accuracy solutions at low computational complexity.
We give complete proofs for various key results, and highlight the unifying aspects of several optimization algorithms.
arXiv Detail & Related papers (2021-01-04T13:03:38Z) - Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box
Optimization Framework [100.36569795440889]
This work focuses on zeroth-order (ZO) optimization, which does not require first-order information.
We show that with a graceful design in coordinate importance sampling, the proposed ZO optimization method is efficient both in terms of complexity and function query cost.
arXiv Detail & Related papers (2020-12-21T17:29:58Z) - Bilevel Optimization: Convergence Analysis and Enhanced Design [63.64636047748605]
Bilevel optimization is a tool for many machine learning problems.
We propose a novel stochastic gradient estimator named stocBiO.
arXiv Detail & Related papers (2020-10-15T18:09:48Z) - A Dynamical Systems Approach for Convergence of the Bayesian EM
Algorithm [59.99439951055238]
We show how (discrete-time) Lyapunov stability theory can serve as a powerful tool to aid, or even lead, in the analysis (and potential design) of optimization algorithms that are not necessarily gradient-based.
The particular ML problem that this paper focuses on is that of parameter estimation in an incomplete-data Bayesian framework via the popular optimization algorithm known as maximum a posteriori expectation-maximization (MAP-EM).
We show that fast convergence (linear or quadratic) is achieved, which could have been difficult to unveil without our adopted S&C approach.
arXiv Detail & Related papers (2020-06-23T01:34:18Z)