Simplifying Polylogarithms with Machine Learning
- URL: http://arxiv.org/abs/2206.04115v1
- Date: Wed, 8 Jun 2022 18:20:21 GMT
- Title: Simplifying Polylogarithms with Machine Learning
- Authors: Aurélien Dersy, Matthew D. Schwartz, Xiaoyuan Zhang
- Abstract summary: In many calculations relevant to particle physics, complicated combinations of polylogarithms often arise from Feynman integrals.
We consider both a reinforcement learning approach, where the identities are analogous to moves in a game, and a transformer network approach, where the problem is viewed analogously to a language-translation task.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Polylogarithmic functions, such as the logarithm or dilogarithm, satisfy a
number of algebraic identities. For the logarithm, all the identities follow
from the product rule. For the dilogarithm and higher-weight classical
polylogarithms, the identities can involve five functions or more. In many
calculations relevant to particle physics, complicated combinations of
polylogarithms often arise from Feynman integrals. Although the initial
expressions resulting from the integration usually simplify, it is often
difficult to know which identities to apply and in what order. To address this
bottleneck, we explore to what extent machine learning methods can help. We
consider both a reinforcement learning approach, where the identities are
analogous to moves in a game, and a transformer network approach, where the
problem is viewed analogously to a language-translation task. While both
methods are effective, the transformer network appears more powerful and holds
promise for practical use in symbolic manipulation tasks in mathematical
physics.
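As an illustration of the kind of identity the abstract refers to, the sketch below numerically checks two standard dilogarithm identities: the reflection identity Li2(x) + Li2(1-x) = pi^2/6 - ln(x) ln(1-x) and the duplication identity Li2(x) + Li2(-x) = Li2(x^2)/2. It is a minimal sketch and not the authors' code; the function names (li2, reflection_residual, duplication_residual) are hypothetical, and the dilogarithm is evaluated with a truncated power series that is only used here for |x| < 1.

```python
import math


def li2(x, terms=400):
    """Dilogarithm Li2(x) from its power series sum_{k>=1} x^k / k^2.

    Assumption for this sketch: only |x| < 1 is supported, where the
    truncated series converges quickly enough for a numerical check.
    """
    if abs(x) >= 1:
        raise ValueError("this sketch only supports |x| < 1")
    return sum(x**k / k**2 for k in range(1, terms + 1))


def reflection_residual(x):
    """Left side minus right side of Li2(x) + Li2(1-x) = pi^2/6 - ln(x) ln(1-x)."""
    rhs = math.pi**2 / 6 - math.log(x) * math.log(1 - x)
    return li2(x) + li2(1 - x) - rhs


def duplication_residual(x):
    """Left side minus right side of Li2(x) + Li2(-x) = Li2(x^2) / 2."""
    return li2(x) + li2(-x) - 0.5 * li2(x * x)


if __name__ == "__main__":
    # Both residuals should vanish up to floating-point rounding.
    for x in (0.2, 0.3, 0.5):
        print(x, reflection_residual(x), duplication_residual(x))
```

Read right to left, each identity replaces two dilogarithms by at most one plus elementary terms, which is the sense in which applying identities simplifies an expression; choosing which identity to apply, and in what order, is the search problem the abstract describes.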
Related papers
- Learning Linear Attention in Polynomial Time [115.68795790532289]
We provide the first results on learnability of single-layer Transformers with linear attention.
We show that linear attention may be viewed as a linear predictor in a suitably defined RKHS.
We show how to efficiently identify training datasets for which every empirical risk minimizer is equivalent to the linear Transformer.
arXiv Detail & Related papers (2024-10-14T02:41:01Z) - Towards Empirical Interpretation of Internal Circuits and Properties in Grokked Transformers on Modular Polynomials [29.09237503747052]
Grokking on modular addition has been known to implement Fourier representation and its calculation circuits with trigonometric identities in Transformers.
We show that the transferability among the models grokked with each operation can be only limited to specific combinations.
Some multi-task mixtures may lead to co-grokking, where grokking simultaneously happens for all the tasks.
arXiv Detail & Related papers (2024-02-26T16:48:12Z) - Transformers, parallel computation, and logarithmic depth [33.659870765923884]
We show that a constant number of self-attention layers can efficiently simulate, and be simulated by, a constant number of communication rounds of Massively Parallel Computation.
arXiv Detail & Related papers (2024-02-14T15:54:55Z) - Symbolic Equation Solving via Reinforcement Learning [9.361474110798143]
We propose a novel deep-learning interface involving a reinforcement-learning agent that operates a symbolic stack calculator.
By construction, this system is capable of exact transformations and immune to hallucination.
arXiv Detail & Related papers (2024-01-24T13:42:24Z) - CoLA: Exploiting Compositional Structure for Automatic and Efficient Numerical Linear Algebra [62.37017125812101]
We propose a simple but general framework for large-scale linear algebra problems in machine learning, named CoLA.
By combining a linear operator abstraction with compositional dispatch rules, CoLA automatically constructs memory and runtime efficient numerical algorithms.
We showcase its efficacy across a broad range of applications, including partial differential equations, Gaussian processes, equivariant model construction, and unsupervised learning.
arXiv Detail & Related papers (2023-09-06T14:59:38Z) - Linearity of Relation Decoding in Transformer Language Models [82.47019600662874]
Much of the knowledge encoded in transformer language models (LMs) may be expressed in terms of relations.
We show that, for a subset of relations, this computation is well-approximated by a single linear transformation on the subject representation.
arXiv Detail & Related papers (2023-08-17T17:59:19Z) - How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding [56.222097640468306]
We provide a mechanistic understanding of how transformers learn "semantic structure".
We show, through a combination of mathematical analysis and experiments on Wikipedia data, that the embedding layer and the self-attention layer encode the topical structure.
arXiv Detail & Related papers (2023-03-07T21:42:17Z) - Multiset Signal Processing and Electronics [1.0152838128195467]
Multisets are intuitive extensions of the traditional concept of sets that allow repetition of elements.
Recent generalizations of multisets to real-valued functions have paved the way to a number of interesting implications and applications.
It is proposed that effective multiset operations capable of high performance self and cross-correlation can be obtained with relative simplicity in either discrete or integrated circuits.
arXiv Detail & Related papers (2021-11-13T11:50:00Z) - Statistically Meaningful Approximation: a Case Study on Approximating Turing Machines with Transformers [50.85524803885483]
This work proposes a formal definition of statistically meaningful (SM) approximation which requires the approximating network to exhibit good statistical learnability.
We study SM approximation for two function classes: circuits and Turing machines.
arXiv Detail & Related papers (2021-07-28T04:28:55Z) - Analyzing the Nuances of Transformers' Polynomial Simplification Abilities [11.552059052724907]
We show that Transformers consistently struggle with numeric multiplication.
We explore two ways to mitigate this: Learning Curriculum and a Symbolic Calculator approach.
Both approaches provide significant gains over the vanilla Transformers-based baseline.
arXiv Detail & Related papers (2021-04-29T03:52:46Z) - Recognizing and Verifying Mathematical Equations using Multiplicative Differential Neural Units [86.9207811656179]
We show that memory-augmented neural networks (NNs) can achieve higher-order, memory-augmented extrapolation, stable performance, and faster convergence.
Our models achieve a 1.53% average improvement over current state-of-the-art methods in equation verification and achieve a 2.22% Top-1 average accuracy and 2.96% Top-5 average accuracy for equation completion.
arXiv Detail & Related papers (2021-04-07T03:50:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.