Simplifying Polylogarithms with Machine Learning
- URL: http://arxiv.org/abs/2206.04115v1
- Date: Wed, 8 Jun 2022 18:20:21 GMT
- Title: Simplifying Polylogarithms with Machine Learning
- Authors: Aurélien Dersy, Matthew D. Schwartz, Xiaoyuan Zhang
- Abstract summary: In many calculations relevant to particle physics, complicated combinations of polylogarithms often arise from Feynman integrals.
We consider both a reinforcement learning approach, where the identities are analogous to moves in a game, and a transformer network approach, where the problem is viewed analogously to a language-translation task.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Polylogarithmic functions, such as the logarithm or dilogarithm, satisfy a
number of algebraic identities. For the logarithm, all the identities follow
from the product rule. For the dilogarithm and higher-weight classical
polylogarithms, the identities can involve five functions or more. In many
calculations relevant to particle physics, complicated combinations of
polylogarithms often arise from Feynman integrals. Although the initial
expressions resulting from the integration usually simplify, it is often
difficult to know which identities to apply and in what order. To address this
bottleneck, we explore to what extent machine learning methods can help. We
consider both a reinforcement learning approach, where the identities are
analogous to moves in a game, and a transformer network approach, where the
problem is viewed analogously to a language-translation task. While both
methods are effective, the transformer network appears more powerful and holds
promise for practical use in symbolic manipulation tasks in mathematical
physics.
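As a concrete illustration of the kind of identity such a system must learn to apply, the sketch below is not code from the paper: it assumes mpmath and sympy are available, and the helper names reflection_residual and to_prefix are hypothetical. It numerically checks the dilogarithm reflection identity Li2(x) + Li2(1-x) = pi^2/6 - log(x) log(1-x) and shows a simple prefix-token encoding of the sort typically used when simplification is treated as a sequence-to-sequence translation problem.

```python
# A minimal sketch (not from the paper): numerically check a classic
# dilogarithm identity and encode an expression as prefix tokens, the
# usual input format when simplification is framed as translation.
from mpmath import mp, polylog, log, pi
import sympy as sp

mp.dps = 30  # working precision in decimal digits

def reflection_residual(x):
    """Li2(x) + Li2(1-x) - (pi^2/6 - log(x)*log(1-x)); ~0 for 0 < x < 1."""
    return polylog(2, x) + polylog(2, 1 - x) - (pi**2 / 6 - log(x) * log(1 - x))

for x in (0.1, 0.5, 0.9):
    print(x, reflection_residual(x))  # residuals vanish to working precision

def to_prefix(expr):
    """Flatten a sympy expression into prefix-order tokens (hypothetical encoding)."""
    if expr.is_Atom:
        return [str(expr)]
    return [expr.func.__name__] + [tok for arg in expr.args for tok in to_prefix(arg)]

x = sp.Symbol("x")
expr = sp.polylog(2, x) + sp.polylog(2, 1 - x)
print(to_prefix(expr))  # e.g. ['Add', 'polylog', ...] ready for a seq2seq model
```

In the paper's framing, a reinforcement-learning agent would apply identities such as this reflection relation as discrete moves, while the transformer maps the tokenized input expression directly to a simplified target expression.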
Related papers
- Learning Linear Attention in Polynomial Time [115.68795790532289]
We provide the first results on learnability of single-layer Transformers with linear attention.
We show that linear attention may be viewed as a linear predictor in a suitably defined RKHS.
We show how to efficiently identify training datasets for which every empirical risk minimizer is equivalent to the linear Transformer.
arXiv Detail & Related papers (2024-10-14T02:41:01Z)
- Interpreting Grokked Transformers in Complex Modular Arithmetic [31.78132974646383]
We observe the internal circuits learned through grokking in complex modular arithmetic via interpretable reverse engineering.
Our empirical analysis emphasizes the importance of holistic evaluation among various combinations.
arXiv Detail & Related papers (2024-02-26T16:48:12Z)
- Transformers, parallel computation, and logarithmic depth [33.659870765923884]
We show that a constant number of self-attention layers can efficiently simulate, and be simulated by, a constant number of communication rounds of Massively Parallel Computation.
arXiv Detail & Related papers (2024-02-14T15:54:55Z)
- Symbolic Equation Solving via Reinforcement Learning [9.361474110798143]
We propose a novel deep-learning interface involving a reinforcement-learning agent that operates a symbolic stack calculator.
By construction, this system is capable of exact transformations and immune to hallucination.
arXiv Detail & Related papers (2024-01-24T13:42:24Z)
- CoLA: Exploiting Compositional Structure for Automatic and Efficient Numerical Linear Algebra [62.37017125812101]
We propose a simple but general framework for large-scale linear algebra problems in machine learning, named CoLA.
By combining a linear operator abstraction with compositional dispatch rules, CoLA automatically constructs memory and runtime efficient numerical algorithms.
We showcase its efficacy across a broad range of applications, including partial differential equations, Gaussian processes, equivariant model construction, and unsupervised learning.
arXiv Detail & Related papers (2023-09-06T14:59:38Z)
- Linearity of Relation Decoding in Transformer Language Models [82.47019600662874]
Much of the knowledge encoded in transformer language models (LMs) may be expressed in terms of relations.
We show that, for a subset of relations, this computation is well-approximated by a single linear transformation on the subject representation.
arXiv Detail & Related papers (2023-08-17T17:59:19Z)
- How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding [56.222097640468306]
We provide mechanistic understanding of how transformers learn "semantic structure"
We show, through a combination of mathematical analysis and experiments on Wikipedia data, that the embedding layer and the self-attention layer encode the topical structure.
arXiv Detail & Related papers (2023-03-07T21:42:17Z)
- Multiset Signal Processing and Electronics [1.0152838128195467]
Multisets are intuitive extensions of the traditional concept of sets that allow repetition of elements.
Recent generalizations of multisets to real-valued functions have paved the way to a number of interesting implications and applications.
It is proposed that effective multiset operations capable of high-performance self- and cross-correlation can be obtained with relative simplicity in either discrete or integrated circuits.
arXiv Detail & Related papers (2021-11-13T11:50:00Z)
- Statistically Meaningful Approximation: a Case Study on Approximating Turing Machines with Transformers [50.85524803885483]
This work proposes a formal definition of statistically meaningful (SM) approximation which requires the approximating network to exhibit good statistical learnability.
We study SM approximation for two function classes: circuits and Turing machines.
arXiv Detail & Related papers (2021-07-28T04:28:55Z)
- Analyzing the Nuances of Transformers' Polynomial Simplification Abilities [11.552059052724907]
We show that Transformers consistently struggle with numeric multiplication.
We explore two ways to mitigate this: Learning Curriculum and a Symbolic Calculator approach.
Both approaches provide significant gains over the vanilla Transformers-based baseline.
arXiv Detail & Related papers (2021-04-29T03:52:46Z)
- Recognizing and Verifying Mathematical Equations using Multiplicative Differential Neural Units [86.9207811656179]
We show that memory-augmented neural networks (NNs) can achieve higher-order, memory-augmented extrapolation, stable performance, and faster convergence.
Our models achieve a 1.53% average improvement over current state-of-the-art methods in equation verification and achieve a 2.22% Top-1 average accuracy and 2.96% Top-5 average accuracy for equation completion.
arXiv Detail & Related papers (2021-04-07T03:50:11Z)