CALT: A Library for Computer Algebra with Transformer
- URL: http://arxiv.org/abs/2506.08600v1
- Date: Tue, 10 Jun 2025 09:06:41 GMT
- Title: CALT: A Library for Computer Algebra with Transformer
- Authors: Hiroshi Kera, Shun Arakawa, Yuta Sato,
- Abstract summary: We introduce Computer Algebra with Transformer (CALT), a user-friendly Python library designed to help non-experts in deep learning train models for symbolic computation tasks.
- Score: 4.069144210024564
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in artificial intelligence have demonstrated the learnability of symbolic computation through end-to-end deep learning. Given a sufficient number of examples of symbolic expressions before and after the target computation, Transformer models - highly effective learners of sequence-to-sequence functions - can be trained to emulate the computation. This development opens up several intriguing challenges and new research directions, which require active contributions from the symbolic computation community. In this work, we introduce Computer Algebra with Transformer (CALT), a user-friendly Python library designed to help non-experts in deep learning train models for symbolic computation tasks.
Related papers
- Algorithmic Capabilities of Random Transformers [49.73113518329544]
We investigate what functions can be learned by randomly transformers in which only the embedding layers are optimized.
We find that these random transformers can perform a wide range of meaningful algorithmic tasks.
Our results indicate that some algorithmic capabilities are present in transformers even before these models are trained.
arXiv Detail & Related papers (2024-10-06T06:04:23Z) - On the Power of Decision Trees in Auto-Regressive Language Modeling [21.97150778373607]
Auto-regressive Decision Trees (ARDTs) have not yet been explored for language modeling.
This paper delves into both the theoretical and practical applications of ARDTs in this new context.
We show that ARDTs can compute complex functions, such as simulating automata, Turing machines, and sparse circuits.
arXiv Detail & Related papers (2024-09-27T21:25:00Z) - Enhancing Length Extrapolation in Sequential Models with Pointer-Augmented Neural Memory [66.88278207591294]
We propose Pointer-Augmented Neural Memory (PANM) to help neural networks understand and apply symbol processing to new, longer sequences of data.
PANM integrates an external neural memory that uses novel physical addresses and pointer manipulation techniques to mimic human and computer symbol processing abilities.
arXiv Detail & Related papers (2024-04-18T03:03:46Z) - Auto-Regressive Next-Token Predictors are Universal Learners [17.416520406390415]
We show that even simple models such as linear next-token predictors can approximate any function efficiently computed by a Turing machine.
We also show experimentally that simple next-token predictors, such as linear networks and shallow Multi-Layer Perceptrons (MLPs), display non-trivial performance on text generation and arithmetic tasks.
arXiv Detail & Related papers (2023-09-13T14:15:03Z) - Can Transformers Learn to Solve Problems Recursively? [9.5623664764386]
This paper examines the behavior of neural networks learning algorithms relevant to programs and formal verification.
By reconstructing these algorithms, we are able to correctly predict 91 percent of failure cases for one of the approximated functions.
arXiv Detail & Related papers (2023-05-24T04:08:37Z) - Explainable AI Insights for Symbolic Computation: A case study on
selecting the variable ordering for cylindrical algebraic decomposition [0.0]
This paper explores whether using explainable AI (XAI) techniques on such machine learning models can offer new insight for symbolic computation.
We present a case study on the use of ML to select the variable ordering for cylindrical algebraic decomposition.
arXiv Detail & Related papers (2023-04-24T15:05:04Z) - Transformers Learn Shortcuts to Automata [52.015990420075944]
We find that a low-depth Transformer can represent the computations of any finite-state automaton.
We show that a Transformer with $O(log T)$ layers can exactly replicate the computation of an automaton on an input sequence of length $T$.
We further investigate the brittleness of these solutions and propose potential mitigations.
arXiv Detail & Related papers (2022-10-19T17:45:48Z) - Thinking Like Transformers [64.96770952820691]
We propose a computational model for the transformer-encoder in the form of a programming language.
We show how RASP can be used to program solutions to tasks that could conceivably be learned by a Transformer.
We provide RASP programs for histograms, sorting, and Dyck-languages.
arXiv Detail & Related papers (2021-06-13T13:04:46Z) - Recognizing and Verifying Mathematical Equations using Multiplicative
Differential Neural Units [86.9207811656179]
We show that memory-augmented neural networks (NNs) can achieve higher-order, memory-augmented extrapolation, stable performance, and faster convergence.
Our models achieve a 1.53% average improvement over current state-of-the-art methods in equation verification and achieve a 2.22% Top-1 average accuracy and 2.96% Top-5 average accuracy for equation completion.
arXiv Detail & Related papers (2021-04-07T03:50:11Z) - One-step regression and classification with crosspoint resistive memory
arrays [62.997667081978825]
High speed, low energy computing machines are in demand to enable real-time artificial intelligence at the edge.
One-step learning is supported by simulations of the prediction of the cost of a house in Boston and the training of a 2-layer neural network for MNIST digit recognition.
Results are all obtained in one computational step, thanks to the physical, parallel, and analog computing within the crosspoint array.
arXiv Detail & Related papers (2020-05-05T08:00:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.