Banach-Tarski Embeddings and Transformers
- URL: http://arxiv.org/abs/2311.09387v2
- Date: Tue, 21 Nov 2023 18:31:57 GMT
- Title: Banach-Tarski Embeddings and Transformers
- Authors: Joshua Maher
- Abstract summary: We introduce a new construction of embeddings of arbitrary data structures into high dimensional vectors.
These embeddings provide an interpretable model for the latent state vectors of transformers.
We show that these embeddings can be decoded to the original data structure when the embedding dimension is sufficiently large.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a new construction of embeddings of arbitrary recursive data
structures into high dimensional vectors. These embeddings provide an
interpretable model for the latent state vectors of transformers. We
demonstrate that these embeddings can be decoded to the original data structure
when the embedding dimension is sufficiently large. This decoding algorithm has
a natural implementation as a transformer. We also show that these embedding
vectors can be manipulated directly to perform computations on the underlying
data without decoding. As an example, we present an algorithm that constructs
the embedded parse tree of an embedded token sequence using only vector
operations in embedding space.
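The abstract does not give the construction itself, so the following is only a minimal sketch of the general idea, assuming random orthogonal matrices for the child slots and inner-product decoding; the binary-tree format and all names are illustrative, not the paper's method. For large dimension D the slot maps and label vectors are nearly mutually orthogonal, which is what lets a sum of terms be pulled apart again.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 4096  # embedding dimension; decoding needs D sufficiently large

# One random orthogonal map per child slot, one random unit vector per label.
SLOT = {s: np.linalg.qr(rng.standard_normal((D, D)))[0] for s in ("left", "right")}
LABEL = {c: rng.standard_normal(D) / np.sqrt(D) for c in "abcde"}

def embed(tree):
    """tree is a leaf label (str) or a (label, left, right) triple."""
    if isinstance(tree, str):
        return LABEL[tree]
    label, left, right = tree
    return LABEL[label] + SLOT["left"] @ embed(left) + SLOT["right"] @ embed(right)

def decode_label(v):
    # Nearest label vector by inner product; cross terms are O(1/sqrt(D)).
    return max(LABEL, key=lambda c: LABEL[c] @ v)

def child(v, slot):
    # An orthogonal map is inverted by its transpose.
    return SLOT[slot].T @ v

v = embed(("a", "b", ("c", "d", "e")))
print(decode_label(v))                                  # a
print(decode_label(child(child(v, "right"), "left")))   # d
```

Note that `child` never reconstructs the tree: it manipulates the embedding vector directly, in the spirit of the abstract's claim about computing on embedded data without decoding.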
Related papers
- Unifying Structured Data as Graph for Data-to-Text Pre-Training [69.96195162337793]
Data-to-text (D2T) generation aims to transform structured data into natural language text.
Data-to-text pre-training has proved to be powerful in enhancing D2T generation.
We propose a structure-enhanced pre-training method for D2T generation by designing a structure-enhanced Transformer.
arXiv Detail & Related papers (2024-01-02T12:23:49Z)
- A Compiler from Array Programs to Vectorized Homomorphic Encryption [1.6216324006136673]
Homomorphic encryption (HE) is a practical approach to secure computation over encrypted data.
We present Viaduct-HE, a compiler that generates efficient vectorized HE programs.
Viaduct-HE can generate both the operations and complex data layouts required for efficient HE programs.
arXiv Detail & Related papers (2023-11-10T16:00:00Z)
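Viaduct-HE itself is not shown here; purely as an illustration of the vectorized "slot" layouts such compilers target, here is a toy simulation in Python. No actual encryption is performed, and the class and the rotate-and-sum dot product are generic HE idioms, not Viaduct-HE's output.

```python
import numpy as np

class ToyCiphertext:
    """Stand-in for an HE ciphertext with n slots: one homomorphic
    operation acts on every slot at once, so the compiler's job is to
    lay data out so that few operations and rotations are needed."""
    def __init__(self, slots):
        self.slots = np.asarray(slots, dtype=float)
    def __add__(self, other):
        return ToyCiphertext(self.slots + other.slots)  # slot-wise add
    def __mul__(self, other):
        return ToyCiphertext(self.slots * other.slots)  # slot-wise mul
    def rotate(self, k):
        return ToyCiphertext(np.roll(self.slots, -k))   # cyclic slot shift

# Dot product of two packed vectors: one slot-wise multiply, then
# log2(n) rotate-and-add steps to sum across slots.
a, b = ToyCiphertext([1, 2, 3, 4]), ToyCiphertext([5, 6, 7, 8])
acc = a * b
for k in (2, 1):
    acc = acc + acc.rotate(k)
print(acc.slots[0])  # 70.0 = 1*5 + 2*6 + 3*7 + 4*8
```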
- Fast Machine Learning Method with Vector Embedding on Orthonormal Basis and Spectral Transform [0.0]
The paper provides examples of word embedding, text chunk embedding, and image embedding, implemented in Julia language with a vector database.
It also investigates unsupervised learning and supervised learning using this method, along with strategies for handling large data volumes.
arXiv Detail & Related papers (2023-10-27T18:48:54Z)
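The paper's Julia pipeline is not reproduced here; as a rough sketch of "vector embedding on an orthonormal basis via a spectral transform", the snippet below projects a byte sequence onto the orthonormal DFT basis. The basis choice, dimension, and normalization are assumptions for illustration only.

```python
import numpy as np

def spectral_embed(text, dim=64):
    """Embed a text chunk as leading coefficients of its (orthonormal) DFT."""
    signal = np.frombuffer(text.encode("utf-8"), dtype=np.uint8).astype(float)
    spectrum = np.fft.rfft(signal, n=2 * dim, norm="ortho")
    vec = np.concatenate([spectrum.real[: dim // 2], spectrum.imag[: dim // 2]])
    return vec / (np.linalg.norm(vec) + 1e-12)

docs = ["the cat sat on the mat", "a cat sat on a mat", "stochastic gradient descent"]
embs = np.stack([spectral_embed(d) for d in docs])
print(np.round(embs @ embs.T, 2))  # pairwise cosine similarities
```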
- Looped Transformers as Programmable Computers [48.00010456819222]
We present a framework for using transformer networks as universal computers by programming them with specific weights and placing them in a loop.
Our input sequence acts as a punchcard, consisting of instructions and memory for data read/writes.
We show how this transformer, instructed by its input, can emulate a basic calculator, a basic linear algebra library, and in-context learning algorithms that employ backpropagation.
arXiv Detail & Related papers (2023-01-30T18:57:31Z)
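The paper's actual weight constructions are not reproduced here; the sketch below only mimics the execution model the summary describes, with an ordinary Python function standing in for the fixed-weight transformer and an invented (opcode, dst, src) punchcard format.

```python
# Toy looped-execution model: the sequence holds instructions followed by
# memory cells, and a fixed step function (standing in for a fixed-weight
# transformer) is applied in a loop, reading and writing that sequence.
def step(seq, pc):
    op, dst, src = seq[pc]
    if op == "add":
        seq[dst] += seq[src]
    elif op == "mul":
        seq[dst] *= seq[src]
    return pc + 1

# Program: mem = (mem + y) * y, with mem at index 3 and y at index 4.
seq = [("add", 3, 4), ("mul", 3, 4), ("halt", 0, 0), 2.0, 5.0]
pc = 0
while seq[pc][0] != "halt":
    pc = step(seq, pc)
print(seq[3])  # 35.0 = (2 + 5) * 5
```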
- Tracr: Compiled Transformers as a Laboratory for Interpretability [15.76027393879609]
We show how to "compile" human-readable programs into decoder-only transformer models.
Our compiler, Tracr, generates models with known structure.
We use it to study "superposition" in transformers that execute multi-step algorithms.
arXiv Detail & Related papers (2023-01-12T14:59:19Z)
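Tracr's input language is RASP; the snippet below is not the tracr package's API, just a numpy rendering of the two primitives it compiles into attention layers, select (build an attention pattern) and aggregate (average the selected values), applied to the standard frac_prevs example.

```python
import numpy as np

def select(keys, queries, predicate):
    """RASP-style select: boolean attention pattern A[q, k]."""
    return np.array([[predicate(k, q) for k in keys] for q in queries], dtype=float)

def aggregate(pattern, values):
    """RASP-style aggregate: mean of the values each query attends to."""
    weights = pattern / np.maximum(pattern.sum(axis=1, keepdims=True), 1.0)
    return weights @ values

# frac_prevs: fraction of tokens so far (inclusive) equal to "x".
tokens = list("xyxxy")
pos = np.arange(len(tokens))
attn = select(pos, pos, lambda k, q: k <= q)  # causal pattern
is_x = np.array([1.0 if t == "x" else 0.0 for t in tokens])
print(aggregate(attn, is_x))  # [1.  0.5  0.667  0.75  0.6]
```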
- An Algorithm for Routing Vectors in Sequences [0.0]
We propose a routing algorithm that takes a sequence of vectors and computes a new sequence with specified length and vector size.
Each output vector maximizes "bang per bit", defined as the net benefit of using the data minus the net cost of ignoring it, by better predicting the input vectors.
arXiv Detail & Related papers (2022-11-20T16:20:45Z)
- Characterizing Intrinsic Compositionality in Transformers with Tree Projections [72.45375959893218]
Neural models like transformers can route information arbitrarily between different parts of their input.
We show that transformers for three different tasks become more treelike over the course of training.
These trees are predictive of model behavior, with more tree-like models generalizing better on tests of compositional generalization.
arXiv Detail & Related papers (2022-11-02T17:10:07Z)
- Structural Biases for Improving Transformers on Translation into Morphologically Rich Languages [120.74406230847904]
The first method, TP-Transformer, augments the traditional Transformer architecture with an additional component to represent structure.
The second imbues structure at the data level by segmenting the data with morphological tokenization.
We find that each approach allows the network to achieve better performance, but the improvement depends on the size of the dataset.
arXiv Detail & Related papers (2022-08-11T22:42:24Z)
- Glushkov's construction for functional subsequential transducers [91.3755431537592]
Glushkov's construction has many interesting properties, and they become even more evident when applied to transducers.
A special flavour of regular expressions is introduced that can be efficiently converted to $\epsilon$-free functional subsequential weighted finite-state transducers.
arXiv Detail & Related papers (2020-08-05T17:09:58Z)
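The weighted-transducer variant is beyond a short example, but the classic Glushkov construction that the paper builds on converts a regular expression into an $\epsilon$-free NFA with one state per symbol occurrence, via nullable/first/last/follow sets. Below is a minimal version for a tiny tuple-based AST (an illustrative format, not the paper's).

```python
def glushkov(ast):
    """Glushkov's construction: regex AST -> epsilon-free NFA.
    AST nodes: ("sym", c), ("cat", a, b), ("alt", a, b), ("star", a)."""
    syms, follow = [], []   # position -> symbol, position -> next positions

    def walk(node):
        kind = node[0]
        if kind == "sym":                    # fresh position for this occurrence
            i = len(syms)
            syms.append(node[1])
            follow.append(set())
            return False, {i}, {i}           # (nullable, first, last)
        if kind == "star":
            _, f, l = walk(node[1])
            for p in l:
                follow[p] |= f               # loop: last -> first
            return True, f, l
        n1, f1, l1 = walk(node[1])
        n2, f2, l2 = walk(node[2])
        if kind == "alt":
            return n1 or n2, f1 | f2, l1 | l2
        for p in l1:                         # "cat": last(a) -> first(b)
            follow[p] |= f2
        return (n1 and n2,
                f1 | (f2 if n1 else set()),
                l2 | (l1 if n2 else set()))

    nullable, first, last = walk(ast)
    return syms, follow, first, last, nullable

def matches(nfa, word):
    syms, follow, first, last, nullable = nfa
    if not word:
        return nullable
    states = {p for p in first if syms[p] == word[0]}
    for c in word[1:]:
        states = {q for p in states for q in follow[p] if syms[q] == c}
    return bool(states & last)

# (a|b)* a b : strings over {a, b} ending in "ab"
ast = ("cat",
       ("cat", ("star", ("alt", ("sym", "a"), ("sym", "b"))), ("sym", "a")),
       ("sym", "b"))
nfa = glushkov(ast)
print(matches(nfa, "abab"), matches(nfa, "aba"))  # True False
```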
- Sketching Transformed Matrices with Applications to Natural Language Processing [76.6222695417524]
We propose a space-efficient sketching algorithm for computing the product of a given small matrix with the transformed matrix.
We show that our approach obtains small error and is efficient in both space and time.
arXiv Detail & Related papers (2020-02-23T03:07:31Z)
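The paper's specific algorithm is not reproduced here; the snippet below only demonstrates the standard sketching identity such results refine: for a random sign matrix S with s columns, E[S S^T] = s I, so (A S)(S^T f(B)) / s is an unbiased estimate of A f(B) that uses an s-dimensional sketch instead of the full inner dimension n. The entrywise transform f and all sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, d, s = 5, 1000, 800, 200   # sketch size s << inner dimension n

A = rng.standard_normal((m, n))  # the given small matrix
B = rng.standard_normal((n, d))
f = np.tanh                      # an entrywise transform, for illustration

# Random sign sketch: E[S @ S.T] = s * I, so the estimate is unbiased.
S = rng.choice([-1.0, 1.0], size=(n, s))
approx = (A @ S) @ (S.T @ f(B)) / s
exact = A @ f(B)
err = np.linalg.norm(approx - exact) / np.linalg.norm(exact)
print(f"relative error: {err:.3f}")  # decreases like 1/sqrt(s)
```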