Named Tensor Notation
- URL: http://arxiv.org/abs/2102.13196v1
- Date: Thu, 25 Feb 2021 22:21:30 GMT
- Title: Named Tensor Notation
- Authors: David Chiang, Alexander M. Rush, Boaz Barak
- Abstract summary: We propose a notation for tensors with named axes.
It relieves the author, reader, and future implementers from the burden of keeping track of the order of axes.
It also makes it easy to extend operations on low-order tensors to higher order ones.
- Score: 117.30373263410507
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a notation for tensors with named axes, which relieves the author,
reader, and future implementers from the burden of keeping track of the order
of axes and the purpose of each. It also makes it easy to extend operations on
low-order tensors to higher order ones (e.g., to extend an operation on images
to minibatches of images, or extend the attention mechanism to multiple
attention heads). After a brief overview of our notation, we illustrate it
through several examples from modern machine learning, from building blocks
like attention and convolution to full models like Transformers and LeNet.
Finally, we give formal definitions and describe some extensions. Our proposals
build on ideas from many previous papers and software libraries. We hope that
this document will encourage more authors to use named tensors, resulting in
clearer papers and less bug-prone implementations.
The source code for this document can be found at
https://github.com/namedtensor/notation/. We invite anyone to make comments on
this proposal by submitting issues or pull requests on this repository.
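As a rough illustration of the idea (not the paper's own implementation; the NamedTensor wrapper and ndot helper below are hypothetical names), here is a minimal Python/NumPy sketch of contracting two tensors along a named axis, so the result no longer depends on axis order:

```python
import numpy as np

class NamedTensor:
    """Minimal sketch: an ndarray plus a name for each axis."""
    def __init__(self, data, names):
        data = np.asarray(data)
        assert data.ndim == len(names), "one name per axis"
        self.data, self.names = data, tuple(names)

def ndot(a, b, axis):
    """Contract a and b along the axis with the given name,
    wherever that axis happens to sit in either tensor."""
    i, j = a.names.index(axis), b.names.index(axis)
    out = np.tensordot(a.data, b.data, axes=([i], [j]))
    names = (tuple(n for n in a.names if n != axis)
             + tuple(n for n in b.names if n != axis))
    return NamedTensor(out, names)

# Attention-style example: contract queries and keys over the feature axis "d".
# A fuller implementation would also broadcast over axes that share a name
# (e.g. a "heads" or "batch" axis, as the abstract describes); this sketch
# only handles the single contracted axis.
Q = NamedTensor(np.random.randn(5, 64), names=("seq_q", "d"))
K = NamedTensor(np.random.randn(7, 64), names=("seq_k", "d"))
scores = ndot(Q, K, axis="d")
print(scores.names, scores.data.shape)   # ('seq_q', 'seq_k') (5, 7)
```

The point of the sketch is that operations address axes by name rather than by position, so reordering or adding axes does not change the meaning of the calling code.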
Related papers
- A Sparse Tensor Generator with Efficient Feature Extraction [1.3124513975412255]
A major obstacle for research in sparse tensor operations is the lack of a broad-scale sparse tensor dataset.
We have developed a smart sparse tensor generator that mimics the substantial features of real sparse tensors.
The effectiveness of our generator is validated through the quality of features and the performance of decomposition.
arXiv Detail & Related papers (2024-05-08T10:28:20Z)
- Parallel Decoding via Hidden Transfer for Lossless Large Language Model Acceleration [54.897493351694195]
We propose a novel parallel decoding approach, namely hidden transfer, which decodes multiple successive tokens simultaneously in a single forward pass.
In terms of acceleration metrics, we outperform all the single-model acceleration techniques, including Medusa and Self-Speculative decoding.
arXiv Detail & Related papers (2024-04-18T09:17:06Z)
- Dictionary Learning Improves Patch-Free Circuit Discovery in Mechanistic Interpretability: A Case Study on Othello-GPT [59.245414547751636]
We propose a circuit discovery framework alternative to activation patching.
Our framework suffers less from out-of-distribution issues and proves to be more efficient in terms of complexity.
We dig into a small transformer trained on a synthetic task named Othello and find a number of human-understandable fine-grained circuits inside it.
arXiv Detail & Related papers (2024-02-19T15:04:53Z)
- An introduction to graphical tensor notation for mechanistic interpretability [0.0]
It's often easy to get confused about which operations are happening between tensors.
The first half of this document introduces the notation and applies it to some decompositions.
The second half applies it to some existing foundational approaches for mechanistically understanding language models.
arXiv Detail & Related papers (2024-02-02T02:56:01Z)
- What Are You Token About? Dense Retrieval as Distributions Over the Vocabulary [68.77983831618685]
We propose to interpret the vector representations produced by dual encoders by projecting them into the model's vocabulary space.
We show that the resulting projections contain rich semantic information, and draw a connection between them and sparse retrieval.
arXiv Detail & Related papers (2022-12-20T16:03:25Z)
- Longtonotes: OntoNotes with Longer Coreference Chains [111.73115731999793]
We build a corpus of coreference-annotated documents of significantly longer length than what is currently available.
The resulting corpus, which we call LongtoNotes, contains documents in multiple genres of the English language with varying lengths.
We evaluate state-of-the-art neural coreference systems on this new corpus.
arXiv Detail & Related papers (2022-10-07T15:58:41Z)
- Stack operation of tensor networks [10.86105335102537]
We propose a mathematically rigorous definition for the tensor network stack approach.
We illustrate the main ideas using matrix product state based machine learning as an example.
arXiv Detail & Related papers (2022-03-28T12:45:13Z)
- DMRjulia: Tensor recipes for entanglement renormalization computations [0.0]
Detailed notes on the functions in the DMRjulia library are provided here.
This document presently covers the implementation of the functions in the tensor network library for dense tensors.
arXiv Detail & Related papers (2021-11-29T13:41:59Z)
- Cherry-Picking Gradients: Learning Low-Rank Embeddings of Visual Data via Differentiable Cross-Approximation [53.95297550117153]
We propose an end-to-end trainable framework that processes large-scale visual data tensors by looking at only a fraction of their entries.
The proposed approach is particularly useful for large-scale multidimensional grid data, and for tasks that require context over a large receptive field.
arXiv Detail & Related papers (2021-05-29T08:39:57Z)
- Entanglement and Tensor Networks for Supervised Image Classification [0.0]
We revisit the use of tensor networks for supervised image classification using the MNIST data set of handwritten digits.
We propose a plausible candidate state $|\Sigma_\ell\rangle$ and investigate its entanglement properties.
We conclude that $|\Sigma_\ell\rangle$ is so robustly entangled that it cannot be approximated by the tensor network used in that work.
arXiv Detail & Related papers (2020-07-12T20:09:26Z)