An Algorithm for Routing Vectors in Sequences
- URL: http://arxiv.org/abs/2211.11754v2
- Date: Wed, 23 Nov 2022 14:16:59 GMT
- Title: An Algorithm for Routing Vectors in Sequences
- Authors: Franz A. Heinsen
- Abstract summary: We propose a routing algorithm that takes a sequence of vectors and computes a new sequence with specified length and vector size.
Each output vector maximizes "bang per bit," the difference between a net benefit to use and net cost to ignore data, by better predicting the input vectors.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a routing algorithm that takes a sequence of vectors and computes
a new sequence with specified length and vector size. Each output vector
maximizes "bang per bit," the difference between a net benefit to use and net
cost to ignore data, by better predicting the input vectors. We describe output
vectors as geometric objects, as latent variables that assign credit, as query
states in a model of associative memory, and as agents in a model of a Society
of Mind. We implement the algorithm with optimizations that reduce parameter
count, computation, and memory use by orders of magnitude, enabling us to route
sequences of greater length than previously possible. We evaluate our
implementation on natural language and visual classification tasks, obtaining
competitive or state-of-the-art accuracy and end-to-end credit assignments that
are interpretable.
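The abstract describes the algorithm only at a high level. The sketch below is a hypothetical illustration of the interface it implies, mapping n input vectors to a specified number of output vectors of a specified size via iterative credit assignment. All names, shapes, and update rules here are assumptions, not the paper's implementation.

```python
import numpy as np

def route_vectors(x, n_out, d_out, n_iters=3, seed=0):
    """Hypothetical sketch of sequence-to-sequence routing: maps n_inp input
    vectors of size d_inp to n_out output vectors of size d_out. Illustrates
    the idea that each output competes for credit by predicting the inputs;
    this is NOT the paper's algorithm, only its interface."""
    rng = np.random.default_rng(seed)
    n_inp, d_inp = x.shape
    # Random per-output prediction maps (learned parameters in a real model).
    W = rng.standard_normal((n_out, d_inp, d_out)) / np.sqrt(d_inp)
    votes = np.einsum('id,odh->ioh', x, W)         # [n_inp, n_out, d_out]
    credit = np.full((n_inp, n_out), 1.0 / n_out)  # uniform initial assignment
    for _ in range(n_iters):
        # Each output vector is a credit-weighted mean of the votes for it.
        y = np.einsum('io,ioh->oh', credit, votes) / credit.sum(0, keepdims=True).T
        # Re-assign credit: inputs favor outputs that predict them well.
        agreement = np.einsum('ioh,oh->io', votes, y)
        credit = np.exp(agreement - agreement.max(1, keepdims=True))
        credit /= credit.sum(1, keepdims=True)
    return y

y = route_vectors(np.random.randn(100, 64), n_out=10, d_out=32)
print(y.shape)  # (10, 32)
```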
Related papers
- Approximating Metric Magnitude of Point Sets [4.522729058300309]
Metric magnitude is a measure of the "size" of point clouds with many desirable geometric properties.
It has been adapted to various mathematical contexts and recent work suggests that it can enhance machine learning and optimization algorithms.
In this paper, we study the magnitude problem and show efficient ways of approximating it. We show that it can be cast as a convex optimization problem, but not as a submodular one.
The paper describes two new algorithms: an iterative approximation algorithm that converges quickly and is accurate, and a subset-selection method that makes the computation even faster.
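For reference, the quantity being approximated has a simple exact form: with similarity matrix Z[i, j] = exp(-d(x_i, x_j)), the magnitude is 1^T Z^{-1} 1. A minimal sketch of this expensive O(n^3) baseline, assuming Euclidean distance (the paper's algorithms approximate it more cheaply):

```python
import numpy as np
from scipy.spatial.distance import cdist

def magnitude(points):
    """Exact metric magnitude of a finite point cloud: the sum of entries
    of the inverse similarity matrix Z, computed here via a dense solve.
    This is the costly baseline that approximation algorithms avoid."""
    Z = np.exp(-cdist(points, points))             # Z[i, j] = exp(-d(x_i, x_j))
    w = np.linalg.solve(Z, np.ones(len(points)))   # weighting vector
    return w.sum()

print(magnitude(np.random.rand(200, 3)))
```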
arXiv Detail & Related papers (2024-09-06T17:15:28Z)
- Knowledge Composition using Task Vectors with Learned Anisotropic Scaling [51.4661186662329]
We introduce aTLAS, an algorithm that linearly combines parameter blocks with different learned coefficients, resulting in anisotropic scaling at the task vector level.
We show that such linear combinations explicitly exploit the low intrinsic dimensionality of pre-trained models, with the learned coefficients being the only trainable parameters.
We demonstrate the effectiveness of our method in task arithmetic, few-shot recognition and test-time adaptation, with supervised or unsupervised objectives.
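A hedged sketch of what anisotropic scaling at the task-vector level could look like: one learned coefficient per parameter block per task vector, rather than a single scalar per task vector. The names and data layout below are illustrative assumptions, not the aTLAS code.

```python
import torch

def compose_task_vectors(pretrained, task_vectors, coeffs):
    """Illustrative composition: pretrained and each task vector are dicts
    mapping parameter-block names to tensors; coeffs[t][name] is a learned
    scalar per task t and block name. In training, these coefficients would
    be the only learnable parameters. Not the paper's implementation."""
    composed = {name: p.clone() for name, p in pretrained.items()}
    for t, tau in enumerate(task_vectors):        # tau = finetuned - pretrained
        for name, block in tau.items():
            composed[name] = composed[name] + coeffs[t][name] * block
    return composed

# Toy usage with two parameter blocks and two task vectors.
pre = {"w1": torch.zeros(4), "w2": torch.zeros(3)}
taus = [{"w1": torch.ones(4), "w2": torch.ones(3)},
        {"w1": -torch.ones(4), "w2": torch.ones(3)}]
coeffs = [{"w1": 0.5, "w2": 0.1}, {"w1": 0.2, "w2": 0.9}]
print(compose_task_vectors(pre, taus, coeffs))
```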
arXiv Detail & Related papers (2024-07-03T07:54:08Z)
- Self-Attention Based Semantic Decomposition in Vector Symbolic Architectures [6.473177443214531]
We introduce a new variant of the resonator network based on self-attention update rules for the iterative search problem.
Our algorithm provides larger associative-memory capacity, enabling applications in tasks such as perception-based pattern recognition, scene decomposition, and object reasoning.
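A hedged sketch of a resonator-style decoder with a softmax (attention-like) recombination step in place of the classical sign-only update. It factorizes a bound bipolar hypervector into one codeword per codebook; the exact update rule in the paper may differ.

```python
import numpy as np

def attention_resonator(s, codebooks, n_iters=50, temp=1.0):
    """Factorize s = x1 * x2 * ... (elementwise product of bipolar codewords,
    one per codebook) by iteratively unbinding the other factor estimates and
    re-estimating each factor as a softmax-weighted mixture of its codebook.
    Illustrative only; not the paper's exact algorithm."""
    d = s.shape[0]
    # Initialize each factor estimate to the superposition of its codebook.
    est = [np.sign(C.sum(0) + 1e-9) for C in codebooks]
    for _ in range(n_iters):
        for f, C in enumerate(codebooks):
            others = np.prod([e for g, e in enumerate(est) if g != f], axis=0)
            target = s * others                     # unbind other factors
            scores = temp * (C @ target) / np.sqrt(d)
            w = np.exp(scores - scores.max())       # attention over codewords
            w /= w.sum()
            est[f] = np.sign(w @ C + 1e-9)          # weighted recombination
    return [int(np.argmax(C @ e)) for C, e in zip(codebooks, est)]

rng = np.random.default_rng(0)
books = [np.sign(rng.standard_normal((8, 1024))) for _ in range(3)]
s = books[0][2] * books[1][5] * books[2][1]
print(attention_resonator(s, books), "expected [2, 5, 1]")
```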
arXiv Detail & Related papers (2024-03-20T00:37:19Z)
- An Efficient Algorithm for Clustered Multi-Task Compressive Sensing [60.70532293880842]
Clustered multi-task compressive sensing is a hierarchical model that solves multiple compressive sensing tasks.
The existing inference algorithm for this model is computationally expensive and does not scale well in high dimensions.
We propose a new algorithm that substantially accelerates model inference by avoiding the explicit computation of the large covariance matrices the model requires.
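The abstract does not spell out the mechanism, but the generic matrix-free pattern is well known: replace explicit covariance inversion with linear solves driven only by matrix-vector products. A sketch of that standard trick for a Gaussian posterior mean, as an assumption-laden illustration rather than the paper's exact algorithm:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def posterior_mean_matrix_free(A, y, alpha, noise_var=1.0):
    """Solve (A^T A / s2 + diag(alpha)) mu = A^T y / s2 with conjugate
    gradients, never forming the d x d covariance matrix explicitly.
    A standard matrix-free pattern, shown for illustration."""
    n, d = A.shape
    def matvec(v):
        return A.T @ (A @ v) / noise_var + alpha * v
    op = LinearOperator((d, d), matvec=matvec)
    mu, info = cg(op, A.T @ y / noise_var)
    assert info == 0, "CG did not converge"
    return mu

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 200))
w_true = rng.standard_normal(200) * (rng.random(200) < 0.1)  # sparse signal
y = A @ w_true
print(posterior_mean_matrix_free(A, y, alpha=np.ones(200)).shape)  # (200,)
```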
arXiv Detail & Related papers (2023-09-30T15:57:14Z)
- VAPI: Vectorization of Algorithm for Performance Improvement [5.835939416417458]
Vectorization is a technique for converting an algorithm that operates on a single value at a time into one that operates on a collection of values at a time, so that it executes faster.
The vectorization technique replaces multiple loop iterations with a single operation, which improves the algorithm's speed.
The objective of this study is to apply the vectorization technique to one of the metaheuristic algorithms and compare the results of the vectorized algorithm with those of its non-vectorized counterpart.
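As an illustration of the speedup at stake, here is a minimal comparison of a scalar loop against its vectorized equivalent on the Rastrigin function, a common metaheuristic benchmark. The function choice and population setup are our assumptions, not taken from the study.

```python
import numpy as np
import timeit

def rastrigin_loop(X):
    """Scalar version: evaluates the Rastrigin function one element at a time."""
    out = np.empty(len(X))
    for i, x in enumerate(X):
        s = 0.0
        for xj in x:
            s += xj * xj - 10.0 * np.cos(2.0 * np.pi * xj) + 10.0
        out[i] = s
    return out

def rastrigin_vec(X):
    """Vectorized version: one NumPy expression over the whole population."""
    return np.sum(X * X - 10.0 * np.cos(2.0 * np.pi * X) + 10.0, axis=1)

X = np.random.uniform(-5.12, 5.12, size=(1000, 30))  # population of candidates
assert np.allclose(rastrigin_loop(X), rastrigin_vec(X))
print("loop:", timeit.timeit(lambda: rastrigin_loop(X), number=10))
print("vec: ", timeit.timeit(lambda: rastrigin_vec(X), number=10))
```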
arXiv Detail & Related papers (2023-07-19T11:23:03Z)
- Discovering Non-monotonic Autoregressive Orderings with Variational Inference [67.27561153666211]
We develop an unsupervised parallelizable learner that discovers high-quality generation orders purely from training data.
We implement the encoder as a Transformer with non-causal attention that outputs permutations in one forward pass.
Empirical results in language modeling tasks demonstrate that our method is context-aware and discovers orderings that are competitive with or even better than fixed orders.
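One standard building block for outputting (soft) permutations in a single forward pass is Sinkhorn normalization, which turns an arbitrary score matrix into a doubly-stochastic matrix. The sketch below shows that generic trick; it is not claimed to match the paper's exact parameterization.

```python
import torch

def sinkhorn(log_alpha, n_iters=20):
    """Differentiable relaxation of a permutation: alternately normalize
    rows and columns in log space until the matrix is (approximately)
    doubly stochastic. A standard component of models that must output
    orderings in one forward pass."""
    for _ in range(n_iters):
        log_alpha = log_alpha - log_alpha.logsumexp(dim=-1, keepdim=True)  # rows
        log_alpha = log_alpha - log_alpha.logsumexp(dim=-2, keepdim=True)  # cols
    return log_alpha.exp()

scores = torch.randn(5, 5)       # e.g., position scores from an encoder
P = sinkhorn(scores)
print(P.sum(0), P.sum(1))        # both approximately all-ones
print(torch.argmax(P, dim=-1))   # decode a discrete ordering
```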
arXiv Detail & Related papers (2021-10-27T16:08:09Z)
- Unfolding Projection-free SDP Relaxation of Binary Graph Classifier via GDPA Linearization [59.87663954467815]
Algorithm unfolding creates an interpretable and parsimonious neural network architecture by implementing each iteration of a model-based algorithm as a neural layer.
In this paper, leveraging a recent linear algebraic theorem called Gershgorin disc perfect alignment (GDPA), we unroll a projection-free algorithm for the semi-definite programming relaxation (SDR) of a binary graph classifier.
Experimental results show that our unrolled network outperformed pure model-based graph classifiers and achieved performance comparable to pure data-driven networks while using far fewer parameters.
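A generic illustration of algorithm unfolding, using unrolled ISTA for sparse coding rather than the paper's GDPA-based SDP solver: each iteration of the model-based algorithm becomes one neural layer with learnable parameters.

```python
import torch
import torch.nn as nn

class UnrolledISTA(nn.Module):
    """Each of n_layers layers implements one ISTA iteration for sparse
    coding, with a learned analysis operator, update operator, and
    per-layer soft-threshold. Illustrative of unfolding in general,
    not of the paper's specific solver."""
    def __init__(self, d_in, d_code, n_layers=5):
        super().__init__()
        self.W = nn.Linear(d_in, d_code, bias=False)    # learned analysis op
        self.S = nn.Linear(d_code, d_code, bias=False)  # learned update op
        self.theta = nn.Parameter(torch.full((n_layers,), 0.1))
        self.n_layers = n_layers

    def forward(self, x):
        z = torch.zeros(x.shape[0], self.S.in_features, device=x.device)
        for t in range(self.n_layers):
            pre = self.W(x) + self.S(z)
            z = torch.sign(pre) * torch.relu(pre.abs() - self.theta[t])  # soft threshold
        return z

model = UnrolledISTA(d_in=64, d_code=128)
print(model(torch.randn(8, 64)).shape)  # torch.Size([8, 128])
```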
arXiv Detail & Related papers (2021-09-10T07:01:15Z)
- Generalized Learning Vector Quantization for Classification in Randomized Neural Networks and Hyperdimensional Computing [4.4886210896619945]
We propose a modified random vector functional link (RVFL) network that avoids computationally expensive matrix operations during training.
The proposed approach achieved state-of-the-art accuracy on a collection of datasets from the UCI Machine Learning Repository.
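A hedged sketch of the general idea: an RVFL-style network with fixed random hidden weights, where training replaces the usual least-squares solve (a costly matrix inversion) with per-class centroids in the hidden space, in the spirit of learning vector quantization. This illustrates the concept, not the paper's exact training rule.

```python
import numpy as np

def train_rvfl_centroids(X, y, n_hidden=512, n_classes=10, seed=0):
    """Fixed random projection + tanh hidden layer; training is a single
    pass of per-class sums and counts, with no matrix inversion anywhere."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1, 1, (X.shape[1], n_hidden))  # fixed random weights
    H = np.tanh(X @ W)                              # hidden representations
    centroids = np.zeros((n_classes, n_hidden))
    np.add.at(centroids, y, H)                      # sum rows per class
    counts = np.bincount(y, minlength=n_classes)[:, None]
    centroids /= np.maximum(counts, 1)
    return W, centroids

def predict(X, W, centroids):
    H = np.tanh(X @ W)
    return np.argmax(H @ centroids.T, axis=1)       # nearest centroid by similarity

X = np.random.randn(200, 30)
y = np.random.randint(0, 10, 200)
W, C = train_rvfl_centroids(X, y)
print((predict(X, W, C) == y).mean())
```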
arXiv Detail & Related papers (2021-06-17T21:17:17Z)
- Metalearning: Sparse Variable-Structure Automata [0.0]
We propose a metalearning approach that increases the number of basis vectors used in dynamic sparse coding on the fly.
An actor-critic algorithm is deployed to automatically choose an appropriate feature dimension for the required level of accuracy.
arXiv Detail & Related papers (2021-01-30T21:32:23Z)
- Anchor & Transform: Learning Sparse Embeddings for Large Vocabularies [60.285091454321055]
We design a simple and efficient embedding algorithm that learns a small set of anchor embeddings and a sparse transformation matrix.
On text classification, language modeling, and movie recommendation benchmarks, we show that ANT is particularly suitable for large vocabulary sizes.
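A sketch of the ANT idea as described: learn a small set of anchor embeddings A (k x d with k much smaller than the vocabulary) and a sparse |V| x k transformation T, so each word embedding is a sparse mixture of anchors, E = T A. Parameter names and the sparsity penalty below are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class AnchorTransformEmbedding(nn.Module):
    """Embedding table factored as (sparse transform) @ (anchor embeddings),
    replacing a full vocab_size x dim table with far fewer parameters."""
    def __init__(self, vocab_size, n_anchors, dim):
        super().__init__()
        self.anchors = nn.Parameter(torch.randn(n_anchors, dim) * 0.02)
        self.transform = nn.Parameter(torch.randn(vocab_size, n_anchors) * 0.01)

    def forward(self, token_ids):
        weights = torch.relu(self.transform[token_ids])  # nonnegative mixture weights
        return weights @ self.anchors

    def sparsity_penalty(self):
        # L1 penalty on nonnegative weights encourages a sparse transform.
        return torch.relu(self.transform).sum()

emb = AnchorTransformEmbedding(vocab_size=100_000, n_anchors=500, dim=300)
print(emb(torch.tensor([3, 42, 99_999])).shape)  # torch.Size([3, 300])
```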
arXiv Detail & Related papers (2020-03-18T13:07:51Z)
- Variance Reduction with Sparse Gradients [82.41780420431205]
Variance reduction methods such as SVRG and SpiderBoost use a mixture of large and small batch gradients.
We introduce a new sparsity operator: the random-top-k operator.
Our algorithm consistently outperforms SpiderBoost on various tasks including image classification, natural language processing, and sparse matrix factorization.
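A hedged sketch of a random-top-k style sparsifier: of the k coordinates retained, keep a fraction by largest magnitude and fill the rest with uniformly random other coordinates, zeroing everything else. The exact split and any rescaling in the paper may differ; this only shows the operator's general shape.

```python
import numpy as np

def random_top_k(g, k, top_frac=0.5, seed=None):
    """Keep k coordinates of g: the top int(k * top_frac) by magnitude,
    plus uniformly random others; zero the rest. Illustrative sketch."""
    rng = np.random.default_rng(seed)
    k_top = int(k * top_frac)
    order = np.argsort(np.abs(g))[::-1]          # indices, descending magnitude
    top_idx = order[:k_top]
    rand_idx = rng.choice(order[k_top:], size=k - k_top, replace=False)
    out = np.zeros_like(g)
    keep = np.concatenate([top_idx, rand_idx])
    out[keep] = g[keep]
    return out

g = np.random.randn(20)
print(np.count_nonzero(random_top_k(g, k=5, seed=0)))  # 5
```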
arXiv Detail & Related papers (2020-01-27T08:23:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.