Exploring Effects of Hyperdimensional Vectors for Tsetlin Machines
- URL: http://arxiv.org/abs/2406.02648v1
- Date: Tue, 4 Jun 2024 14:16:52 GMT
- Title: Exploring Effects of Hyperdimensional Vectors for Tsetlin Machines
- Authors: Vojtech Halenka, Ahmed K. Kadhim, Paul F. A. Clarke, Bimal Bhattarai, Rupsa Saha, Ole-Christoffer Granmo, Lei Jiao, Per-Arne Andersen
- Abstract summary: We propose a hypervector (HV) based method for expressing arbitrarily large sets of concepts associated with any input data.
Using a hyperdimensional space to build vectors drastically expands the capacity and flexibility of the TM.
We demonstrate how images, chemical compounds, and natural language text are encoded according to the proposed method, and how the resulting HV-powered TM can achieve significantly higher accuracy and faster learning.
- Score: 12.619567138333492
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tsetlin machines (TMs) have been successful in several application domains, operating with high efficiency on Boolean representations of the input data. However, Booleanizing complex data structures such as sequences, graphs, images, signal spectra, chemical compounds, and natural language is not trivial. In this paper, we propose a hypervector (HV) based method for expressing arbitrarily large sets of concepts associated with any input data. Using a hyperdimensional space to build vectors drastically expands the capacity and flexibility of the TM. We demonstrate how images, chemical compounds, and natural language text are encoded according to the proposed method, and how the resulting HV-powered TM can achieve significantly higher accuracy and faster learning on well-known benchmarks. Our results open up a new research direction for TMs, namely how to expand and exploit the benefits of operating in hyperspace, including new booleanization strategies, optimization of TM inference and learning, as well as new TM applications.
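The abstract's core idea (booleanizing arbitrary concept sets via hypervectors) follows standard hyperdimensional-computing practice: give each concept a fixed random binary HV and bundle the set into a single Boolean vector a TM can consume. Below is a minimal sketch of that general idea; the dimension, the majority-vote bundling rule, and the toy concept names are illustrative assumptions, not the paper's exact encoding scheme.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 10_000  # hyperdimensional width; large D keeps random HVs quasi-orthogonal

def concept_hv(table, concept):
    """Lazily assign each concept a fixed random binary hypervector."""
    if concept not in table:
        table[concept] = rng.integers(0, 2, size=D, dtype=np.uint8)
    return table[concept]

def encode(concepts, table):
    """Bundle a set of concepts into one Boolean HV via elementwise majority."""
    stack = np.stack([concept_hv(table, c) for c in concepts])
    return (2 * stack.sum(axis=0) >= len(concepts)).astype(np.uint8)

table = {}
x = encode({"pixel_3_on", "pixel_7_on", "edge_left"}, table)  # hypothetical concepts
print(x.shape)  # (10000,) -- a Boolean vector usable as TM input
```

Because random HVs in high dimensions are nearly orthogonal, the bundled vector remains measurably similar to each member concept, which is what lets a single fixed-width Boolean input express arbitrarily large concept sets.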
Related papers
- TranX-Adapter: Bridging Artifacts and Semantics within MLLMs for Robust AI-generated Image Detection [70.42796551833946]
Incorporating texture-level artifact features alongside semantic features into multimodal large language models (MLLMs) can enhance their AIGI detection capability.
We propose a lightweight fusion adapter, TranX-Adapter, which integrates a Task-aware Optimal-Transport Fusion.
Experiments on standard AIGI detection benchmarks across several advanced MLLMs show that our TranX-Adapter brings consistent and significant improvements.
arXiv Detail & Related papers (2026-02-25T09:22:46Z)
- Scaling Spatial Reasoning in MLLMs through Programmatic Data Synthesis [8.60591720958037]
Vision-Language Models (VLMs) are scalable but structurally rigid, while manual annotation is linguistically diverse but unscalable.
We introduce SP-RITE, a novel framework that overcomes this dilemma by leveraging simulators and large models.
We have curated a dataset encompassing 3 simulators, 11k+ scenes, and 300k+ image/video instruction-tuning pairs.
We demonstrate that a VLM trained on our data achieves significant performance gains on multiple spatial benchmarks.
arXiv Detail & Related papers (2025-12-18T06:30:08Z)
- TM-UNet: Token-Memory Enhanced Sequential Modeling for Efficient Medical Image Segmentation [8.178014138307288]
TM-UNet is a novel lightweight framework that integrates token-sequence modeling with an efficient memory mechanism for medical image segmentation.
Our MSTM block acts as a dynamic knowledge store that captures long-range dependencies with linear complexity.
Extensive experiments demonstrate that TM-UNet outperforms state-of-the-art methods across diverse medical segmentation tasks at substantially reduced computational cost.
arXiv Detail & Related papers (2025-11-15T15:49:30Z)
- HyperET: Efficient Training in Hyperbolic Space for Multi-modal Large Language Models [50.31704374968706]
Multi-modal large language models (MLLMs) have emerged as a transformative approach for aligning visual and textual understanding.
They typically require extremely high computational resources during training to achieve cross-modal alignment at multiple granularity levels.
We argue that a key source of this inefficiency lies in the vision encoders they are widely equipped with, e.g., CLIP and SAM, which lack multi-granularity alignment with language.
arXiv Detail & Related papers (2025-10-23T08:16:44Z)
- Soft Graph Transformer for MIMO Detection [23.616336786063552]
The Soft Graph Transformer (SGT) is a soft-input, soft-output neural architecture designed for Maximum Likelihood (ML) MIMO detection.
SGT combines self-attention, which encodes contextual dependencies within symbol and constraint subgraphs, with graph-aware cross-attention, which performs structured message passing across subgraphs.
Experiments demonstrate that SGT achieves near-ML performance and offers a flexible and interpretable framework for receiver systems that leverage soft priors.
arXiv Detail & Related papers (2025-09-16T05:42:45Z)
- Text to Band Gap: Pre-trained Language Models as Encoders for Semiconductor Band Gap Prediction [6.349503549199403]
In this study, we explore the use of a transformer-based language model as an encoder to predict the band gaps of semiconductor materials.
We generate material descriptions in two formats: formatted strings combining material features, and natural-language text generated using the ChatGPT API.
We demonstrate that the RoBERTa model, pre-trained on natural language processing tasks, performs effectively as an encoder for prediction tasks.
arXiv Detail & Related papers (2025-01-07T00:56:26Z)
- Text-Guided Multi-Property Molecular Optimization with a Diffusion Language Model [77.50732023411811]
We propose a text-guided multi-property molecular optimization method utilizing a transformer-based diffusion language model (TransDLM).
TransDLM leverages standardized chemical nomenclature as semantic representations of molecules and implicitly embeds property requirements into textual descriptions.
Our approach surpasses state-of-the-art methods in optimizing molecular structural similarity and enhancing chemical properties on the benchmark dataset.
arXiv Detail & Related papers (2024-10-17T14:30:27Z)
- Learning local equivariant representations for quantum operators [7.747597014044332]
We introduce a novel deep learning model, SLEM, for predicting multiple quantum operators.
SLEM achieves state-of-the-art accuracy while dramatically improving computational efficiency.
We demonstrate SLEM's capabilities across diverse 2D and 3D materials, achieving high accuracy even with limited training data.
arXiv Detail & Related papers (2024-07-08T15:55:12Z)
- LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit [55.73370804397226]
Quantization, a key compression technique, can effectively mitigate the resource demands of large language models by compressing and accelerating them.
We present LLMC, a plug-and-play compression toolkit, to fairly and systematically explore the impact of quantization.
Powered by this versatile toolkit, our benchmark covers three key aspects: calibration data, algorithms (three strategies), and data formats.
arXiv Detail & Related papers (2024-05-09T11:49:05Z)
- Sliceformer: Make Multi-head Attention as Simple as Sorting in Discriminative Tasks [32.33355192614434]
We propose an effective and efficient surrogate of the Transformer, called Sliceformer.
Our Sliceformer replaces the classic MHA mechanism with an extremely simple "slicing-sorting" operation.
Our Sliceformer achieves comparable or better performance with lower memory cost and faster speed than the Transformer and its variants.
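The summary names the slicing-sorting operation but gives no details. As a hedged illustration only, a drop-in mixer in that spirit might project the input once and then sort each channel ("slice") along the sequence axis in place of softmax attention; the single projection and the sort axis here are assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class SliceSort(nn.Module):
    """Hypothetical slicing-sorting mixer standing in for multi-head attention."""
    def __init__(self, d_model: int):
        super().__init__()
        self.proj = nn.Linear(d_model, d_model)  # one linear "slicing" projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        v = self.proj(x)
        # Sort every channel independently over the sequence dimension;
        # sorting only permutes values, so no attention weights are stored.
        return torch.sort(v, dim=1).values

mixer = SliceSort(d_model=64)
out = mixer(torch.randn(2, 16, 64))  # -> (2, 16, 64)
```

Sorting costs O(L log L) per channel and keeps no L×L attention map, which is consistent with the memory and speed claims in the summary.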
arXiv Detail & Related papers (2023-10-26T14:43:07Z)
- Learning a Fourier Transform for Linear Relative Positional Encodings in Transformers [71.32827362323205]
We propose a new class of linear Transformers called Learner-Transformers (Learners).
They incorporate a wide range of relative positional encoding mechanisms (RPEs).
These include regular RPE techniques applied to sequential data, as well as novel RPEs operating on geometric data embedded in higher-dimensional Euclidean spaces.
arXiv Detail & Related papers (2023-02-03T18:57:17Z)
- Composable Text Controls in Latent Space with ODEs [97.12426987887021]
This paper proposes a new efficient approach for composable text operations in the compact latent space of text.
By connecting pretrained LMs to the latent space through efficient adaptation, we then decode the sampled vectors into desired text sequences.
Experiments show that composing these operators within our approach generates or edits high-quality text.
arXiv Detail & Related papers (2022-08-01T06:51:45Z)
- Tevatron: An Efficient and Flexible Toolkit for Dense Retrieval [60.457378374671656]
Tevatron is a dense retrieval toolkit optimized for efficiency, flexibility, and code simplicity.
We show how Tevatron's flexible design enables easy generalization across datasets, model architectures, and accelerator platforms.
arXiv Detail & Related papers (2022-03-11T05:47:45Z)
- HyperNP: Interactive Visual Exploration of Multidimensional Projection Hyperparameters [61.354362652006834]
HyperNP is a scalable method that allows for real-time interactive exploration of projection methods by training neural network approximations.
We evaluate HyperNP across three datasets in terms of performance and speed.
arXiv Detail & Related papers (2021-06-25T17:28:14Z)
- Demystifying BERT: Implications for Accelerator Design [4.80595971865854]
We focus on BERT, one of the most popular NLP transfer learning algorithms, to identify how its algorithmic behavior can guide future accelerator design.
We characterize BERT's compute-intensive operations and discuss software and possible hardware mechanisms to further optimize them.
Overall, our analysis identifies holistic solutions to optimize systems for BERT-like models.
arXiv Detail & Related papers (2021-04-14T01:06:49Z)
- Multilinear Compressive Learning with Prior Knowledge [106.12874293597754]
The Multilinear Compressive Learning (MCL) framework combines Multilinear Compressive Sensing and Machine Learning into an end-to-end system.
The key idea behind MCL is the assumption that there exists a tensor subspace which can capture the essential features of the signal for the downstream learning task.
In this paper, we propose a novel solution addressing the aforementioned requirements, i.e., how to find tensor subspaces in which the signals of interest are highly separable.
arXiv Detail & Related papers (2020-02-17T19:06:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.