MicroHD: An Accuracy-Driven Optimization of Hyperdimensional Computing Algorithms for TinyML systems
- URL: http://arxiv.org/abs/2404.00039v1
- Date: Sun, 24 Mar 2024 02:45:34 GMT
- Title: MicroHD: An Accuracy-Driven Optimization of Hyperdimensional Computing Algorithms for TinyML systems
- Authors: Flavio Ponzina, Tajana Rosing
- Abstract summary: Hyperdimensional computing (HDC) is emerging as a promising AI approach that can effectively target TinyML applications.
Previous works on HDC showed that limiting the standard 10k dimensions of the hyperdimensional space to much lower values is possible.
- Score: 8.54897708375791
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hyperdimensional computing (HDC) is emerging as a promising AI approach that can effectively target TinyML applications thanks to its lightweight computing and memory requirements. Previous works on HDC showed that the standard 10k dimensions of the hyperdimensional space can be limited to much lower values, further reducing HDC resource requirements. Similarly, other studies demonstrated that binary values can be used as elements of the generated hypervectors, leading to significant efficiency gains at the cost of some degree of accuracy degradation. Nevertheless, current optimization attempts do not concurrently co-optimize HDC hyper-parameters, and accuracy degradation is not directly controlled, resulting in sub-optimal HDC models that deliver unacceptable output quality for several applications. In this work, we propose MicroHD, a novel accuracy-driven HDC optimization approach that iteratively tunes HDC hyper-parameters, reducing memory and computing requirements while ensuring user-defined accuracy levels. The proposed method can be applied to HDC implementations using different encoding functions, demonstrates good scalability for larger HDC workloads, and achieves compression and efficiency gains of up to 200x over baseline implementations for accuracy degradations lower than 1%.
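The accuracy-constrained, iterative tuning described in the abstract can be illustrated with a short sketch. The snippet below is a minimal, hypothetical illustration rather than the authors' implementation: it assumes generic train_hdc and evaluate callables and only shrinks the hypervector dimension while the accuracy drop relative to the full-dimensional baseline stays within a user-defined budget; MicroHD additionally co-optimizes other hyper-parameters and encoding choices not modeled here.

```python
# Minimal sketch of an accuracy-driven HDC hyper-parameter reduction loop,
# in the spirit of MicroHD as described in the abstract. `train_hdc` and
# `evaluate` are hypothetical placeholders for a concrete HDC pipeline.

def accuracy_driven_tuning(train_hdc, evaluate,
                           baseline_dim=10_000,
                           max_accuracy_drop=0.01,
                           shrink_factor=0.5):
    """Shrink the hyperdimensional space while the accuracy loss
    w.r.t. the full-dimensional baseline stays within the budget."""
    baseline_model = train_hdc(dim=baseline_dim)
    baseline_acc = evaluate(baseline_model)

    dim, model = baseline_dim, baseline_model
    while dim > 1:
        candidate_dim = max(1, int(dim * shrink_factor))
        candidate = train_hdc(dim=candidate_dim)
        if baseline_acc - evaluate(candidate) <= max_accuracy_drop:
            # Accuracy budget respected: accept the smaller configuration.
            dim, model = candidate_dim, candidate
        else:
            # Budget violated: stop (a finer search could bisect between
            # candidate_dim and dim instead of stopping here).
            break
    return dim, model
```

As a usage note, passing callables that train and score an HDC classifier on a held-out set would return the smallest dimension found that keeps the accuracy drop below, e.g., 1%.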
Related papers
- Progressive Mixed-Precision Decoding for Efficient LLM Inference [49.05448842542558]
We introduce Progressive Mixed-Precision Decoding (PMPD) to address the memory-boundedness of decoding.
PMPD achieves a 1.4x-12.2x speedup in matrix-vector multiplications over fp16 models.
Our approach delivers a throughput gain of 3.8x-8.0x over fp16 models and up to 1.54x over uniform quantization approaches.
arXiv Detail & Related papers (2024-10-17T11:46:33Z) - Accelerating Error Correction Code Transformers [56.75773430667148]
We introduce a novel acceleration method for transformer-based decoders.
We achieve a 90% compression ratio and reduce arithmetic operation energy consumption by at least 224 times on modern hardware.
arXiv Detail & Related papers (2024-10-08T11:07:55Z) - ThinK: Thinner Key Cache by Query-Driven Pruning [63.13363917871414]
Large Language Models (LLMs) have revolutionized the field of natural language processing, achieving unprecedented performance across a variety of applications.
This paper focuses on the long-context scenario, addressing the inefficiencies in KV cache memory consumption during inference.
We propose ThinK, a novel query-dependent KV cache pruning method designed to minimize attention weight loss while selectively pruning the least significant channels.
arXiv Detail & Related papers (2024-07-30T17:59:08Z) - An Encoding Framework for Binarized Images using HyperDimensional Computing [0.0]
This article proposes a novel light-weight approach to encode binarized images that preserves similarity of patterns at nearby locations.
The method reaches an accuracy of 97.35% on the test set for the MNIST data set and 84.12% for the Fashion-MNIST data set.
arXiv Detail & Related papers (2023-12-01T09:34:28Z) - Practical Conformer: Optimizing size, speed and flops of Conformer for on-Device and cloud ASR [67.63332492134332]
We design an optimized conformer that is small enough to meet on-device restrictions and has fast inference on TPUs.
Our proposed encoder can double as a strong standalone encoder for on-device use and as the first part of a high-performance ASR pipeline.
arXiv Detail & Related papers (2023-03-31T23:30:48Z) - Efficient Hyperdimensional Computing [4.8915861089531205]
We develop HDC models that use binary hypervectors with dimensions orders of magnitude lower than those of state-of-the-art HDC models.
For instance, on the MNIST dataset, we achieve 91.12% HDC accuracy in image classification with a dimension of only 64 (a generic binary-hypervector sketch follows the related-papers list below).
arXiv Detail & Related papers (2023-01-26T02:22:46Z) - An Accelerated Doubly Stochastic Gradient Method with Faster Explicit Model Identification [97.28167655721766]
We propose a novel doubly accelerated gradient descent (ADSGD) method for sparsity regularized loss minimization problems.
We first prove that ADSGD can achieve a linear convergence rate and lower overall computational complexity.
arXiv Detail & Related papers (2022-08-11T22:27:22Z) - LeHDC: Learning-Based Hyperdimensional Computing Classifier [14.641707790969914]
We propose a new HDC framework, called LeHDC, which leverages a principled learning approach to improve the model accuracy.
Experimental validation shows that LeHDC outperforms previous HDC training strategies and improves inference accuracy by over 15% on average.
arXiv Detail & Related papers (2022-03-18T01:13:58Z) - Adaptive pruning-based optimization of parameterized quantum circuits [62.997667081978825]
Variational hybrid quantum-classical algorithms are powerful tools to maximize the use of Noisy Intermediate-Scale Quantum devices.
We propose a strategy for optimizing such ansätze in variational quantum algorithms, which we call "Parameter-Efficient Circuit Training" (PECT).
Instead of optimizing all of the ansatz parameters at once, PECT launches a sequence of variational algorithms.
arXiv Detail & Related papers (2020-10-01T18:14:11Z) - Efficient hyperparameter optimization by way of PAC-Bayes bound minimization [4.191847852775072]
We present an alternative objective that is equivalent to a Probably Approximately Correct-Bayes (PAC-Bayes) bound on the expected out-of-sample error.
We then devise an efficient gradient-based algorithm to minimize this objective.
arXiv Detail & Related papers (2020-08-14T15:54:51Z) - SHEARer: Highly-Efficient Hyperdimensional Computing by Software-Hardware Enabled Multifold Approximation [7.528764144503429]
We propose SHEARer, an algorithm-hardware co-optimization to improve the performance and energy consumption of HD computing.
SHEARer achieves an average throughput boost of 104,904x (15.7x) and energy savings of up to 56,044x (301x) compared to state-of-the-art encoding methods.
We also develop a software framework that enables training HD models by emulating the proposed approximate encodings.
arXiv Detail & Related papers (2020-07-20T07:58:44Z)
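As referenced in the Efficient Hyperdimensional Computing entry above, binary hypervectors with strongly reduced dimensions are a recurring theme in this list. The following sketch is a generic, hypothetical binary-HDC classifier (random item memory, majority-vote bundling, Hamming-distance inference); the dimension, seed, and feature-to-hypervector mapping are illustrative assumptions and do not reproduce any specific paper's encoding.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 64  # far below the conventional 10k dimensions

def random_hv():
    # Random binary hypervector serving as item memory for one feature.
    return rng.integers(0, 2, size=DIM, dtype=np.uint8)

def encode(sample, item_memory):
    """Bundle the hypervectors of the active (non-zero) features by majority vote."""
    active = [item_memory[i] for i, v in enumerate(sample) if v]
    if not active:
        return np.zeros(DIM, dtype=np.uint8)
    return (np.sum(active, axis=0) * 2 > len(active)).astype(np.uint8)

def train(samples, labels, num_features):
    # One random hypervector per input feature, then one bundled prototype per class.
    item_memory = [random_hv() for _ in range(num_features)]
    per_class = {}
    for x, y in zip(samples, labels):
        per_class.setdefault(y, []).append(encode(x, item_memory))
    class_hvs = {y: (np.sum(hvs, axis=0) * 2 > len(hvs)).astype(np.uint8)
                 for y, hvs in per_class.items()}
    return item_memory, class_hvs

def predict(sample, item_memory, class_hvs):
    q = encode(sample, item_memory)
    # Nearest class prototype under Hamming distance.
    return min(class_hvs, key=lambda y: int(np.count_nonzero(q ^ class_hvs[y])))
```

For example, training on binarized MNIST images flattened to 784 features would yield ten 64-bit class prototypes, illustrating how aggressively the hyperdimensional space can be reduced relative to the standard 10k dimensions.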