ReducedLUT: Table Decomposition with "Don't Care" Conditions
- URL: http://arxiv.org/abs/2412.18579v2
- Date: Tue, 31 Dec 2024 14:01:54 GMT
- Title: ReducedLUT: Table Decomposition with "Don't Care" Conditions
- Authors: Oliver Cassidy, Marta Andronic, Samuel Coward, George A. Constantinides
- Abstract summary: This paper introduces ReducedLUT, a novel method to reduce the footprint of the LUTs by injecting don't cares into the compression process. We demonstrate a particular application to machine learning; by replacing unobserved patterns within the training data of neural network models with don't cares, we enable greater compression with minimal model accuracy degradation.
- Score: 2.2061683015812026
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Lookup tables (LUTs) are frequently used to efficiently store arrays of precomputed values for complex mathematical computations. When used in the context of neural networks, these functions exhibit a lack of recognizable patterns which presents an unusual challenge for conventional logic synthesis techniques. Several approaches are known to break down a single large lookup table into multiple smaller ones that can be recombined. Traditional methods, such as plain tabulation, piecewise linear approximation, and multipartite table methods, often yield inefficient hardware solutions when applied to LUT-based NNs. This paper introduces ReducedLUT, a novel method to reduce the footprint of the LUTs by injecting don't cares into the compression process. This additional freedom introduces more self-similarities which can be exploited using known decomposition techniques. We then demonstrate a particular application to machine learning; by replacing unobserved patterns within the training data of neural network models with don't cares, we enable greater compression with minimal model accuracy degradation. In practice, we achieve up to $1.63\times$ reduction in Physical LUT utilization, with a test accuracy drop of no more than $0.01$ accuracy points.
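To make the abstract's central idea concrete, the sketch below shows one way don't cares could interact with table decomposition: a large LUT is split into equal-sized sub-tables, entries whose input pattern never occurs in the training data are marked as don't cares, and sub-tables that agree on all specified entries are merged. The decomposition scheme, the greedy matching, and every name in the code are illustrative assumptions, not the paper's actual algorithm.

```python
# Illustrative sketch only: greedy, don't-care-aware sharing of sub-tables.
# "compress_lut", the equal-width split, and the merging rule are assumptions
# made for this example, not the ReducedLUT algorithm itself.

def compress_lut(table, observed, sub_size):
    """table:    list of precomputed outputs (length a multiple of sub_size)
    observed: set of input addresses that actually occur in the training data
    Returns (bank, index): unique sub-tables and, for each slot, which one to use."""
    # Entries whose address was never observed become don't cares (None).
    relaxed = [v if addr in observed else None for addr, v in enumerate(table)]

    bank, index = [], []
    for start in range(0, len(relaxed), sub_size):
        sub = relaxed[start:start + sub_size]
        for i, kept in enumerate(bank):
            # Compatible if the two sub-tables agree wherever both are specified.
            if all(a is None or b is None or a == b for a, b in zip(sub, kept)):
                # Resolve the stored sub-table's don't cares where this slot is specified.
                bank[i] = [b if b is not None else a for a, b in zip(sub, kept)]
                index.append(i)
                break
        else:
            bank.append(sub)
            index.append(len(bank) - 1)
    return bank, index


# Toy usage: a 16-entry table in which only addresses 0-7 were observed.
full = [addr // 4 for addr in range(16)]
seen = set(range(8))
bank, index = compress_lut(full, seen, sub_size=4)
print(len(bank), "unique sub-tables instead of", len(full) // 4)  # 2 instead of 4
```

In the toy run, the two sub-tables that cover only unobserved addresses collapse onto an existing one, so the bank shrinks from four sub-tables to two; the paper applies the same freedom inside a more sophisticated decomposition.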
Related papers
- Fast Controlled Generation from Language Models with Adaptive Weighted Rejection Sampling [90.86991492288487]
Evaluating the constraint on every token can be prohibitively expensive.
Locally constrained decoding (LCD) can distort the global distribution over strings by sampling tokens based only on local information.
We show that our approach is superior to state-of-the-art baselines.
arXiv Detail & Related papers (2025-04-07T18:30:18Z) - NeuraLUT-Assemble: Hardware-aware Assembling of Sub-Neural Networks for Efficient LUT Inference [2.7086888205833968]
Efficient neural networks (NNs) leveraging lookup tables (LUTs) have demonstrated significant potential for emerging AI applications.
Existing LUT-based designs suffer accuracy degradation because the large fan-in that neurons require is limited by the exponential scaling of LUT resources with input width.
We present NeuraLUT-Assemble, a novel framework that addresses these limitations by combining mixed-precision techniques with the assembly of larger neurons from smaller units.
arXiv Detail & Related papers (2025-04-01T09:52:38Z) - ReFill: Reinforcement Learning for Fill-In Minimization [3.9134031118910264]
Minimizing fill-in is NP-hard, and existing heuristics like Minimum Degree and Nested Dissection offer limited adaptability.
We introduce ReFill, a reinforcement learning framework enhanced by Graph Neural Networks (GNNs) to learn adaptive ordering strategies for fill-in minimization.
arXiv Detail & Related papers (2025-01-27T15:20:41Z) - CORE: Common Random Reconstruction for Distributed Optimization with Provable Low Communication Complexity [110.50364486645852]
Communication complexity has become a major bottleneck for speeding up training and scaling up the number of machines.
We propose Common Random Reconstruction (CORE), which can be used to compress the information transmitted between machines (a generic sketch of this shared-randomness idea appears after the list).
arXiv Detail & Related papers (2023-09-23T08:45:27Z) - Improving Probabilistic Bisimulation for MDPs Using Machine Learning [0.0]
We propose a new technique to partition the state space of a given model into its probabilistic bisimulation classes.
The approach can significantly decrease the running time compared to state-of-the-art tools.
arXiv Detail & Related papers (2023-07-30T12:58:12Z) - Efficient Graph Laplacian Estimation by Proximal Newton [12.05527862797306]
A graph learning problem can be formulated as maximum likelihood estimation (MLE) of the precision matrix (the standard objective is written out after the list).
We develop a second-order approach to obtain an efficient solver utilizing several algorithmic features.
arXiv Detail & Related papers (2023-02-13T15:13:22Z) - Variable Bitrate Neural Fields [75.24672452527795]
We present a dictionary method for compressing feature grids, reducing their memory consumption by up to 100x.
We formulate the dictionary optimization as a vector-quantized auto-decoder problem which lets us learn end-to-end discrete neural representations in a space where no direct supervision is available.
arXiv Detail & Related papers (2022-06-15T17:58:34Z) - Implicit Parameter-free Online Learning with Truncated Linear Models [51.71216912089413]
Parameter-free algorithms are online learning algorithms that do not require setting learning rates.
We propose new parameter-free algorithms that can take advantage of truncated linear models through a new update that has an "implicit" flavor.
Based on a novel decomposition of the regret, the new update is efficient, requires only one gradient at each step, never overshoots the minimum of the truncated model, and retains the favorable parameter-free properties.
arXiv Detail & Related papers (2022-03-19T13:39:49Z) - Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation, so it can be integrated with neural networks seamlessly.
arXiv Detail & Related papers (2021-12-07T11:26:41Z) - An Information Theory-inspired Strategy for Automatic Network Pruning [88.51235160841377]
Deep convolutional neural networks typically need to be compressed for deployment on devices with resource constraints.
Most existing network pruning methods require laborious human effort and prohibitive computational resources.
We propose an information theory-inspired strategy for automatic model compression.
arXiv Detail & Related papers (2021-08-19T07:03:22Z) - Solving weakly supervised regression problem using low-rank manifold regularization [77.34726150561087]
We solve a weakly supervised regression problem.
Under "weakly" we understand that for some training points the labels are known, for some unknown, and for others uncertain due to the presence of random noise or other reasons such as lack of resources.
In the numerical section, we applied the suggested method to artificial and real datasets using Monte-Carlo modeling.
arXiv Detail & Related papers (2021-04-13T23:21:01Z) - Symplectic Adjoint Method for Exact Gradient of Neural ODE with Minimal Memory [7.1975923901054575]
The backpropagation algorithm requires a memory footprint proportional to the number of network uses times the network size.
In contrast, the adjoint method obtains a gradient by numerical integration backward in time with a minimal memory footprint.
This study proposes the symplectic adjoint method, which obtains the exact gradient with a footprint proportional to the number of uses plus the network size (this comparison is written out after the list).
arXiv Detail & Related papers (2021-02-19T05:47:14Z)
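For the CORE entry above, the following is a generic sketch of why shared randomness can cut communication: if both machines derive the same random projection from a common seed, only the low-dimensional coefficients need to travel. The function names, the Gaussian projection, and the scaling are assumptions made for illustration; the paper's actual construction and guarantees are not reproduced here.

```python
# Generic shared-seed compression sketch (assumed construction, not CORE itself).
import numpy as np

def compress(vec, seed, m):
    rng = np.random.default_rng(seed)
    proj = rng.standard_normal((m, vec.size))  # random directions both sides can rebuild
    return proj @ vec                          # only m coefficients are transmitted

def reconstruct(coeffs, seed, dim):
    rng = np.random.default_rng(seed)
    proj = rng.standard_normal((coeffs.size, dim))  # same seed -> same directions
    # E[proj.T @ proj] = m * I, so dividing by m gives an unbiased estimate.
    return proj.T @ coeffs / coeffs.size

g = np.random.default_rng(0).standard_normal(1000)   # e.g. a gradient to communicate
sent = compress(g, seed=42, m=100)                   # 10x fewer numbers on the wire
approx = reconstruct(sent, seed=42, dim=g.size)
print(float(np.corrcoef(g, approx)[0, 1]))           # correlation with the original vector
```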
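For the graph Laplacian entry, the standard Laplacian-constrained MLE objective that such formulations build on is written out below. The notation is introduced here for illustration (S is the sample covariance, \mathcal{L} the set of weighted graph Laplacians, and \det^{*} the pseudo-determinant), and any regularization terms used in the paper are omitted.

```latex
% Notation assumed for illustration: S = sample covariance,
% \mathcal{L} = set of weighted graph Laplacians, {\det}^{*} = pseudo-determinant.
\begin{equation*}
  \hat{L} \;=\; \operatorname*{arg\,min}_{L \in \mathcal{L}}
        \; \operatorname{tr}(L S) \;-\; \log {\det}^{*}(L)
\end{equation*}
```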
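Finally, the memory comparison stated in the symplectic adjoint entry can be written out. N denotes the number of times the network is used during the integration and L the network size; both symbols are introduced here, and constants and checkpointing variants are ignored.

```latex
% N = number of network uses during integration, L = network size (notation assumed here).
\begin{align*}
  \text{backpropagation through the solver:} \quad & M \;\propto\; N \cdot L \\
  \text{adjoint method (backward numerical integration):} \quad & M \ \text{minimal, independent of } N \\
  \text{symplectic adjoint method (exact gradient):} \quad & M \;\propto\; N + L
\end{align*}
```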