End-to-end fully-binarized network design: from Generic Learned Thermometer to Block Pruning
- URL: http://arxiv.org/abs/2505.13462v1
- Date: Mon, 05 May 2025 13:50:29 GMT
- Title: End-to-end fully-binarized network design: from Generic Learned Thermometer to Block Pruning
- Authors: Thien Nguyen, William Guicquero
- Abstract summary: This article introduces the Generic Learned Thermometer (GLT), an encoding technique to improve input data representation for Binary Neural Networks (BNNs). We show that GLT brings versatility to the BNN by intrinsically performing global tone mapping, enabling significant accuracy gains in practice.
- Score: 8.28720658988688
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing works on Binary Neural Networks (BNNs) mainly focus on the model's weights and activations, while discarding considerations on the raw input data. This article introduces the Generic Learned Thermometer (GLT), an encoding technique that improves input data representation for BNNs by learning non-linear quantization thresholds. The technique consists of multiple data binarizations which can advantageously replace a conventional Analog-to-Digital Conversion (ADC) based on natural binary coding. Additionally, we jointly propose a compact topology with lightweight grouped convolutions, trained through block pruning and Knowledge Distillation (KD), aiming at further reducing the model size as well as its computational complexity. We show that GLT brings versatility to the BNN by intrinsically performing global tone mapping, enabling significant accuracy gains in practice (demonstrated by simulations on the STL-10 and VWW datasets). Moreover, when combining GLT with our proposed block-pruning technique, we achieve lightweight (under 1 Mb), fully-binarized models with limited accuracy degradation, suitable for in-sensor, always-on inference use cases.
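For intuition, here is a minimal sketch of thermometer encoding with non-uniform thresholds, in the spirit of GLT; the function name, the gamma-shaped stand-in for trained thresholds, and all parameters are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def thermometer_encode(x, thresholds):
    """Encode scalar intensities as thermometer codes: bit k is 1 iff
    x > thresholds[k].  With learned, non-uniform thresholds this acts
    as a global tone mapping, unlike the natural binary coding of a
    conventional ADC."""
    # x: (N,) array in [0, 1); thresholds: (K,) sorted array
    return (x[:, None] > thresholds[None, :]).astype(np.float32)

# Uniform thresholds mimic a plain quantizer...
uniform = np.linspace(0.0, 1.0, 9)[1:-1]   # 7 interior thresholds
# ...while non-linear (gamma-shaped) thresholds stand in for trained ones.
learned = uniform ** 2.2

pixels = np.array([0.05, 0.3, 0.7, 0.95])
print(thermometer_encode(pixels, uniform))
print(thermometer_encode(pixels, learned))
```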
Related papers
- BHViT: Binarized Hybrid Vision Transformer [53.38894971164072]
Model binarization has made significant progress in enabling real-time and energy-efficient computation for convolutional neural networks (CNNs). We propose BHViT, a binarization-friendly hybrid ViT architecture, and its fully binarized model, guided by three important observations. Our proposed algorithm achieves SOTA performance among binary ViT methods.
arXiv Detail & Related papers (2025-03-04T08:35:01Z)
- BiDense: Binarization for Dense Prediction [62.70804353158387]
BiDense is a generalized binary neural network (BNN) designed for efficient and accurate dense prediction tasks.
BiDense incorporates two key techniques: the Distribution-adaptive Binarizer (DAB) and the Channel-adaptive Full-precision Bypass (CFB).
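As a rough illustration of a distribution-adaptive binarizer, the sketch below re-centers activations on per-channel statistics before taking the sign; this is a simplified reading, not the exact DAB formulation.

```python
import numpy as np

def distribution_adaptive_binarize(x):
    """Re-center each channel on its mean before taking the sign, so the
    binarization threshold tracks the activation distribution instead of
    being fixed at zero (simplified; not the exact DAB formulation)."""
    mu = x.mean(axis=(0, 2, 3), keepdims=True)   # per-channel statistics, NCHW
    return np.where(x - mu >= 0, 1.0, -1.0)

x = np.random.randn(2, 8, 4, 4) + 0.5            # activations with a shifted mean
print(np.unique(distribution_adaptive_binarize(x)))  # [-1.  1.]
```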
arXiv Detail & Related papers (2024-11-15T16:46:04Z)
- TCCT-Net: Two-Stream Network Architecture for Fast and Efficient Engagement Estimation via Behavioral Feature Signals [58.865901821451295]
We present a novel two-stream feature fusion "Tensor-Convolution and Convolution-Transformer Network" (TCCT-Net) architecture.
To better learn the meaningful patterns in the temporal-spatial domain, we design a "CT" stream that integrates a hybrid convolutional-transformer.
In parallel, to efficiently extract rich patterns from the temporal-frequency domain, we introduce a "TC" stream that uses Continuous Wavelet Transform (CWT) to represent information in a 2D tensor form.
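A minimal sketch of how a 1D behavioral signal can be lifted into a 2D tensor via a CWT, assuming a real Morlet wavelet and hand-picked scales; the TC stream's actual wavelet and preprocessing may differ.

```python
import numpy as np

def morlet(t, scale, w0=5.0):
    """Real-valued Morlet wavelet sampled at times t for a given scale."""
    u = t / scale
    return np.exp(-0.5 * u**2) * np.cos(w0 * u) / np.sqrt(scale)

def cwt_scalogram(signal, scales, fs=1.0):
    """CWT as a 2D (scale x time) tensor: each row is the signal
    convolved with a differently scaled wavelet."""
    n = len(signal)
    t = (np.arange(n) - n // 2) / fs
    out = np.empty((len(scales), n))
    for i, s in enumerate(scales):
        out[i] = np.convolve(signal, morlet(t, s), mode="same")
    return out

fs = 128.0
t = np.arange(0, 2, 1 / fs)
sig = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 20 * t)
scalogram = cwt_scalogram(sig, scales=np.geomspace(1, 32, 24), fs=fs)
print(scalogram.shape)   # (24, 256): ready for a 2D convolutional stream
```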
arXiv Detail & Related papers (2024-04-15T06:01:48Z)
- Algorithm-Hardware Co-Design of Distribution-Aware Logarithmic-Posit Encodings for Efficient DNN Inference [4.093167352780157]
We introduce Logarithmic Posits (LP), an adaptive, hardware-friendly data type inspired by posits.
We also develop a novel genetic-algorithm-based framework, LP Quantization (LPQ), to find optimal layer-wise LP parameters.
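Posit arithmetic itself is involved; as a toy stand-in, the sketch below snaps weights to signed powers of two to show why log-spaced levels fit bell-shaped weight distributions. It is not the paper's LP format.

```python
import numpy as np

def log2_quantize(w, exp_min=-6, exp_max=0):
    """Snap weights to signed powers of two: fine resolution near zero,
    coarse in the tails, matching bell-shaped weight distributions."""
    sign = np.where(w >= 0, 1.0, -1.0)
    mag = np.clip(np.abs(w), 2.0**exp_min, 2.0**exp_max)
    exp = np.round(np.log2(mag))
    return sign * 2.0**exp

w = np.random.randn(5).astype(np.float32) * 0.25
print(w)
print(log2_quantize(w))
```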
arXiv Detail & Related papers (2024-03-08T17:28:49Z) - GSB: Group Superposition Binarization for Vision Transformer with
Limited Training Samples [46.025105938192624]
Vision Transformer (ViT) has performed remarkably in various computer vision tasks.
ViT usually suffers from serious overfitting problems with a relatively limited number of training samples.
We propose a novel model binarization technique, called Group Superposition Binarization (GSB).
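A generic multi-basis binarization sketch in the spirit of superposition: approximate a real tensor by a sum of scaled binary tensors via greedy residual fitting. GSB's grouping details are omitted; all names here are illustrative.

```python
import numpy as np

def superposed_binarize(W, n_bases=3):
    """Greedy residual fitting: W ~ sum_i alpha_i * B_i, B_i in {-1, +1}.
    alpha = mean(|residual|) is the least-squares optimal scale for
    B = sign(residual)."""
    residual = W.copy()
    alphas, bases = [], []
    for _ in range(n_bases):
        B = np.where(residual >= 0, 1.0, -1.0)
        alpha = np.abs(residual).mean()
        alphas.append(alpha)
        bases.append(B)
        residual = residual - alpha * B
    return alphas, bases

W = np.random.randn(4, 4)
alphas, bases = superposed_binarize(W)
W_hat = sum(a * B for a, B in zip(alphas, bases))
print(np.abs(W - W_hat).mean())   # approximation error shrinks as n_bases grows
```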
arXiv Detail & Related papers (2023-05-13T14:48:09Z) - Compacting Binary Neural Networks by Sparse Kernel Selection [58.84313343190488]
This paper is motivated by a previously revealed phenomenon that the binary kernels in successful BNNs are nearly power-law distributed.
We develop the Permutation Straight-Through Estimator (PSTE) that is able to not only optimize the selection process end-to-end but also maintain the non-repetitive occupancy of selected codewords.
Experiments verify that our method reduces both the model size and bit-wise computational costs, and achieves accuracy improvements compared with state-of-the-art BNNs under comparable budgets.
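The sketch below shows the core straight-through trick for selecting the nearest binary codeword per kernel while keeping training end-to-end; the permutation machinery that PSTE adds to keep codewords non-repetitive is omitted.

```python
import torch

def select_codeword(w, codebook):
    """Forward: replace each kernel by its nearest binary codeword.
    Backward: gradients flow to w as if selection were the identity
    (straight-through estimator)."""
    dists = torch.cdist(w, codebook)             # (n, c) pairwise distances
    hard = codebook[dists.argmin(dim=1)]         # nearest codeword per kernel
    return w + (hard - w).detach()

codebook = torch.tensor([[1., 1.], [1., -1.], [-1., 1.], [-1., -1.]])
w = torch.randn(3, 2, requires_grad=True)
out = select_codeword(w, codebook)
out.sum().backward()
print(out)       # rows drawn from the codebook
print(w.grad)    # all ones: identity gradient from the STE
```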
arXiv Detail & Related papers (2023-03-25T13:53:02Z) - Signed Binary Weight Networks [17.07866119979333]
Two important algorithmic techniques have shown promise for enabling efficient inference: sparsity and binarization.
We propose a new method called signed-binary networks to improve efficiency further.
Our method achieves accuracy comparable to binary networks on the ImageNet and CIFAR10 datasets and can lead to 69% sparsity.
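A simplified reading of the idea, as a sketch: each weight group commits to a single sign, so weights land in {0, +1} or {0, -1}, combining binarization with exact zeros (sparsity). The grouping and training details below are assumptions.

```python
import numpy as np

def signed_binarize(W, group_size=4):
    """Each group of weights commits to one sign; entries that disagree
    with their group's sign become exact zeros.  Values land in
    {0, +1} or {0, -1}, giving binarization plus sparsity."""
    flat = W.reshape(-1, group_size)
    sign = np.where(flat.sum(axis=1, keepdims=True) >= 0, 1.0, -1.0)
    q = np.where(flat * sign > 0, sign, 0.0)
    return q.reshape(W.shape)

W = np.random.randn(2, 8)
Q = signed_binarize(W)
print(Q)
print("sparsity:", (Q == 0.0).mean())
```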
arXiv Detail & Related papers (2022-11-25T00:19:21Z) - Deep Architecture Connectivity Matters for Its Convergence: A
Fine-Grained Analysis [94.64007376939735]
We theoretically characterize the impact of connectivity patterns on the convergence of deep neural networks (DNNs) under gradient descent training.
We show that by a simple filtration on "unpromising" connectivity patterns, we can trim down the number of models to evaluate.
arXiv Detail & Related papers (2022-05-11T17:43:54Z) - Pretraining Graph Neural Networks for few-shot Analog Circuit Modeling
and Design [68.1682448368636]
We present a supervised pretraining approach to learn circuit representations that can be adapted to new unseen topologies or unseen prediction tasks.
To cope with the variable topological structure of different circuits, we describe each circuit as a graph and use graph neural networks (GNNs) to learn node embeddings.
We show that pretraining GNNs on prediction of output node voltages can encourage learning representations that can be adapted to new unseen topologies or prediction of new circuit level properties.
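A minimal message-passing sketch, assuming mean aggregation over a netlist adjacency matrix; the paper's GNN architecture and feature encodings are not specified here, so every name below is illustrative.

```python
import numpy as np

def gnn_node_embeddings(A, X, weights):
    """Mean-aggregation message passing: each layer averages neighbor
    features and applies a learned transform, yielding per-node
    embeddings (e.g., for a head that predicts node voltages)."""
    deg = np.maximum(A.sum(axis=1, keepdims=True), 1.0)
    H = X
    for W in weights:
        H = np.tanh((A @ H / deg) @ W)
    return H

rng = np.random.default_rng(0)
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 1],
              [0, 1, 1, 0]], dtype=float)        # tiny 4-node netlist graph
X = rng.normal(size=(4, 8))                      # initial node features
weights = [rng.normal(size=(8, 8)) * 0.3 for _ in range(2)]
print(gnn_node_embeddings(A, X, weights).shape)  # (4, 8) node embeddings
```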
arXiv Detail & Related papers (2022-03-29T21:18:47Z) - Physics-aware deep neural networks for surrogate modeling of turbulent
natural convection [0.0]
We investigate the use of PINN surrogate modeling for turbulent Rayleigh-Bénard convection flows.
We show how it comes into play as a regularization close to the training boundaries, which are zones of poor accuracy for standard PINNs.
The predictive accuracy of the surrogate over the entire half-billion DNS coordinates yields errors for all flow variables ranging between 0.3% and 4% in the relative L2 norm.
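To make the regularization concrete, here is a physics-informed residual loss for the 1D heat equation as a stand-in for the Boussinesq system governing Rayleigh-Bénard convection; autograd supplies the PDE derivatives that get added to the data loss.

```python
import torch

def pinn_residual_loss(model, x, t, alpha=0.1):
    """Squared residual of the 1D heat equation u_t = alpha * u_xx,
    with all derivatives obtained via autograd."""
    x = x.requires_grad_(True)
    t = t.requires_grad_(True)
    u = model(torch.stack([x, t], dim=-1))
    u_t = torch.autograd.grad(u.sum(), t, create_graph=True)[0]
    u_x = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x.sum(), x, create_graph=True)[0]
    return ((u_t - alpha * u_xx) ** 2).mean()

model = torch.nn.Sequential(torch.nn.Linear(2, 32), torch.nn.Tanh(),
                            torch.nn.Linear(32, 1))
x, t = torch.rand(64), torch.rand(64)
# In training, this residual is added to the data loss as a physics term.
print(pinn_residual_loss(model, x, t))
```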
arXiv Detail & Related papers (2021-03-05T09:48:57Z) - BiSNN: Training Spiking Neural Networks with Binary Weights via Bayesian
Learning [37.376989855065545]
Spiking Neural Networks (SNNs) are biologically inspired, dynamic, event-driven models that enhance energy efficiency.
An SNN model is introduced that combines the benefits of temporally sparse binary activations and binary weights.
Experiments quantify the performance loss with respect to full-precision implementations.
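A sketch of the inference-time combination BiSNN targets: a leaky integrate-and-fire step where inputs are binary spike events and weights are {-1, +1}, so synaptic integration reduces to signed accumulation. The Bayesian training rule itself is not shown.

```python
import numpy as np

def lif_binary_step(v, spikes_in, W_bin, tau=0.9, v_th=1.0):
    """One leaky integrate-and-fire step: binary input spikes times
    {-1, +1} weights reduce synaptic integration to signed accumulation;
    neurons that cross threshold emit a spike and reset."""
    v = tau * v + spikes_in @ W_bin
    spikes_out = (v >= v_th).astype(np.float64)
    v = v * (1.0 - spikes_out)
    return v, spikes_out

rng = np.random.default_rng(1)
W_bin = np.where(rng.normal(size=(16, 8)) >= 0, 1.0, -1.0)   # binary weights
v = np.zeros(8)                                              # membrane potentials
for _ in range(5):
    spikes_in = (rng.random(16) < 0.2).astype(np.float64)    # sparse input events
    v, out = lif_binary_step(v, spikes_in, W_bin)
print(out)
```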
arXiv Detail & Related papers (2020-12-15T14:06:36Z)
- Compressing deep neural networks on FPGAs to binary and ternary precision with HLS4ML [13.325670094073383]
We present the implementation of binary and ternary neural networks in the hls4ml library.
We discuss the trade-off between model accuracy and resource consumption.
The binary and ternary implementations achieve performance similar to the higher-precision implementation while using drastically fewer FPGA resources.
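For reference, a generic ternary quantizer of the kind such FPGA flows consume: weights map to {-1, 0, +1}, so multiplications become sign flips and zero-skips. This is not hls4ml's own API or training recipe.

```python
import numpy as np

def ternarize(W, thresh_ratio=0.5):
    """Map weights to {-1, 0, +1} using a magnitude threshold, so FPGA
    multiplications reduce to sign flips and zero-skips."""
    delta = thresh_ratio * np.abs(W).mean()
    return np.where(W > delta, 1.0, np.where(W < -delta, -1.0, 0.0))

W = np.random.randn(4, 4)
print(ternarize(W))
```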
arXiv Detail & Related papers (2020-03-11T10:46:51Z)