RepQ: Generalizing Quantization-Aware Training for Re-Parametrized
Architectures
- URL: http://arxiv.org/abs/2311.05317v1
- Date: Thu, 9 Nov 2023 12:25:39 GMT
- Title: RepQ: Generalizing Quantization-Aware Training for Re-Parametrized
Architectures
- Authors: Anastasiia Prutianova, Alexey Zaytsev, Chung-Kuei Lee, Fengyu Sun,
Ivan Koryakovskiy
- Abstract summary: We propose a novel approach called RepQ, which applies quantization to re-parametrized networks.
Our method is based on the insight that the test-stage weights of an arbitrary re-parametrized layer can be expressed as a differentiable function of trainable parameters.
RepQ generalizes well to various re-parametrized models and outperforms the baseline LSQ quantization scheme in all experiments.
- Score: 3.797846371838652
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing neural networks are memory-consuming and computationally intensive,
making them challenging to deploy in resource-constrained environments.
However, there are various methods to improve their efficiency. Two such
methods are quantization, a well-known approach for network compression, and
re-parametrization, an emerging technique designed to improve model
performance. Although both techniques have been studied individually, there has
been limited research on their simultaneous application. To address this gap,
we propose a novel approach called RepQ, which applies quantization to
re-parametrized networks. Our method is based on the insight that the test-stage
weights of an arbitrary re-parametrized layer can be expressed as a
differentiable function of trainable parameters. We enable quantization-aware
training by applying quantization on top of this function. RepQ generalizes
well to various re-parametrized models and outperforms the baseline LSQ
quantization scheme in all experiments.
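To make the core idea concrete, here is a minimal PyTorch sketch of quantization-aware training on top of a re-parametrized layer. It assumes a RepVGG-style block with a 3x3 and a 1x1 branch and a simple symmetric fake quantizer with a straight-through estimator; the paper's actual scheme (e.g., LSQ-style learned step sizes and batch-norm handling) is more elaborate, and the names below are ours.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fake_quant(w, bits=8):
    """Symmetric per-tensor fake quantization with a straight-through estimator."""
    qmax = 2 ** (bits - 1) - 1
    scale = w.detach().abs().max() / qmax + 1e-8
    w_q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale
    return w + (w_q - w).detach()  # forward uses w_q, the gradient flows to w

class RepQConvBlock(nn.Module):
    """Two trainable branches (3x3 and 1x1 convolutions). The merged test-time kernel
    is a differentiable function of both, and quantization is applied on top of it."""
    def __init__(self, cin, cout, bits=8):
        super().__init__()
        self.w3 = nn.Parameter(torch.randn(cout, cin, 3, 3) * 0.1)
        self.w1 = nn.Parameter(torch.randn(cout, cin, 1, 1) * 0.1)
        self.bits = bits

    def merged_weight(self):
        # Differentiable merge: pad the 1x1 kernel to 3x3 and add it to the 3x3 kernel.
        return self.w3 + F.pad(self.w1, [1, 1, 1, 1])

    def forward(self, x):
        w = fake_quant(self.merged_weight(), self.bits)  # quantize the merged kernel
        return F.conv2d(x, w, padding=1)

block = RepQConvBlock(8, 16)
out = block(torch.randn(2, 8, 16, 16))
out.mean().backward()                  # gradients reach both branch weights
print(out.shape, block.w1.grad.abs().sum() > 0)
```

At test time, `merged_weight()` can be evaluated once and quantized, so inference reduces to a single quantized 3x3 convolution.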
Related papers
- Transferable Post-training via Inverse Value Learning [83.75002867411263]
We propose modeling changes at the logits level during post-training using a separate neural network (i.e., the value network).
After training this network on a small base model using demonstrations, this network can be seamlessly integrated with other pre-trained models during inference.
We demonstrate that the resulting value network has broad transferability across pre-trained models of different parameter sizes.
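As a rough, hedged illustration of operating at the logits level (toy modules and dimensions of our choosing, not the paper's architecture), a small value network can produce a logit offset that is added to a frozen base model's output at inference time:

```python
import torch
import torch.nn as nn

vocab, dim = 100, 32
base_model = nn.Linear(dim, vocab)     # stands in for a frozen pre-trained model's output head
value_net = nn.Linear(dim, vocab)      # small network trained to encode the post-training change
for p in base_model.parameters():
    p.requires_grad_(False)            # the base model stays frozen

h = torch.randn(4, dim)                # hidden states for 4 tokens (toy data)
logits = base_model(h) + value_net(h)  # base predictions shifted by the learned logit delta
print(logits.shape)                    # torch.Size([4, 100])
```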
arXiv Detail & Related papers (2024-10-28T13:48:43Z)
- Neural Networks Trained by Weight Permutation are Universal Approximators [4.642647756403863]
We show that a permutation-based training method can guide a ReLU network to approximate one-dimensional continuous functions.
Notable observations made during weight permutation suggest that permutation training can provide an innovative tool for describing network learning behavior.
arXiv Detail & Related papers (2024-07-01T07:33:00Z)
- RefQSR: Reference-based Quantization for Image Super-Resolution Networks [14.428652358882978]
Single image super-resolution (SISR) aims to reconstruct a high-resolution image from its low-resolution observation.
Deep learning-based SISR models show high performance at the expense of increased computational costs.
We introduce a novel method called RefQSR that applies high-bit quantization to several representative patches and uses them as references for low-bit quantization of the rest of the patches in an image.
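A heavily hedged sketch of the reference idea, under our own simplifying assumptions (uniform quantizers, a precomputed patch clustering, and reuse of the representative patch's dynamic range); how RefQSR actually derives and shares the reference information is more involved.

```python
import torch

def uniform_quantize(x, bits, max_val=None):
    # Uniform quantization to 2**bits levels over [0, max_val].
    levels = 2 ** bits - 1
    max_val = x.max() if max_val is None else max_val
    step = max_val / levels
    return torch.clamp(torch.round(x / step), 0, levels) * step

patches = torch.rand(16, 3, 8, 8)            # toy image patches in [0, 1]
groups = torch.arange(16) % 4                # assume patches were clustered into 4 groups
quantized = torch.empty_like(patches)
for g in range(4):
    idx = (groups == g).nonzero(as_tuple=True)[0]
    ref = patches[idx[0]]                                 # representative patch of the group
    quantized[idx[0]] = uniform_quantize(ref, bits=8)     # high-bit reference quantization
    for i in idx[1:]:
        # Reuse the reference patch's dynamic range for low-bit quantization of the rest.
        quantized[i] = uniform_quantize(patches[i], bits=4, max_val=ref.max())
print((quantized - patches).abs().mean().item())
```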
arXiv Detail & Related papers (2024-04-02T06:49:38Z)
- Quantification using Permutation-Invariant Networks based on Histograms [47.47360392729245]
Quantification is the supervised learning task in which a model is trained to predict the prevalence of each class in a given bag of examples.
This paper investigates the application of deep neural networks to quantification tasks in scenarios where it is possible to apply a symmetric supervised approach.
We propose HistNetQ, a novel neural architecture that relies on a permutation-invariant representation based on histograms.
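A hedged sketch of a histogram-based, permutation-invariant bag representation. The hard histogram used here is not differentiable and the quantifier head is an arbitrary toy choice; HistNetQ's actual architecture differs.

```python
import torch
import torch.nn as nn

def bag_histogram(bag, n_bins=8):
    """bag: (n_examples, n_features) with values in [0, 1]. Returns flattened,
    normalized per-feature histograms, invariant to the order of examples."""
    hists = [torch.histc(bag[:, j], bins=n_bins, min=0.0, max=1.0)
             for j in range(bag.shape[1])]
    return (torch.stack(hists) / bag.shape[0]).flatten()

class HistQuantifier(nn.Module):
    """Maps the histogram representation of a bag to class-prevalence estimates."""
    def __init__(self, n_features, n_bins, n_classes):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(n_features * n_bins, 64), nn.ReLU(),
            nn.Linear(64, n_classes), nn.Softmax(dim=-1))

    def forward(self, bag):
        return self.head(bag_histogram(bag))

bag = torch.rand(500, 4)                 # a bag of 500 examples with 4 features each
model = HistQuantifier(n_features=4, n_bins=8, n_classes=3)
print(model(bag))                        # estimated class prevalences (sum to 1)
```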
arXiv Detail & Related papers (2024-03-22T11:25:38Z)
- Alternate Training of Shared and Task-Specific Parameters for Multi-Task Neural Networks [49.1574468325115]
This paper introduces novel alternate training procedures for hard-parameter-sharing Multi-Task Neural Networks (MTNNs).
The proposed alternate training method updates shared and task-specific weights alternately, exploiting the multi-head architecture of the model.
Empirical experiments demonstrate delayed overfitting, improved prediction, and reduced computational demands.
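A minimal sketch of the alternating update scheme described above, assuming a hard-parameter-sharing model with one shared trunk and two task heads; the model, data, and schedule are illustrative rather than the paper's exact setup.

```python
import torch
import torch.nn as nn

trunk = nn.Sequential(nn.Linear(16, 32), nn.ReLU())          # shared parameters
heads = nn.ModuleList([nn.Linear(32, 1), nn.Linear(32, 1)])  # task-specific parameters

opt_shared = torch.optim.SGD(trunk.parameters(), lr=1e-2)
opt_tasks = torch.optim.SGD(heads.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

x = torch.randn(64, 16)
targets = [torch.randn(64, 1), torch.randn(64, 1)]

for step in range(100):
    feats = trunk(x)
    loss = sum(loss_fn(head(feats), t) for head, t in zip(heads, targets))
    opt_shared.zero_grad(); opt_tasks.zero_grad()
    loss.backward()
    # Alternate: even steps update only the shared trunk, odd steps only the task heads.
    (opt_shared if step % 2 == 0 else opt_tasks).step()
```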
arXiv Detail & Related papers (2023-12-26T21:33:03Z)
- Weight Re-Mapping for Variational Quantum Algorithms [54.854986762287126]
We introduce the concept of weight re-mapping for variational quantum circuits (VQCs).
We employ seven distinct weight re-mapping functions to assess their impact on eight classification datasets.
Our results indicate that weight re-mapping can enhance the convergence speed of the VQC.
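A hedged sketch of the re-mapping idea on a single analytically simulated qubit: an unconstrained trainable weight is passed through a re-mapping function before being used as a rotation angle. The tanh-based mapping here is an illustrative choice, not necessarily one of the paper's seven functions.

```python
import math
import torch

def remap(w):
    # Illustrative re-mapping: squash the unconstrained weight into (-pi, pi).
    return math.pi * torch.tanh(w)

w = torch.randn(1, requires_grad=True)   # unconstrained trainable circuit weight
theta = remap(w)                         # angle actually used by the circuit
expect_z = torch.cos(theta)              # <Z> after RY(theta) on |0>, computed analytically
loss = (expect_z - (-1.0)) ** 2          # push the qubit toward |1>
loss.backward()                          # the gradient flows through the re-mapping
print(w.grad)
```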
arXiv Detail & Related papers (2023-06-09T09:42:21Z)
- Reparameterization through Spatial Gradient Scaling [69.27487006953852]
Reparameterization aims to improve the generalization of deep neural networks by transforming convolutional layers into equivalent multi-branched structures during training.
We present a novel spatial gradient scaling method to redistribute learning focus among weights in convolutional networks.
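A hedged sketch of spatial gradient scaling via a tensor hook: gradients of a 3x3 convolution kernel are rescaled per spatial position during the backward pass. The fixed scaling map below is an illustrative choice; the paper derives its scaling differently.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(8, 8, kernel_size=3, padding=1)

# Heavier scale at the kernel centre, lighter at the borders (illustrative values only).
scale = torch.tensor([[0.5, 0.5, 0.5],
                      [0.5, 1.5, 0.5],
                      [0.5, 0.5, 0.5]])
conv.weight.register_hook(lambda grad: grad * scale)   # broadcasts over (out, in, 3, 3)

out = conv(torch.randn(2, 8, 16, 16)).mean()
out.backward()                      # the hook rescales the kernel gradient spatially
print(conv.weight.grad.abs().mean())
```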
arXiv Detail & Related papers (2023-03-05T17:57:33Z)
- A simple approach for quantizing neural networks [7.056222499095849]
We propose a new method for quantizing the weights of a fully trained neural network.
A simple deterministic pre-processing step allows us to quantize network layers via memoryless scalar quantization.
The developed method also readily allows the quantization of deep networks by consecutive application to single layers.
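A hedged sketch of layer-by-layer memoryless scalar quantization: each weight is independently rounded to the nearest point of a uniform grid, one layer at a time. The paper's deterministic pre-processing step is not reproduced here, and the helper names are ours.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def quantize_layer(linear, bits=4):
    # Memoryless scalar quantization: round each weight to a uniform symmetric grid.
    w = linear.weight
    levels = 2 ** (bits - 1) - 1
    step = w.abs().max() / levels
    linear.weight.copy_(torch.round(w / step) * step)

net = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
for layer in net:
    if isinstance(layer, nn.Linear):
        quantize_layer(layer)        # consecutive application to single layers
print(torch.unique(net[0].weight).numel(), "distinct weight values after quantization")
```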
arXiv Detail & Related papers (2022-09-07T22:36:56Z)
- Learning Representations for CSI Adaptive Quantization and Feedback [51.14360605938647]
We propose an efficient method for adaptive quantization and feedback in frequency division duplexing systems.
Existing works mainly focus on the implementation of autoencoder (AE) neural networks for CSI compression.
We present two different methods: one based on post-training quantization and a second in which the codebook is found during the training of the AE.
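A hedged sketch of the first option, under toy dimensions of our choosing: train an autoencoder on CSI-like vectors, then apply post-training uniform quantization to its latent code before feedback (the learned-codebook variant is not shown).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

enc = nn.Sequential(nn.Linear(64, 16), nn.Tanh())   # CSI vector -> compact code in (-1, 1)
dec = nn.Linear(16, 64)

csi = torch.randn(32, 64)                           # toy CSI samples
opt = torch.optim.Adam([*enc.parameters(), *dec.parameters()], lr=1e-3)
for _ in range(200):                                # train the AE in full precision
    loss = F.mse_loss(dec(enc(csi)), csi)
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():                               # post-training quantization of the code
    code = enc(csi)
    levels = 2 ** 4 - 1                             # 4-bit uniform grid over (-1, 1)
    q_code = torch.round((code + 1) / 2 * levels) / levels * 2 - 1
    print(F.mse_loss(dec(q_code), csi).item())      # reconstruction error after quantized feedback
```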
arXiv Detail & Related papers (2022-07-13T08:52:13Z)
- Robust Quantization: One Model to Rule Them All [13.87610199914036]
We propose a method that provides intrinsic robustness to the model against a broad range of quantization processes.
Our method is motivated by theoretical arguments and enables us to store a single generic model capable of operating at various bit-widths and quantization policies.
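A hedged sketch of the deployment scenario this targets: a single set of full-precision weights is quantized post hoc at several bit-widths and the output deviation is compared. The paper's robustness-inducing training method itself is not reproduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def quantize(w, bits):
    # Symmetric uniform quantization of a weight tensor to the given bit-width.
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max() / qmax
    return torch.round(w / scale) * scale

model = nn.Linear(16, 4)
x = torch.randn(8, 16)
with torch.no_grad():
    full = model(x)                                  # full-precision reference output
    for bits in (8, 6, 4, 2):
        out = F.linear(x, quantize(model.weight, bits), model.bias)
        dev = (out - full).abs().mean().item()
        print(f"{bits}-bit weights, mean output deviation: {dev:.4f}")
```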
arXiv Detail & Related papers (2020-02-18T16:14:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.