HEMP: High-order Entropy Minimization for neural network comPression
- URL: http://arxiv.org/abs/2107.05298v1
- Date: Mon, 12 Jul 2021 10:17:53 GMT
- Title: HEMP: High-order Entropy Minimization for neural network comPression
- Authors: Enzo Tartaglione, Stéphane Lathuilière, Attilio Fiandrotti, Marco Cagnazzo, Marco Grangetto
- Abstract summary: We formulate the entropy of a quantized artificial neural network as a differentiable function that can be plugged as a regularization term into the cost function minimized by gradient descent.
We show that HEMP works in synergy with other approaches that prune or quantize the model itself, delivering significant storage savings without harming the model's performance.
- Score: 20.448617917261874
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We formulate the entropy of a quantized artificial neural network as a
differentiable function that can be plugged as a regularization term into the
cost function minimized by gradient descent. Our formulation scales efficiently
beyond the first order and is agnostic of the quantization scheme. The network
can then be trained to minimize the entropy of the quantized parameters, so
that they can be optimally compressed via entropy coding. We experiment with
our entropy formulation at quantizing and compressing well-known network
architectures over multiple datasets. Our approach compares favorably with
similar methods, enjoying the benefits of a higher-order entropy estimate,
flexibility towards non-uniform quantization (we use Lloyd-Max quantization),
scalability to any entropy order to be minimized, and efficiency in terms of
compression. We show that HEMP works in synergy with other approaches that
prune or quantize the model itself, delivering significant storage savings
without harming the model's performance.
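To make this concrete, here is a minimal sketch, assuming a PyTorch setting, of a first-order differentiable entropy regularizer: weights are softly assigned to a set of quantization centroids, bin probabilities are estimated from the soft assignments, and the resulting entropy is added to the task loss so that gradient descent favors compressible weight distributions. This is not the authors' code; HEMP scales beyond first order and uses Lloyd-Max levels, while the uniform centroid grid, the temperature, and names such as entropy_bits below are illustrative assumptions.

```python
import torch

def entropy_bits(weights: torch.Tensor, centroids: torch.Tensor,
                 temperature: float = 0.1) -> torch.Tensor:
    """Differentiable first-order entropy estimate (in bits) of `weights`
    quantized onto `centroids`, via soft assignments (illustrative sketch)."""
    w = weights.reshape(-1, 1)                             # (N, 1)
    d2 = (w - centroids.reshape(1, -1)) ** 2               # squared distance to each centroid
    soft_assign = torch.softmax(-d2 / temperature, dim=1)  # (N, K), rows sum to 1
    p = soft_assign.mean(dim=0)                            # soft bin probabilities
    return -(p * torch.log2(p + 1e-12)).sum()              # H(p) in bits

# Usage: add the regularizer to the task loss so that gradient descent pushes
# the weight distribution toward a few, hence compressible, bins.
weights = torch.randn(10_000, requires_grad=True)
centroids = torch.linspace(-2.0, 2.0, 16)   # stand-in for Lloyd-Max levels
task_loss = (weights ** 2).mean()           # placeholder task loss
loss = task_loss + 0.01 * entropy_bits(weights, centroids)
loss.backward()
```

After training, each weight can be hard-assigned to its nearest centroid and entropy-coded; the soft assignment exists only to make the regularizer differentiable.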
Related papers
- Compact Multi-Threshold Quantum Information Driven Ansatz For Strongly Interactive Lattice Spin Models [0.0]
We introduce a systematic procedure for ansatz building based on approximate Quantum Mutual Information (QMI).
Our approach generates a layered-structured ansatz, where each layer's qubit pairs are selected based on their QMI values, resulting in more efficient state preparation and optimization routines.
Our results show that the Multi-QIDA method reduces the computational complexity while maintaining high precision, making it a promising tool for quantum simulations in lattice spin models.
arXiv Detail & Related papers (2024-08-05T17:07:08Z)
- Deep Neural Networks as Variational Solutions for Correlated Open Quantum Systems [0.0]
We show that parametrizing the density matrix directly with more powerful models can yield better variational ansatz functions.
We present results for the dissipative one-dimensional transverse-field Ising model and a two-dimensional dissipative Heisenberg model.
arXiv Detail & Related papers (2024-01-25T13:41:34Z)
- Quantum Semidefinite Programming with Thermal Pure Quantum States [0.5639904484784125]
We show that a "quantization" of the matrix multiplicative-weight algorithm can provide approximate solutions to SDPs quadratically faster than the best classical algorithms.
We propose a modification of this quantum algorithm and show that a similar speedup can be obtained by replacing the Gibbs-state sampler with the preparation of thermal pure quantum (TPQ) states.
arXiv Detail & Related papers (2023-10-11T18:00:53Z)
- Quantization Aware Factorization for Deep Neural Network Compression [20.04951101799232]
Tensor decomposition of convolutional and fully-connected layers is an effective way to reduce parameters and FLOPs in neural networks.
A conventional post-training quantization approach applied to networks with decomposed weights yields a drop in accuracy.
This motivated us to develop an algorithm that finds a decomposed approximation directly with quantized factors (a toy sketch of the naive baseline follows this entry).
arXiv Detail & Related papers (2023-08-08T21:38:02Z)
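A toy sketch of the baseline the entry above argues against, assuming a simple SVD factorization: decompose a weight matrix first, then naively quantize the factors post-training, and observe the extra reconstruction error. This is my illustration, not the paper's algorithm; the rank, bit-width, and the helper quantize are arbitrary choices.

```python
import torch

def quantize(x: torch.Tensor, bits: int = 8) -> torch.Tensor:
    """Naive uniform symmetric post-training quantization."""
    scale = x.abs().max() / (2 ** (bits - 1) - 1)
    return torch.round(x / scale) * scale

W = torch.randn(256, 256)         # toy weight matrix
U, S, Vh = torch.linalg.svd(W)
rank = 64
A = U[:, :rank] * S[:rank]        # (256, rank) left factor
B = Vh[:rank, :]                  # (rank, 256) right factor

err_lowrank = ((W - A @ B).norm() / W.norm()).item()
err_quantized = ((W - quantize(A) @ quantize(B)).norm() / W.norm()).item()
print(f"low-rank error: {err_lowrank:.4f}, with quantized factors: {err_quantized:.4f}")
```

Finding the decomposition with quantized factors jointly, as the entry proposes, avoids stacking the second error source on top of the first.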
- Absence of barren plateaus and scaling of gradients in the energy optimization of isometric tensor network states [0.0]
We consider energy minimization problems for quantum many-body systems with extensive Hamiltonians and finite-range interactions.
We prove that variational optimization problems for matrix product states, tree tensor networks, and the multiscale entanglement renormalization ansatz are free of barren plateaus.
arXiv Detail & Related papers (2023-03-31T22:49:49Z)
- D4FT: A Deep Learning Approach to Kohn-Sham Density Functional Theory [79.50644650795012]
We propose a deep learning approach to solve Kohn-Sham Density Functional Theory (KS-DFT).
We prove that such an approach has the same expressivity as the SCF method, yet reduces the computational complexity.
In addition, we show that our approach enables us to explore more complex neural-based wave functions.
arXiv Detail & Related papers (2023-03-01T10:38:10Z)
- Entropic Neural Optimal Transport via Diffusion Processes [105.34822201378763]
We propose a novel neural algorithm for the fundamental problem of computing the entropic optimal transport (EOT) plan between continuous probability distributions.
Our algorithm is based on the saddle point reformulation of the dynamic version of EOT, which is known as the Schrödinger Bridge problem.
In contrast to the prior methods for large-scale EOT, our algorithm is end-to-end and consists of a single learning step.
arXiv Detail & Related papers (2022-11-02T14:35:13Z)
- Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation, so it can be integrated with neural networks seamlessly.
arXiv Detail & Related papers (2021-12-07T11:26:41Z)
- Nonuniform-to-Uniform Quantization: Towards Accurate Quantization via Generalized Straight-Through Estimation [48.838691414561694]
Nonuniform-to-Uniform Quantization (N2UQ) is a method that can maintain the strong representation ability of nonuniform methods while being hardware-friendly and efficient.
N2UQ outperforms state-of-the-art nonuniform quantization methods by 0.7-1.8% on ImageNet (a generic straight-through estimator sketch follows this entry).
arXiv Detail & Related papers (2021-11-29T18:59:55Z)
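For context, a minimal sketch of the classic straight-through estimator (STE), assuming PyTorch; N2UQ's generalized estimator is more elaborate, so this only illustrates the basic trick of rounding in the forward pass while passing gradients through unchanged. The grid scale of 4.0 is an arbitrary illustrative choice.

```python
import torch

def ste_round(x: torch.Tensor) -> torch.Tensor:
    """Round in the forward pass; identity gradient in the backward pass."""
    return x + (torch.round(x) - x).detach()

x = torch.randn(8, requires_grad=True)
y = ste_round(4.0 * x) / 4.0   # quantize onto a fractional grid
y.sum().backward()
print(x.grad)                  # all ones: the gradient passed straight through
```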
- Targeted free energy estimation via learned mappings [66.20146549150475]
Free energy perturbation (FEP) was proposed by Zwanzig more than six decades ago as a method to estimate free energy differences.
FEP suffers from a severe limitation: the requirement of sufficient overlap between distributions.
One strategy to mitigate this problem, called Targeted Free Energy Perturbation, uses a high-dimensional mapping in configuration space to increase overlap.
arXiv Detail & Related papers (2020-02-12T11:10:00Z)
- Supervised Learning for Non-Sequential Data: A Canonical Polyadic Decomposition Approach [85.12934750565971]
Efficient modelling of feature interactions underpins supervised learning for non-sequential tasks.
To alleviate the exponential cost of modelling every interaction explicitly, it has been proposed to implicitly represent the model parameters as a tensor.
For enhanced expressiveness, we generalize the framework to allow feature mapping to arbitrarily high-dimensional feature vectors.
arXiv Detail & Related papers (2020-01-27T22:38:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.