Learning sparse auto-encoders for green AI image coding
- URL: http://arxiv.org/abs/2209.04448v1
- Date: Fri, 9 Sep 2022 06:31:46 GMT
- Title: Learning sparse auto-encoders for green AI image coding
- Authors: Cyprien Gille, Frédéric Guyard, Marc Antonini, and Michel Barlaud
- Abstract summary: In this paper, we address the problem of lossy image compression using a CAE with a small memory footprint and low computational power usage.
We propose a constrained approach and a new structured sparse learning method.
Experimental results show that the $\ell_{1,1}$ constraint provides the best structured sparsity, resulting in a high reduction of memory and computational cost.
- Score: 5.967279020820772
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, convolutional auto-encoders (CAE) were introduced for image coding.
They achieved performance improvements over the state-of-the-art JPEG2000 method. However,
this performance was obtained using massive CAEs featuring a large number of parameters and
whose training required heavy computational power. In this paper, we address the problem of
lossy image compression using a CAE with a small memory footprint and low computational power
usage. To overcome the computational cost issue, most of the literature uses Lagrangian
proximal regularization methods, which are themselves time consuming. In this work, we
propose a constrained approach and a new structured sparse learning method. We design an
algorithm and test it on three constraints: the classical $\ell_1$ constraint, the
$\ell_{1,\infty}$ constraint, and the new $\ell_{1,1}$ constraint. Experimental results show
that the $\ell_{1,1}$ constraint provides the best structured sparsity, resulting in a high
reduction of memory and computational cost, with rate-distortion performance similar to that
of dense networks.
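To make the constrained approach concrete, below is a small NumPy sketch of a row-structured sparse projection in the spirit of the paper's $\ell_{1,1}$ constraint: after each gradient step, the vector of row-wise $\ell_1$ norms is projected onto an $\ell_1$ ball, and each row is then projected onto its assigned budget, so rows with a zero budget are pruned entirely. The bi-level scheme, the radius `eta`, and the toy training fragment are illustrative assumptions, not the authors' exact algorithm.

```python
import numpy as np

def project_l1_ball(v, radius):
    """Euclidean projection onto the l1 ball (sort-based algorithm of Duchi et al., 2008)."""
    if radius <= 0:
        return np.zeros_like(v)
    if np.abs(v).sum() <= radius:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, u.size + 1) > (css - radius))[0][-1]
    theta = (css[rho] - radius) / (rho + 1.0)
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def bilevel_row_projection(W, eta):
    """Row-structured sparse projection: project the vector of row-wise l1 norms
    onto an l1 ball of radius eta, then project each row onto its assigned budget.
    Rows whose budget is zero are pruned entirely, which is where the memory and
    FLOP savings of structured sparsity come from."""
    budgets = project_l1_ball(np.abs(W).sum(axis=1), eta)
    return np.vstack([project_l1_ball(row, b) for row, b in zip(W, budgets)])

# Hypothetical training fragment: gradient step on the rate-distortion loss,
# followed by projection back onto the constraint set.
rng = np.random.default_rng(0)
W = rng.normal(size=(64, 128))          # one encoder layer's weights
grad = rng.normal(size=W.shape)         # stand-in gradient
W = bilevel_row_projection(W - 0.01 * grad, eta=100.0)
print("pruned rows:", int((np.abs(W).sum(axis=1) == 0).sum()))
```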
Related papers
- 70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float [71.43026659686679]
Large Language Models (LLMs) have grown rapidly in size, creating challenges for efficient deployment on resource-constrained hardware.
We introduce Dynamic-Length Float (DFloat11), a compression framework that reduces LLM size by 30% while preserving outputs that are bit-for-bit identical to the original model.
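The abstract does not describe the mechanism; one common route to lossless float compression is entropy-coding the exponent field of the weights, which carries far fewer than its nominal bits of information. The sketch below is an assumption-based illustration of that idea (not DFloat11 itself): it measures the empirical entropy of the 8-bit exponent on a toy weight tensor.

```python
import numpy as np

def exponent_entropy_bits(weights):
    """Empirical entropy (in bits) of the 8-bit exponent field of float32 weights.
    A low entropy means the exponents can be entropy-coded well below 8 bits."""
    bits = np.asarray(weights, dtype=np.float32).view(np.uint32)
    exponents = ((bits >> 23) & 0xFF).astype(np.int64)
    counts = np.bincount(exponents, minlength=256).astype(np.float64)
    probs = counts[counts > 0] / counts.sum()
    return float(-(probs * np.log2(probs)).sum())

# Toy Gaussian weights standing in for a trained layer (illustrative only).
w = np.random.default_rng(0).normal(scale=0.02, size=1_000_000)
h = exponent_entropy_bits(w)
print(f"exponent entropy: {h:.2f} bits (stored as 8 bits)")
# bfloat16 = 1 sign + 8 exponent + 7 mantissa bits; re-coding only the exponent
# would shrink such a tensor to roughly this fraction of its original size:
print(f"estimated size after exponent coding: {(1 + 7 + h) / 16:.0%}")
```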
arXiv Detail & Related papers (2025-04-15T22:38:38Z) - CODA: Repurposing Continuous VAEs for Discrete Tokenization [52.58960429582813]
CODA (COntinuous-to-Discrete Adaptation) is a framework that decouples compression and discretization.
Our approach achieves a remarkable codebook utilization of 100% and a notable reconstruction FID (rFID) of 0.43 and 1.34 for $8\times$ and $16\times$ compression on the ImageNet 256$\times$256 benchmark.
arXiv Detail & Related papers (2025-03-22T12:59:00Z) - A Fast Quantum Image Compression Algorithm based on Taylor Expansion [0.0]
In this study, we upgrade a quantum image compression algorithm within parameterized quantum circuits.
Our approach encodes image data as unitary operator parameters and applies the quantum compilation algorithm to emulate the encryption process.
Experimental results on benchmark images, including Lenna and Cameraman, show that our method achieves up to 86% reduction in the number of iterations.
arXiv Detail & Related papers (2025-02-15T06:03:49Z) - Hamming Attention Distillation: Binarizing Keys and Queries for Efficient Long-Context Transformers [18.469378618426294]
We introduce Hamming Attention Distillation (HAD), a framework that binarizes keys and queries in the attention mechanism to achieve significant efficiency gains.
We implement HAD in custom hardware simulations, demonstrating superior performance characteristics compared to a custom hardware implementation of standard attention.
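As a rough, assumption-laden illustration of binarized attention (not HAD itself, which distills a trained model into this form and adds further hardware-oriented optimizations), the sketch below replaces dot-product scores with a similarity derived from the Hamming distance between sign-binarized queries and keys.

```python
import numpy as np

def hamming_attention(Q, K, V):
    """Attention where query/key similarity comes from the Hamming distance
    between sign-binarized vectors instead of a floating-point dot product."""
    d = Q.shape[-1]
    qb = (Q > 0)                                        # (n, d) binary codes
    kb = (K > 0)                                        # (m, d)
    ham = (qb[:, None, :] != kb[None, :, :]).sum(-1)    # (n, m) Hamming distances
    scores = (d - 2 * ham) / np.sqrt(d)                 # agreements minus disagreements
    w = np.exp(scores - scores.max(-1, keepdims=True))  # row-wise softmax
    w /= w.sum(-1, keepdims=True)
    return w @ V

rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(16, 64)), rng.normal(size=(32, 64)), rng.normal(size=(32, 64))
out = hamming_attention(Q, K, V)    # (16, 64)
```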
arXiv Detail & Related papers (2025-02-03T19:24:01Z) - Masked Generative Nested Transformers with Decode Time Scaling [21.34984197218021]
In this work, we aim to address the bottleneck of inference computational efficiency in visual generation algorithms.
We design a decode time model scaling schedule to utilize compute effectively, and we can cache and reuse some of the computation.
Our experiments show that with almost $3\times$ less compute than the baseline, our model obtains competitive performance.
arXiv Detail & Related papers (2025-02-01T09:41:01Z) - Fast constrained sampling in pre-trained diffusion models [77.21486516041391]
We propose an algorithm that enables fast and high-quality generation under arbitrary constraints.
During inference, we can alternate between gradient updates computed on the noisy image and updates computed on the final, clean image.
Our approach produces results that rival or surpass the state-of-the-art training-free inference approaches.
arXiv Detail & Related papers (2024-10-24T14:52:38Z) - Accelerating Error Correction Code Transformers [56.75773430667148]
We introduce a novel acceleration method for transformer-based decoders.
We achieve a 90% compression ratio and reduce arithmetic operation energy consumption by a factor of at least 224 on modern hardware.
arXiv Detail & Related papers (2024-10-08T11:07:55Z) - SqueezeLLM: Dense-and-Sparse Quantization [80.32162537942138]
The main bottleneck for generative inference with LLMs is memory bandwidth, rather than compute, for single-batch inference.
We introduce SqueezeLLM, a post-training quantization framework that enables lossless compression to ultra-low precisions of up to 3-bit.
Our framework incorporates two novel ideas: (i) sensitivity-based non-uniform quantization, which searches for the optimal bit precision assignment based on second-order information; and (ii) the Dense-and-Sparse decomposition that stores outliers and sensitive weight values in an efficient sparse format.
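The Dense-and-Sparse decomposition can be sketched directly from the abstract: keep a small fraction of outlier weights in full precision as a sparse matrix and quantize the low-magnitude remainder to a few bits. The uniform quantizer below stands in for the paper's sensitivity-based non-uniform codebook, so the outlier percentage and bit width are illustrative.

```python
import numpy as np

def dense_and_sparse(W, outlier_pct=0.5, n_bits=3):
    """Decompose W into a sparse full-precision outlier matrix plus a dense
    remainder quantized to n_bits (uniform symmetric quantization here,
    in place of the paper's sensitivity-based non-uniform codebook)."""
    thresh = np.percentile(np.abs(W), 100 - outlier_pct)
    outliers = np.where(np.abs(W) > thresh, W, 0.0)   # ~outlier_pct% of weights, kept exact
    dense = W - outliers                              # low-magnitude remainder
    half = (2 ** n_bits - 1) // 2                     # e.g. 3-bit -> levels -3..3
    scale = np.abs(dense).max() / half + 1e-12
    dense_q = np.clip(np.round(dense / scale), -half, half) * scale
    return dense_q, outliers

rng = np.random.default_rng(0)
W = rng.normal(scale=0.02, size=(256, 256))
W.flat[rng.integers(0, W.size, 100)] = 0.5            # inject a few outliers
dense_q, outliers = dense_and_sparse(W)
print(f"mean reconstruction error: {np.abs(W - (dense_q + outliers)).mean():.2e}")
```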
arXiv Detail & Related papers (2023-06-13T08:57:54Z) - Rethinking the Paradigm of Content Constraints in Unpaired
Image-to-Image Translation [9.900050049833986]
We propose EnCo, a simple but efficient way to maintain the content by constraining the representational similarity in the latent space of patch-level features.
For the similarity function, we use a simple MSE loss instead of contrastive loss, which is currently widely used in I2I tasks.
In addition, we rethink the role played by discriminators in sampling patches and propose a discriminative attention-guided (DAG) patch sampling strategy to replace random sampling.
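A minimal sketch of the content constraint, using random patch sampling in place of the DAG strategy; the feature shapes and sampling size are assumptions.

```python
import numpy as np

def patch_content_loss(feat_src, feat_gen, n_patches=256, rng=None):
    """MSE between randomly sampled, spatially aligned patch-level features.
    feat_*: (C, H, W) encoder feature maps of the source and generated images."""
    if rng is None:
        rng = np.random.default_rng(0)
    C, H, W = feat_src.shape
    idx = rng.integers(0, H * W, size=n_patches)       # random spatial positions
    src = feat_src.reshape(C, -1)[:, idx]
    gen = feat_gen.reshape(C, -1)[:, idx]
    return float(np.mean((src - gen) ** 2))

rng = np.random.default_rng(1)
loss = patch_content_loss(rng.normal(size=(64, 32, 32)), rng.normal(size=(64, 32, 32)))
```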
arXiv Detail & Related papers (2022-11-20T04:39:57Z) - Communication-Efficient Adam-Type Algorithms for Distributed Data Mining [93.50424502011626]
We propose a class of novel distributed Adam-type algorithms (i.e., SketchedAMSGrad) utilizing sketching.
Our new algorithm achieves a fast convergence rate of $O\big(\frac{1}{\sqrt{nT}} + \frac{1}{(k/d)^2 T}\big)$ with the communication cost of $O(k \log(d))$ at each iteration.
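The abstract only states that gradients are sketched before communication; as an illustration of that general idea (not the paper's exact sketch or its Adam-type update), the snippet below compresses a gradient with a single-hash count sketch whose hash and signs come from a shared seed, so only $k$ numbers need to be transmitted per worker.

```python
import numpy as np

def count_sketch(g, k, seed=0):
    """Compress a length-d gradient into k buckets with a count sketch:
    each coordinate is hashed to a bucket and added with a random sign."""
    rng = np.random.default_rng(seed)
    buckets = rng.integers(0, k, size=g.size)
    signs = rng.choice([-1.0, 1.0], size=g.size)
    sketch = np.zeros(k)
    np.add.at(sketch, buckets, signs * g)    # workers send only these k numbers
    return sketch, buckets, signs

def count_sketch_decode(sketch, buckets, signs):
    """Unbiased estimate of the original gradient from its sketch."""
    return signs * sketch[buckets]

g = np.random.default_rng(2).normal(size=10_000)
sk, b, s = count_sketch(g, k=1_000)
g_hat = count_sketch_decode(sk, b, s)        # communication ~ k floats instead of d
```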
arXiv Detail & Related papers (2022-10-14T01:42:05Z) - Batch-efficient EigenDecomposition for Small and Medium Matrices [65.67315418971688]
EigenDecomposition (ED) is at the heart of many computer vision algorithms and applications.
We propose a QR-based ED method dedicated to the application scenarios of computer vision.
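For context, the core of any QR-based eigendecomposition is the classical QR iteration sketched below; the paper's contribution is a batch-efficient GPU formulation for many small and medium matrices, which this plain NumPy version does not attempt.

```python
import numpy as np

def qr_eig_sym(A, iters=200):
    """Unshifted QR iteration for a symmetric matrix: repeated similarity
    transforms A <- R @ Q drive A toward a diagonal of eigenvalues while
    the eigenvectors accumulate in V."""
    Ak = A.copy()
    V = np.eye(A.shape[0])
    for _ in range(iters):
        Q, R = np.linalg.qr(Ak)
        Ak = R @ Q
        V = V @ Q
    return np.diag(Ak), V

A = np.cov(np.random.default_rng(3).normal(size=(100, 8)), rowvar=False)  # 8x8 SPD
evals, evecs = qr_eig_sym(A)
print("max eigenvalue error:",
      np.abs(np.sort(evals) - np.sort(np.linalg.eigvalsh(A))).max())
```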
arXiv Detail & Related papers (2022-07-09T09:14:12Z) - Asymmetric Learned Image Compression with Multi-Scale Residual Block,
Importance Map, and Post-Quantization Filtering [15.056672221375104]
Deep learning-based image compression has achieved better rate-distortion (R-D) performance than the latest traditional method, H.266/VVC.
Many leading learned schemes cannot maintain a good trade-off between performance and complexity.
We propose an efficient and effective image coding framework, which achieves similar R-D performance with lower complexity than the state of the art.
arXiv Detail & Related papers (2022-06-21T09:34:29Z) - Lossless Compression with Probabilistic Circuits [42.377045986733776]
Probabilistic Circuits (PCs) are a class of neural networks involving $|p|$ computational units.
We derive efficient encoding and decoding schemes that both have time complexity $\mathcal{O}(\log(D) \cdot |p|)$, where a naive scheme would have linear costs in $D$ and $|p|$.
By scaling up the traditional PC structure learning pipeline, we achieved state-of-the-art results on image datasets such as MNIST.
arXiv Detail & Related papers (2021-11-23T03:30:22Z) - An Information Theory-inspired Strategy for Automatic Network Pruning [88.51235160841377]
Deep convolutional neural networks typically need to be compressed before deployment on devices with resource constraints.
Most existing network pruning methods require laborious human efforts and prohibitive computation resources.
We propose an information theory-inspired strategy for automatic model compression.
arXiv Detail & Related papers (2021-08-19T07:03:22Z) - CNNs for JPEGs: A Study in Computational Cost [49.97673761305336]
Convolutional neural networks (CNNs) have achieved astonishing advances over the past decade.
CNNs are capable of learning robust representations of the data directly from the RGB pixels.
Deep learning methods capable of learning directly from the compressed domain have been gaining attention in recent years.
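Learning "directly from the compressed domain" means consuming DCT coefficients rather than decoded RGB pixels. The sketch below extracts the 8x8 block DCT representation of a grayscale image (ignoring JPEG quantization and chroma subsampling) as the kind of input such a network would take; names and shapes are illustrative.

```python
import numpy as np
from scipy.fft import dctn

def blockwise_dct(gray, block=8):
    """8x8 block DCT of a grayscale image, i.e. the JPEG-domain representation
    a compressed-domain CNN would consume (quantization omitted)."""
    H, W = gray.shape
    H, W = H - H % block, W - W % block
    x = gray[:H, :W].astype(np.float64) - 128.0          # JPEG-style level shift
    blocks = x.reshape(H // block, block, W // block, block).swapaxes(1, 2)
    return dctn(blocks, axes=(-2, -1), norm="ortho")      # (H/8, W/8, 8, 8)

img = np.random.default_rng(4).integers(0, 256, size=(64, 64))
coeffs = blockwise_dct(img)   # feed these coefficients to the network instead of pixels
```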
arXiv Detail & Related papers (2020-12-26T15:00:10Z) - How to Exploit the Transferability of Learned Image Compression to
Conventional Codecs [25.622863999901874]
We show how learned image coding can be used as a surrogate to optimize an image for encoding.
Our approach can remodel a conventional image to adjust for the MS-SSIM distortion with over 20% rate improvement without any decoding overhead.
arXiv Detail & Related papers (2020-12-03T12:34:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.