Related papers: Understanding The Effectiveness of Lossy Compression in Machine Learning Training Sets

Understanding The Effectiveness of Lossy Compression in Machine Learning Training Sets

URL: http://arxiv.org/abs/2403.15953v1
Date: Sat, 23 Mar 2024 23:14:37 GMT
Title: Understanding The Effectiveness of Lossy Compression in Machine Learning Training Sets
Authors: Robert Underwood, Jon C. Calhoun, Sheng Di, Franck Cappello,
Abstract summary: Learning and Artificial Intelligence (ML/AI) techniques have become increasingly prevalent in high performance computing. Data compression can be a solution to these problems, but an in-depth understanding of how lossy compression affects model quality is needed. We show modern lossy compression methods can achieve a 50-100x compression ratio improvement for a 1% or less loss in quality.
Score: 7.261516807130813
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Learning and Artificial Intelligence (ML/AI) techniques have become increasingly prevalent in high performance computing (HPC). However, these methods depend on vast volumes of floating point data for training and validation which need methods to share the data on a wide area network (WAN) or to transfer it from edge devices to data centers. Data compression can be a solution to these problems, but an in-depth understanding of how lossy compression affects model quality is needed. Prior work largely considers a single application or compression method. We designed a systematic methodology for evaluating data reduction techniques for ML/AI, and we use it to perform a very comprehensive evaluation with 17 data reduction methods on 7 ML/AI applications to show modern lossy compression methods can achieve a 50-100x compression ratio improvement for a 1% or less loss in quality. We identify critical insights that guide the future use and design of lossy compressors for ML/AI.

Related papers

A Brief Review for Compression and Transfer Learning Techniques in DeepFake Detection [13.783950035836593]
Training and deploying deepfake detection models on edge devices offers the advantage of maintaining data privacy and confidentiality by processing it close to its source. We explore compression techniques to reduce computational demands and inference time, alongside transfer learning methods to minimize training overhead.
arXiv Detail & Related papers (2025-04-29T13:37:21Z)
Efficient Token Compression for Vision Transformer with Spatial Information Preserved [59.79302182800274]
Token compression is essential for reducing the computational and memory requirements of transformer models. We propose an efficient and hardware-compatible token compression method called Prune and Merge.
arXiv Detail & Related papers (2025-03-30T14:23:18Z)
CALLIC: Content Adaptive Learning for Lossless Image Compression [64.47244912937204]
CALLIC sets a new state-of-the-art (SOTA) for learned lossless image compression. We propose a content-aware autoregressive self-attention mechanism by leveraging convolutional gating operations. During encoding, we decompose pre-trained layers, including depth-wise convolutions, using low-rank matrices and then adapt the incremental weights on testing image by Rate-guided Progressive Fine-Tuning (RPFT) RPFT fine-tunes with gradually increasing patches that are sorted in descending order by estimated entropy, optimizing learning process and reducing adaptation time.
arXiv Detail & Related papers (2024-12-23T10:41:18Z)
Accelerated Methods with Compressed Communications for Distributed Optimization Problems under Data Similarity [55.03958223190181]
We propose the first theoretically grounded accelerated algorithms utilizing unbiased and biased compression under data similarity. Our results are of record and confirmed by experiments on different average losses and datasets.
arXiv Detail & Related papers (2024-12-21T00:40:58Z)
Accelerating Communication in Deep Learning Recommendation Model Training with Dual-Level Adaptive Lossy Compression [10.233937665979694]
DLRM is a state-of-the-art recommendation system model that has gained widespread adoption across various industry applications. A significant bottleneck in this process is the time-consuming all-to-all communication required to collect embedding data from all devices. We introduce a method that employs error-bounded lossy compression to reduce the communication data size and accelerate DLRM training.
arXiv Detail & Related papers (2024-07-05T05:55:18Z)
Sparse $L^1$-Autoencoders for Scientific Data Compression [0.0]
We introduce effective data compression methods by developing autoencoders using high dimensional latent spaces that are $L1$-regularized. We show how these information-rich latent spaces can be used to mitigate blurring and other artifacts to obtain highly effective data compression methods for scientific data.
arXiv Detail & Related papers (2024-05-23T07:48:00Z)
Fundamental Limits of Two-layer Autoencoders, and Achieving Them with Gradient Methods [91.54785981649228]
This paper focuses on non-linear two-layer autoencoders trained in the challenging proportional regime. Our results characterize the minimizers of the population risk, and show that such minimizers are achieved by gradient methods. For the special case of a sign activation function, our analysis establishes the fundamental limits for the lossy compression of Gaussian sources via (shallow) autoencoders.
arXiv Detail & Related papers (2022-12-27T12:37:34Z)
Scalable Hybrid Learning Techniques for Scientific Data Compression [6.803722400888276]
Scientists require compression techniques that accurately preserve derived quantities of interest (QoIs) This paper presents a physics-informed compression technique implemented as an end-to-end, scalable, GPU-based pipeline for data compression.
arXiv Detail & Related papers (2022-12-21T03:00:18Z)
Unrolled Compressed Blind-Deconvolution [77.88847247301682]
sparse multichannel blind deconvolution (S-MBD) arises frequently in many engineering applications such as radar/sonar/ultrasound imaging. We propose a compression method that enables blind recovery from much fewer measurements with respect to the full received signal in time.
arXiv Detail & Related papers (2022-09-28T15:16:58Z)
DeepSketch: A New Machine Learning-Based Reference Search Technique for Post-Deduplication Delta Compression [20.311114684028375]
We propose DeepSketch, a new reference search technique for post-deduplication delta compression. DeepSketch uses a deep neural network to extract a data block's sketch, i.e., to create an approximate data signature of the block. Our evaluation shows that DeepSketch improves the data-reduction ratio by up to 33% (21% on average) over a state-of-the-art post-deduplication delta-compression technique.
arXiv Detail & Related papers (2022-02-17T16:00:22Z)
An Information Theory-inspired Strategy for Automatic Network Pruning [88.51235160841377]
Deep convolution neural networks are well known to be compressed on devices with resource constraints. Most existing network pruning methods require laborious human efforts and prohibitive computation resources. We propose an information theory-inspired strategy for automatic model compression.
arXiv Detail & Related papers (2021-08-19T07:03:22Z)
Analyzing and Mitigating JPEG Compression Defects in Deep Learning [69.04777875711646]
We present a unified study of the effects of JPEG compression on a range of common tasks and datasets. We show that there is a significant penalty on common performance metrics for high compression.
arXiv Detail & Related papers (2020-11-17T20:32:57Z)
PowerGossip: Practical Low-Rank Communication Compression in Decentralized Deep Learning [62.440827696638664]
We introduce a simple algorithm that directly compresses the model differences between neighboring workers. Inspired by the PowerSGD for centralized deep learning, this algorithm uses power steps to maximize the information transferred per bit.
arXiv Detail & Related papers (2020-08-04T09:14:52Z)
ALF: Autoencoder-based Low-rank Filter-sharing for Efficient Convolutional Neural Networks [63.91384986073851]
We propose the autoencoder-based low-rank filter-sharing technique technique (ALF) ALF shows a reduction of 70% in network parameters, 61% in operations and 41% in execution time, with minimal loss in accuracy.
arXiv Detail & Related papers (2020-07-27T09:01:22Z)

This list is automatically generated from the titles and abstracts of the papers in this site.