Exploring Autoencoder-based Error-bounded Compression for Scientific
Data
- URL: http://arxiv.org/abs/2105.11730v7
- Date: Sat, 21 Oct 2023 22:26:08 GMT
- Title: Exploring Autoencoder-based Error-bounded Compression for Scientific
Data
- Authors: Jinyang Liu, Sheng Di, Kai Zhao, Sian Jin, Dingwen Tao, Xin Liang,
Zizhong Chen, Franck Cappello
- Abstract summary: We develop an error-bounded autoencoder-based framework based on the SZ model.
We optimize the compression quality for the main stages of this AE-based error-bounded compression framework.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Error-bounded lossy compression is becoming an indispensable technique for
the success of today's scientific projects with vast volumes of data produced
during simulations or instrument data acquisitions. Not only can it
significantly reduce data size, but it can also control the compression errors
based on user-specified error bounds. Autoencoder (AE) models have been widely
used in image compression, but few AE-based compression approaches support
error-bounding features, which are essential for scientific applications.
To address this issue, we explore using convolutional autoencoders to improve
error-bounded lossy compression for scientific data, with the following three
key contributions. (1) We provide an in-depth investigation of the
characteristics of various autoencoder models and develop an error-bounded
autoencoder-based framework based on the SZ model. (2) We optimize the
compression quality for the main stages in our designed AE-based error-bounded
compression framework, fine-tuning the block sizes and latent sizes and also
optimizing the compression efficiency of latent vectors. (3) We evaluate our
proposed solution using five real-world scientific datasets and compare it
with six other related works. Experiments show that our solution exhibits a
very competitive compression quality among all the compressors in our tests. In
absolute terms, it can obtain much better compression quality (a 100%-800%
improvement in compression ratio at the same data distortion) than SZ2.1 and
ZFP in cases with a high compression ratio.
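
The error-bound guarantee rests on an SZ-style correction stage: the autoencoder's reconstruction acts as a prediction, and the residual is quantized so that no reconstructed value deviates from the original by more than the bound. A minimal sketch of that idea, assuming the autoencoder is exposed as a black-box predictor (`ae_predict` and the function names are illustrative, not the paper's actual API):

```python
import numpy as np

def encode_block(block, ae_predict, eb):
    # The AE reconstruction acts as the prediction for this block.
    pred = ae_predict(block)
    # Linear-scaling quantization of the residual: rounding to the
    # nearest multiple of 2*eb keeps every pointwise error <= eb.
    codes = np.round((block - pred) / (2.0 * eb)).astype(np.int64)
    return codes

def decode_block(codes, pred, eb):
    # Reconstruction is guaranteed to satisfy |decoded - original| <= eb.
    return pred + codes * (2.0 * eb)
```

Because most quantization codes cluster near zero, they entropy-code well, which is where the framework's tuning of block sizes, latent sizes, and latent-vector coding pays off.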
Related papers
- Sparse $L^1$-Autoencoders for Scientific Data Compression
We introduce effective data compression methods by developing autoencoders using high-dimensional latent spaces that are $L^1$-regularized.
We show how these information-rich latent spaces can be used to mitigate blurring and other artifacts to obtain highly effective data compression methods for scientific data.
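As a hedged sketch of the mechanism, not the authors' architecture: an over-complete latent space whose activations carry an $L^1$ penalty, so most latent coefficients are driven to zero and the code stays compressible despite its high dimension (all names and sizes below are illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseL1AE(nn.Module):
    # Illustrative shapes only; the latent is deliberately wider than
    # the input so that sparsity, not dimension, does the compressing.
    def __init__(self, n_in=1024, n_latent=4096):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(n_in, n_latent), nn.ReLU())
        self.dec = nn.Linear(n_latent, n_in)

    def forward(self, x):
        z = self.enc(x)
        return self.dec(z), z

def loss_fn(x, x_hat, z, lam=1e-3):
    # Reconstruction error plus the L1 sparsity penalty on the latent.
    return F.mse_loss(x_hat, x) + lam * z.abs().mean()
```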
arXiv Detail & Related papers (2024-05-23T07:48:00Z) - Compression of Structured Data with Autoencoders: Provable Benefit of
Nonlinearities and Depth [83.15263499262824]
We prove that gradient descent converges to a solution that completely disregards the sparse structure of the input.
We show how to improve upon Gaussian performance for the compression of sparse data by adding a denoising function to a shallow architecture.
We validate our findings on image datasets, such as CIFAR-10 and MNIST.
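One plausible reading of the "denoising function" fix, sketched here as soft-thresholding (the classical proximal operator of the L1 norm; the paper's exact form may differ):

```python
import numpy as np

def soft_threshold(x, tau):
    # Zeroes out small entries, recovering the sparse structure that a
    # purely linear, gradient-descent-trained autoencoder ignores.
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def shallow_decode(z, W, tau=0.1):
    # A shallow linear decoder followed by the denoising nonlinearity.
    return soft_threshold(W @ z, tau)
```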
arXiv Detail & Related papers (2024-02-07T16:32:29Z) - Activations and Gradients Compression for Model-Parallel Training [85.99744701008802]
We study how simultaneous compression of activations and gradients in a model-parallel distributed training setup affects convergence.
We find that gradients require milder compression rates than activations.
Experiments also show that models trained with TopK perform well only when compression is also applied during inference.
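A minimal TopK sketch (the helper names and fixed ratio are illustrative): only the k largest-magnitude entries survive, which is why, per the finding above, the same sparsification must also be applied at inference time:

```python
import torch

def topk_compress(t, ratio=0.01):
    # Keep only the k largest-magnitude entries of the tensor.
    k = max(1, int(t.numel() * ratio))
    flat = t.flatten()
    _, idx = torch.topk(flat.abs(), k)
    return idx, flat[idx], t.shape

def topk_decompress(idx, vals, shape):
    # Scatter the surviving entries back into a zero tensor.
    out = torch.zeros(shape, dtype=vals.dtype)
    out.view(-1)[idx] = vals
    return out
```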
arXiv Detail & Related papers (2024-01-15T15:54:54Z) - SRN-SZ: Deep Leaning-Based Scientific Error-bounded Lossy Compression
with Super-resolution Neural Networks [13.706955134941385]
We propose SRN-SZ, a deep learning-based scientific error-bounded lossy compressor.
SRN-SZ applies HAT, a state-of-the-art super-resolution network, for its compression.
In experiments, SRN-SZ achieves up to 75% compression ratio improvements under the same error bound.
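The pipeline can be pictured as follows, with `sr_upsample` standing in for the HAT network; the downsampling and quantization details are assumptions for illustration, not SRN-SZ's actual implementation:

```python
import numpy as np

def srn_sz_encode(data, sr_upsample, eb, factor=2):
    # Keep a coarse copy of the 2D field...
    coarse = data[::factor, ::factor]
    # ...predict the full-resolution grid with the SR network
    # (assumed to return an array with the same shape as `data`)...
    pred = sr_upsample(coarse)
    # ...and quantize the residual so the error bound still holds.
    codes = np.round((data - pred) / (2.0 * eb)).astype(np.int32)
    return coarse, codes
```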
arXiv Detail & Related papers (2023-09-07T22:15:32Z) - Hierarchical Autoencoder-based Lossy Compression for Large-scale High-resolution Scientific Data [12.831138965071945]
This work presents a neural network that not only significantly compresses large-scale scientific data but also maintains high reconstruction quality.
The proposed model is tested with scientific benchmark data available publicly and applied to a large-scale high-resolution climate modeling data set.
Our model achieves a compression ratio of 140 on several benchmark data sets without compromising the reconstruction quality.
arXiv Detail & Related papers (2023-07-09T16:11:02Z) - Scalable Hybrid Learning Techniques for Scientific Data Compression [6.803722400888276]
Scientists require compression techniques that accurately preserve derived quantities of interest (QoIs).
This paper presents a physics-informed compression technique implemented as an end-to-end, scalable, GPU-based pipeline for data compression.
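One way such a physics-informed objective can look, assuming a differentiable QoI operator (`qoi_fn` and `lam` are illustrative placeholders, not the paper's interface):

```python
import torch
import torch.nn.functional as F

def qoi_aware_loss(x, x_hat, qoi_fn, lam=1.0):
    # Pointwise reconstruction error...
    rec = F.mse_loss(x_hat, x)
    # ...plus a penalty on errors in derived quantities of interest,
    # so the compressor preserves what scientists actually analyze.
    qoi = F.mse_loss(qoi_fn(x_hat), qoi_fn(x))
    return rec + lam * qoi
```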
arXiv Detail & Related papers (2022-12-21T03:00:18Z) - Deep Lossy Plus Residual Coding for Lossless and Near-lossless Image
Compression [85.93207826513192]
We propose a unified and powerful deep lossy plus residual (DLPR) coding framework for both lossless and near-lossless image compression.
We solve the joint lossy and residual compression problem using a VAE-based approach.
In the near-lossless mode, we quantize the original residuals to satisfy a given $\ell_\infty$ error bound.
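For integer residuals, a uniform quantizer with step 2*eb + 1 achieves exactly this guarantee; a minimal sketch of that standard near-lossless trick (assumed for illustration, not copied from DLPR):

```python
import numpy as np

def quantize_linf(residual, eb):
    # For integer residuals, step 2*eb + 1 guarantees
    # |residual - r_hat| <= eb at every point.
    step = 2 * eb + 1
    r_hat = np.sign(residual) * ((np.abs(residual) + eb) // step) * step
    return r_hat.astype(np.int64)
```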
arXiv Detail & Related papers (2022-09-11T12:11:56Z) - Leveraging Bitstream Metadata for Fast, Accurate, Generalized Compressed
Video Quality Enhancement [74.1052624663082]
We develop a deep learning architecture capable of restoring detail to compressed videos.
We condition our model on quantization data that is readily available in the bitstream.
We show that this improves restoration accuracy compared to prior compression-correction methods.
arXiv Detail & Related papers (2022-01-31T18:56:04Z) - Towards Compact CNNs via Collaborative Compression [166.86915086497433]
We propose a Collaborative Compression scheme, which jointly applies channel pruning and tensor decomposition to compress CNN models.
We achieve 52.9% FLOPs reduction by removing 48.4% parameters on ResNet-50 with only a Top-1 accuracy drop of 0.56% on ImageNet 2012.
arXiv Detail & Related papers (2021-05-24T12:07:38Z) - Analyzing and Mitigating JPEG Compression Defects in Deep Learning [69.04777875711646]
We present a unified study of the effects of JPEG compression on a range of common tasks and datasets.
We show that there is a significant penalty on common performance metrics for high compression.
arXiv Detail & Related papers (2020-11-17T20:32:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.