Spatiotemporally adaptive compression for scientific dataset with
feature preservation -- a case study on simulation data with extreme climate
events analysis
- URL: http://arxiv.org/abs/2401.03317v1
- Date: Sat, 6 Jan 2024 22:32:34 GMT
- Title: Spatiotemporally adaptive compression for scientific dataset with
feature preservation -- a case study on simulation data with extreme climate
events analysis
- Authors: Qian Gong, Chengzhu Zhang, Xin Liang, Viktor Reshniak, Jieyang Chen,
Anand Rangarajan, Sanjay Ranka, Nicolas Vidal, Lipeng Wan, Paul Ullrich,
Norbert Podhorszki, Robert Jacob, Scott Klasky
- Abstract summary: We propose a technique that addresses storage costs while improving post-analysis accuracy through adaptive, error-controlled lossy compression.
We integrate cyclone feature detection with data compression and demonstrate that performing adaptive error-bounded compression in higher dimensional space enables greater compression ratios.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Scientific discoveries are increasingly constrained by limited storage space
and I/O capacities. Data from time-series simulations and experiments
often need to be decimated over timesteps to accommodate storage and I/O
limitations. In this paper, we propose a technique that addresses storage costs
while improving post-analysis accuracy through spatiotemporally adaptive,
error-controlled lossy compression. We investigate the trade-off between data
precision and temporal output rates, revealing that reducing data precision and
increasing timestep frequency lead to more accurate analysis outcomes.
Additionally, we integrate spatiotemporal feature detection with data
compression and demonstrate that performing adaptive error-bounded compression
in higher dimensional space enables greater compression ratios, leveraging the
error propagation theory of a transformation-based compressor.
To evaluate our approach, we conduct experiments using the well-known E3SM
climate simulation code and apply our method to compress variables used for
cyclone tracking. Our results show a significant reduction in storage size
while enhancing the quality of cyclone tracking analysis, both quantitatively
and qualitatively, in comparison to the prevalent timestep decimation approach.
Compared to three state-of-the-art lossy compressors lacking feature
preservation capabilities, our adaptive compression framework improves
perfectly matched cases in TC tracking by 26.4-51.3% at medium compression
ratios and by 77.3-571.1% at large compression ratios, with only 5-11%
computational overhead.
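As a rough illustration of region-adaptive, error-bounded compression, the Python sketch below quantizes a synthetic pressure field with a tight absolute error bound inside a detected region of interest and a looser bound elsewhere. It is not the framework described in the abstract: uniform scalar quantization stands in for the transformation-based compressor, a simple pressure threshold stands in for cyclone detection, and the function names, error bounds, and synthetic data are all assumptions made for this example.

```python
import numpy as np


def adaptive_quantize(field, roi_mask, eb_roi, eb_background):
    """Quantize `field` with a per-point absolute error bound.

    Points where `roi_mask` is True use the tight bound `eb_roi`;
    all other points use the looser bound `eb_background`.
    Returns the integer codes and the per-point bound map.
    """
    eb = np.where(roi_mask, eb_roi, eb_background)
    # A bin width of 2*eb keeps |x - reconstruct(x)| <= eb (up to rounding).
    codes = np.round(field / (2.0 * eb)).astype(np.int64)
    return codes, eb


def reconstruct(codes, eb):
    """Map integer codes back to values using the same bound map."""
    return codes * (2.0 * eb)


# Synthetic pressure field (hPa): flat background plus one low-pressure vortex.
rng = np.random.default_rng(0)
y, x = np.mgrid[0:64, 0:64]
psl = 1010.0 - 40.0 * np.exp(-((x - 32) ** 2 + (y - 32) ** 2) / 50.0)
psl += rng.normal(0.0, 0.5, psl.shape)

# Hypothetical feature detector: flag points below a pressure threshold.
roi = psl < 1000.0

codes, eb = adaptive_quantize(psl, roi, eb_roi=0.01, eb_background=1.0)
recon = reconstruct(codes, eb)
err = np.abs(recon - psl)

print("max error inside ROI: ", err[roi].max())
print("max error outside ROI:", err[~roi].max())
```

A decoder would also need the bound map (or the mask that produced it), so a practical scheme has to store that side information alongside the codes.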
Related papers
- EoRA: Training-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation [79.56709262189953]
EoRA consistently outperforms previous methods in compensating errors for compressed LLaMA2/3 models on various tasks.
EoRA offers a scalable, training-free solution to compensate for compression errors.
arXiv Detail & Related papers (2024-10-28T17:59:03Z)
- Fast Feedforward 3D Gaussian Splatting Compression [55.149325473447384]
FCGS is an optimization-free model that can compress 3D Gaussian Splatting (3DGS) representations rapidly in a single feed-forward pass.
FCGS achieves a compression ratio of over 20X while maintaining fidelity, surpassing most per-scene SOTA optimization-based methods.
arXiv Detail & Related papers (2024-10-10T15:13:08Z)
- Enhancing Lossy Compression Through Cross-Field Information for Scientific Applications [11.025583805165455]
Lossy compression is one of the most effective methods for reducing the size of scientific data containing multiple data fields.
Previous approaches use local information from a single target field when predicting target data points, limiting their potential to achieve higher compression ratios.
We propose a novel hybrid prediction model that uses a CNN to extract cross-field information and combines it with existing local field information.
arXiv Detail & Related papers (2024-09-26T21:06:53Z)
- Sparse $L^1$-Autoencoders for Scientific Data Compression [0.0]
We introduce effective data compression methods by developing autoencoders using high dimensional latent spaces that are $L^1$-regularized.
We show how these information-rich latent spaces can be used to mitigate blurring and other artifacts to obtain highly effective data compression methods for scientific data.
arXiv Detail & Related papers (2024-05-23T07:48:00Z)
- Convolutional variational autoencoders for secure lossy image compression in remote sensing [47.75904906342974]
This study investigates image compression based on convolutional variational autoencoders (CVAEs).
CVAEs have been demonstrated to outperform conventional compression methods such as JPEG2000 by a substantial margin on compression benchmark datasets.
arXiv Detail & Related papers (2024-04-03T15:17:29Z)
- Learning Accurate Performance Predictors for Ultrafast Automated Model Compression [86.22294249097203]
We propose an ultrafast automated model compression framework called SeerNet for flexible network deployment.
Our method achieves competitive accuracy-complexity trade-offs with significant reduction of the search cost.
arXiv Detail & Related papers (2023-04-13T10:52:49Z)
- On the Interaction Between Differential Privacy and Gradient Compression in Deep Learning [55.22219308265945]
We study how the Gaussian mechanism for differential privacy and gradient compression jointly impact test accuracy in deep learning.
We observe that while gradient compression generally has a negative impact on test accuracy in non-private training, it can sometimes improve test accuracy in differentially private training.
arXiv Detail & Related papers (2022-11-01T20:28:45Z)
- Efficient Data Compression for 3D Sparse TPC via Bicephalous Convolutional Autoencoder [8.759778406741276]
This work introduces a dual-head autoencoder, called Bicephalous Convolutional AutoEncoder (BCAE), to resolve sparsity and regression simultaneously.
It shows advantages both in compression fidelity and ratio compared to traditional data compression methods, such as MGARD, SZ, and ZFP.
arXiv Detail & Related papers (2021-11-09T21:26:37Z)
- Exploring Autoencoder-based Error-bounded Compression for Scientific Data [14.724393511470225]
We develop an error-bounded autoencoder-based framework in terms of the SZ model.
We optimize the compression quality for the main stages in our designed AE-based error-bounded compression framework.
arXiv Detail & Related papers (2021-05-25T07:53:32Z)
- Towards Compact CNNs via Collaborative Compression [166.86915086497433]
We propose a Collaborative Compression scheme, which combines channel pruning and tensor decomposition to compress CNN models.
We achieve 52.9% FLOPs reduction by removing 48.4% parameters on ResNet-50 with only a Top-1 accuracy drop of 0.56% on ImageNet 2012.
arXiv Detail & Related papers (2021-05-24T12:07:38Z)