Efficient Compression of Sparse Accelerator Data Using Implicit Neural Representations and Importance Sampling
 - URL: http://arxiv.org/abs/2412.01754v1
 - Date: Mon, 02 Dec 2024 17:50:49 GMT
 - Title: Efficient Compression of Sparse Accelerator Data Using Implicit Neural Representations and Importance Sampling
 - Authors: Xihaier Luo, Samuel Lurvey, Yi Huang, Yihui Ren, Jin Huang, Byung-Jun Yoon, 
 - Abstract summary: Large-scale particle colliders in nuclear and high-energy physics generate data at extraordinary rates. We propose a novel approach using implicit neural representations for data learning and compression. We also introduce an importance sampling technique to accelerate the network training process.
 - Score: 7.838980097597047
 - License: http://creativecommons.org/licenses/by-nc-nd/4.0/
 - Abstract:   High-energy, large-scale particle colliders in nuclear and high-energy physics generate data at extraordinary rates, reaching up to $1$ terabyte and several petabytes per second, respectively. The development of real-time, high-throughput data compression algorithms capable of reducing this data to manageable sizes for permanent storage is of paramount importance. A unique characteristic of the tracking detector data is the extreme sparsity of particle trajectories in space, with an occupancy rate ranging from approximately $10^{-6}$ to $10\%$. Furthermore, for downstream tasks, a continuous representation of this data is often more useful than a voxel-based, discrete representation due to the inherently continuous nature of the signals involved. To address these challenges, we propose a novel approach using implicit neural representations for data learning and compression. We also introduce an importance sampling technique to accelerate the network training process. Our method is competitive with traditional compression algorithms, such as MGARD, SZ, and ZFP, while offering significant speed-ups and maintaining negligible accuracy loss through our importance sampling strategy. 
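The abstract's two core ideas can be illustrated with a minimal NumPy sketch: an implicit neural representation treats each voxel as a (coordinate, value) training pair, and importance sampling biases training batches toward the rare occupied voxels rather than the overwhelmingly empty background. All names, grid sizes, and the weighting scheme below are illustrative assumptions, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sparse detector frame: a 64x64 grid with ~1% occupancy,
# mimicking the extreme sparsity described in the abstract.
grid = np.zeros((64, 64))
hits = rng.choice(64 * 64, size=40, replace=False)
grid.flat[hits] = rng.uniform(0.5, 1.0, size=40)

# An INR would learn the mapping coordinate -> value; here we only
# build the training pairs to show the sampling step.
coords = np.stack(
    np.meshgrid(np.arange(64), np.arange(64), indexing="ij"), axis=-1
).reshape(-1, 2)
values = grid.reshape(-1)

# Importance sampling (illustrative scheme): draw batch indices with
# probability proportional to signal magnitude plus a small floor,
# so occupied voxels dominate each batch instead of the ~99% empty
# background.
eps = 1e-3
weights = np.abs(values) + eps
probs = weights / weights.sum()
batch_idx = rng.choice(len(values), size=256, p=probs)

occupied_fraction = (values[batch_idx] > 0).mean()
print(f"occupied voxels in batch: {occupied_fraction:.0%}")
```

Under this toy weighting, most of each batch lands on occupied voxels even though they make up only about 1% of the grid, which is the intuition behind the speed-up: gradient steps are spent on informative signal regions rather than on empty space.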
 
       
      
        Related papers
        - Efficient Token Compression for Vision Transformer with Spatial Information Preserved [59.79302182800274]
Token compression is essential for reducing the computational and memory requirements of transformer models.
We propose an efficient and hardware-compatible token compression method called Prune and Merge.
arXiv  Detail & Related papers  (2025-03-30T14:23:18Z)
        - Highly Efficient Direct Analytics on Semantic-aware Time Series Data Compression [15.122371541057339]
We propose a novel method for direct analytics on time series data compressed by the SHRINK compression algorithm.
Our approach offers reliable, high-speed outlier detection analytics for diverse IoT applications.
arXiv  Detail & Related papers  (2025-03-17T14:58:22Z)
        - Prior-Fitted Networks Scale to Larger Datasets When Treated as Weak Learners [82.72552644267724]
BoostPFN can outperform standard PFNs with the same size of training samples in large datasets.
High performance is maintained for up to 50x of the pre-training size of PFNs.
arXiv  Detail & Related papers  (2025-03-03T07:31:40Z)
        - Variable Rate Neural Compression for Sparse Detector Data [9.331686712558144]
We propose a novel approach for TPC data compression via key-point identification facilitated by sparse convolution.
BCAE-VS achieves a $75\%$ improvement in reconstruction accuracy with a $10\%$ increase in compression ratio over the previous state-of-the-art model.
arXiv  Detail & Related papers  (2024-11-18T17:15:35Z)
        - Compressing high-resolution data through latent representation encoding for downscaling large-scale AI weather forecast model [10.634513279883913]
We propose a variational autoencoder framework tailored for compressing high-resolution datasets.
Our framework successfully reduced the storage size of 3 years of HRCLDAS data from 8.61 TB to just 204 GB, while preserving essential information.
arXiv  Detail & Related papers  (2024-10-10T05:38:03Z)
        - Computationally and Memory-Efficient Robust Predictive Analytics Using Big Data [0.0]
This study navigates through the challenges of data uncertainties, storage limitations, and predictive data-driven modeling using big data.
We utilize Robust Principal Component Analysis (RPCA) for effective noise reduction and outlier elimination, and Optimal Sensor Placement (OSP) for efficient data compression and storage.
arXiv  Detail & Related papers  (2024-03-27T22:39:08Z)
        - Enabling robust sensor network design with data processing and optimization making use of local beehive image and video files [0.0]
We offer a revolutionary paradigm that uses cutting-edge edge computing techniques to optimize data transmission and storage.
Our approach encompasses data compression for images and videos, coupled with a data aggregation technique for numerical data.
A key aspect of our approach is its ability to operate in resource-constrained environments.
arXiv  Detail & Related papers  (2024-02-26T15:27:47Z)
        - Correlating sparse sensing for large-scale traffic speed estimation: A Laplacian-enhanced low-rank tensor kriging approach [76.45949280328838]
We propose a Laplacian enhanced low-rank tensor (LETC) framework featuring both low-rankness and multi-temporal correlations for large-scale traffic speed kriging.
We then design an efficient solution algorithm via several effective numeric techniques to scale up the proposed model to network-wide kriging.
arXiv  Detail & Related papers  (2022-10-21T07:25:57Z)
        - Deep DNA Storage: Scalable and Robust DNA Storage via Coding Theory and Deep Learning [49.3231734733112]
We show a modular and holistic approach that combines Deep Neural Networks (DNN) trained on simulated data, Tensor-Product (TP) based Error-Correcting Codes (ECC), and a safety margin into a single coherent pipeline.
Our work improves upon the current leading solutions by up to a 3200x increase in speed and a 40% improvement in accuracy, and offers a code rate of 1.6 bits per base in a high-noise regime.
arXiv  Detail & Related papers  (2021-08-31T18:21:20Z)
        - Energy-Efficient Model Compression and Splitting for Collaborative Inference Over Time-Varying Channels [52.60092598312894]
We propose a technique to reduce the total energy bill at the edge device by utilizing model compression and time-varying model split between the edge and remote nodes.
Our proposed solution results in minimal energy consumption and CO$_2$ emissions compared to the considered baselines.
arXiv  Detail & Related papers  (2021-06-02T07:36:27Z)
        - Adaptive Latent Space Tuning for Non-Stationary Distributions [62.997667081978825]
We present a method for adaptive tuning of the low-dimensional latent space of deep encoder-decoder style CNNs.
We demonstrate our approach for predicting the properties of a time-varying charged particle beam in a particle accelerator.
arXiv  Detail & Related papers  (2021-05-08T03:50:45Z)
        - Towards an Interpretable Data-driven Trigger System for High-throughput Physics Facilities [7.939382824995354]
We introduce a new data-driven approach for designing high-throughput data filtering and trigger systems.
Our goal is to design a data-driven filtering system with a minimal run-time cost for determining which data event to keep.
We introduce key insights from interpretable predictive modeling and cost-sensitive learning in order to account for non-local inefficiencies in the current paradigm.
arXiv  Detail & Related papers  (2021-01-12T20:08:18Z)
        - Deep Cellular Recurrent Network for Efficient Analysis of Time-Series Data with Spatial Information [52.635997570873194]
This work proposes a novel deep cellular recurrent neural network (DCRNN) architecture to process complex multi-dimensional time series data with spatial information.
The proposed architecture achieves state-of-the-art performance while using substantially fewer trainable parameters than comparable methods in the literature.
arXiv  Detail & Related papers  (2021-01-12T20:08:18Z) 
        This list is automatically generated from the titles and abstracts of the papers in this site.
       
     
           This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.