Differentiable Earth Mover's Distance for Data Compression at the
High-Luminosity LHC
- URL: http://arxiv.org/abs/2306.04712v3
- Date: Fri, 29 Dec 2023 14:26:52 GMT
- Title: Differentiable Earth Mover's Distance for Data Compression at the
High-Luminosity LHC
- Authors: Rohan Shenoy and Javier Duarte and Christian Herwig and James
Hirschauer and Daniel Noonan and Maurizio Pierini and Nhan Tran and Cristina
Mantilla Suarez
- Abstract summary: We train a convolutional neural network (CNN) to learn a differentiable, fast approximation of the Earth mover's distance.
We apply this differentiable approximation in the training of an autoencoder-inspired neural network (encoder NN) for data compression at the high-luminosity LHC at CERN.
We demonstrate that the performance of our encoder NN trained using the differentiable EMD CNN surpasses that of training with loss functions based on mean squared error.
- Score: 1.8355959631840502
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Earth mover's distance (EMD) is a useful metric for image recognition and
classification, but its usual implementations are not differentiable or too
slow to be used as a loss function for training other algorithms via gradient
descent. In this paper, we train a convolutional neural network (CNN) to learn
a differentiable, fast approximation of the EMD and demonstrate that it can be
used as a substitute for computing-intensive EMD implementations. We apply this
differentiable approximation in the training of an autoencoder-inspired neural
network (encoder NN) for data compression at the high-luminosity LHC at CERN.
The goal of this encoder NN is to compress the data while preserving the
information related to the distribution of energy deposits in particle
detectors. We demonstrate that the performance of our encoder NN trained using
the differentiable EMD CNN surpasses that of training with loss functions based
on mean squared error.
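
As a rough illustration of the training setup described in the abstract, the sketch below freezes a stand-in EMD-approximating CNN and uses its output as the reconstruction loss for a toy autoencoder. The 8x8 input shape, layer sizes, and the EMDApprox class are illustrative assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class EMDApprox(nn.Module):
    """Stand-in for the trained EMD-approximating CNN (8x8 inputs assumed)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.Flatten(), nn.Linear(32 * 8 * 8, 1),
        )

    def forward(self, a, b):
        # stack the two images as channels and map them to a scalar EMD estimate
        return self.net(torch.cat([a, b], dim=1)).squeeze(-1)

emd = EMDApprox().eval()
for p in emd.parameters():              # freeze the loss network
    p.requires_grad_(False)

autoencoder = nn.Sequential(            # toy encoder/decoder pair
    nn.Flatten(), nn.Linear(64, 16),    # compress an 8x8 image to 16 values
    nn.ReLU(), nn.Linear(16, 64),
    nn.Unflatten(1, (1, 8, 8)),
)
opt = torch.optim.Adam(autoencoder.parameters(), lr=1e-3)

x = torch.rand(32, 1, 8, 8)             # fake energy-deposit images
loss = emd(autoencoder(x), x).mean()    # differentiable EMD surrogate as the loss
opt.zero_grad(); loss.backward(); opt.step()
```

Because the frozen CNN is differentiable end to end, gradients of the EMD estimate flow back into the encoder weights, which is what a true EMD implementation cannot provide.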
Related papers
- Dynamic Semantic Compression for CNN Inference in Multi-access Edge
Computing: A Graph Reinforcement Learning-based Autoencoder [82.8833476520429]
We propose a novel semantic compression method, an autoencoder-based CNN architecture (AECNN), for effective semantic extraction and compression in partial offloading.
In the semantic encoder, we introduce a feature compression module based on the channel attention mechanism in CNNs, to compress intermediate data by selecting the most informative features.
In the semantic decoder, we design a lightweight decoder to reconstruct the intermediate data through learning from the received compressed data to improve accuracy.
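
A minimal sketch of the channel-attention-based selection idea, assuming a squeeze-and-excitation-style gate; the ChannelGate name, layer sizes, and top-k selection are illustrative assumptions rather than the paper's exact module.

```python
import torch
import torch.nn as nn

class ChannelGate(nn.Module):
    """Score channels by importance and keep only the top-k before offloading."""
    def __init__(self, channels, k):
        super().__init__()
        self.score = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, channels), nn.Sigmoid(),
        )
        self.k = k

    def forward(self, feats):                 # feats: (B, C, H, W)
        w = self.score(feats)                 # per-channel importance in (0, 1)
        topk = w.topk(self.k, dim=1).indices  # indices of the k best channels
        idx = topk[:, :, None, None].expand(-1, -1, *feats.shape[2:])
        return feats.gather(1, idx)           # compressed (B, k, H, W) tensor

gate = ChannelGate(channels=32, k=8)          # keep 8 of 32 channels (assumed sizes)
compressed = gate(torch.rand(2, 32, 16, 16))  # -> (2, 8, 16, 16)
```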
arXiv Detail & Related papers (2024-01-19T15:19:47Z)
- Assessing Neural Network Representations During Training Using Noise-Resilient Diffusion Spectral Entropy [55.014926694758195]
Entropy and mutual information in neural networks provide rich information on the learning process.
We leverage data geometry to access the underlying manifold and reliably compute these information-theoretic measures.
We show that they form noise-resistant measures of intrinsic dimensionality and relationship strength in high-dimensional simulated data.
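
The sketch below illustrates one plausible reading of a diffusion spectral entropy: the Shannon entropy of the eigenvalue spectrum of a data-driven diffusion operator. The kernel bandwidth sigma, diffusion time t, and normalization are assumptions, not the paper's exact definition.

```python
import numpy as np

def diffusion_spectral_entropy(X, sigma=1.0, t=1):
    """Entropy of the spectrum of a diffusion matrix built from data X (n, d)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-d2 / (2 * sigma**2))          # Gaussian affinities
    P = K / K.sum(axis=1, keepdims=True)      # row-normalized diffusion (Markov) matrix
    lam = np.abs(np.linalg.eigvals(P)) ** t   # spectrum after t diffusion steps
    p = lam / lam.sum()
    p = p[p > 1e-12]                          # drop numerically-zero modes
    return float(-(p * np.log(p)).sum())      # spectral Shannon entropy
```

Because the spectrum of the diffusion operator reflects the manifold's intrinsic modes rather than raw coordinates, a measure of this kind degrades gracefully under ambient noise.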
arXiv Detail & Related papers (2023-12-04T01:32:42Z)
- Deep Spatiotemporal Clustering: A Temporal Clustering Approach for Multi-dimensional Climate Data [0.353122873734926]
We propose a novel algorithm for high-dimensional temporal representation of data using an unsupervised deep learning method.
Inspired by the U-net architecture, our algorithm utilizes an autoencoder integrating CNN-RNN layers to learn latent representations.
Our experiments show our approach outperforms both conventional and deep learning-based unsupervised clustering algorithms.
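
A hedged sketch of a CNN-plus-RNN autoencoder of this flavor: per-frame convolutional features feed an LSTM whose final hidden state serves as the latent clustering code. The frame size, layer widths, and reconstruction head are illustrative assumptions; the paper's U-net-style details are omitted.

```python
import torch
import torch.nn as nn

class SpatioTemporalAE(nn.Module):
    """CNN encoder per frame, LSTM over time, linear decoder for reconstruction."""
    def __init__(self, latent=32):
        super().__init__()
        self.cnn = nn.Sequential(nn.Conv2d(1, 8, 3, 2, 1), nn.ReLU(),
                                 nn.Conv2d(8, 16, 3, 2, 1), nn.ReLU(),
                                 nn.Flatten())
        self.rnn = nn.LSTM(16 * 8 * 8, latent, batch_first=True)
        self.dec = nn.Linear(latent, 32 * 32)    # reconstruct the last frame

    def forward(self, video):                    # video: (B, T, 1, 32, 32)
        B, T = video.shape[:2]
        f = self.cnn(video.reshape(B * T, 1, 32, 32)).reshape(B, T, -1)
        _, (h, _) = self.rnn(f)                  # h: (1, B, latent)
        z = h[-1]                                # per-sequence clustering code
        return self.dec(z).reshape(B, 1, 32, 32), z

model = SpatioTemporalAE()
recon, z = model(torch.rand(4, 10, 1, 32, 32))  # 4 sequences of 10 frames
```

After training, the codes z can be fed to any standard clustering algorithm (e.g., k-means) to group sequences.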
arXiv Detail & Related papers (2023-04-27T21:45:21Z)
- Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have been shown to be effective at solving forward and inverse differential equation problems.
However, PINNs can become trapped in training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ the implicit stochastic gradient descent (ISGD) method to train PINNs, improving the stability of the training process.
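
Implicit (stochastic) gradient descent evaluates the gradient at the updated parameters, which can be viewed as a proximal step. The sketch below approximates that step with a few inner gradient iterations; the inner solver, step sizes, and function names are assumptions, not the paper's exact scheme.

```python
import torch

def isgd_step(params, loss_fn, eta=0.1, inner_steps=5, inner_lr=0.05):
    """One implicit-SGD step as a proximal update:
        theta_{k+1} = argmin_theta  L(theta) + ||theta - theta_k||^2 / (2 * eta)
    solved approximately by a few inner gradient iterations."""
    anchor = [p.detach().clone() for p in params]   # theta_k, held fixed
    for _ in range(inner_steps):
        prox = sum(((p - a) ** 2).sum() for p, a in zip(params, anchor)) / (2 * eta)
        loss = loss_fn() + prox
        grads = torch.autograd.grad(loss, params)
        with torch.no_grad():
            for p, g in zip(params, grads):
                p -= inner_lr * g                   # inner descent on the proximal objective
    return params

# usage on a toy loss; in a PINN, loss_fn would evaluate the PDE residual
w = torch.randn(3, requires_grad=True)
isgd_step([w], lambda: ((w ** 2).sum() - 1.0) ** 2)
```

The proximal term damps large parameter jumps, which is the standard explanation for the improved stability of implicit over explicit updates on stiff losses.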
arXiv Detail & Related papers (2023-03-03T08:17:47Z)
- Attention-based Feature Compression for CNN Inference Offloading in Edge Computing [93.67044879636093]
This paper studies the computational offloading of CNN inference in device-edge co-inference systems.
We propose a novel autoencoder-based CNN architecture (AECNN) for effective feature extraction at end-device.
Experiments show that AECNN can compress the intermediate data by more than 256x with only about 4% accuracy loss.
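
One way to picture device-edge co-inference is to split a pretrained CNN at a cut layer and transmit only the (compressed) intermediate activations. The sketch below is a generic illustration of such a split; the cut point, backbone, and crude quantization are assumptions, not the paper's pipeline.

```python
import torch
import torch.nn as nn

backbone = nn.Sequential(                      # stand-in CNN to be split
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 10),
)
cut = 2                                        # hypothetical cut-layer index
device_part, edge_part = backbone[:cut], backbone[cut:]

x = torch.rand(1, 3, 64, 64)
z = device_part(x)                             # intermediate feature map on the device
z8 = (z.clamp(0, 4) / 4 * 255).round().byte()  # crude 8-bit quantization before transmission
z_restored = z8.float() / 255 * 4              # edge side de-quantizes
y = edge_part(z_restored)                      # edge server finishes the inference
```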
arXiv Detail & Related papers (2022-11-24T18:10:01Z)
- NAF: Neural Attenuation Fields for Sparse-View CBCT Reconstruction [79.13750275141139]
This paper proposes a novel and fast self-supervised solution for sparse-view CBCT reconstruction.
The desired attenuation coefficients are represented as a continuous function of 3D spatial coordinates, parameterized by a fully-connected deep neural network.
A learning-based encoder employing hash coding is adopted to help the network capture high-frequency details.
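
A minimal sketch of the hash-encoding idea, assuming a single-resolution hash grid feeding a small MLP; the hashing primes, table size, and resolution are illustrative, and the multi-resolution details of the paper's encoder are omitted.

```python
import torch
import torch.nn as nn

class HashMLP(nn.Module):
    """Coordinates -> hashed grid features -> MLP -> attenuation estimate."""
    def __init__(self, n_entries=2**16, feat_dim=8, resolution=128):
        super().__init__()
        self.table = nn.Parameter(torch.randn(n_entries, feat_dim) * 1e-2)
        self.res = resolution
        # large primes are a common choice for spatial hashing (assumption)
        self.register_buffer("primes", torch.tensor([1, 2654435761, 805459861]))
        self.mlp = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, xyz):                    # xyz in [0, 1]^3, shape (N, 3)
        idx = (xyz * self.res).long()          # voxel index per coordinate
        h = (idx * self.primes).sum(-1) % self.table.shape[0]
        return self.mlp(self.table[h])         # per-point attenuation value

naf = HashMLP()
mu = naf(torch.rand(1024, 3))                  # attenuation at 1024 sampled 3-D points
```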
arXiv Detail & Related papers (2022-09-29T04:06:00Z)
- Low-Energy Convolutional Neural Networks (CNNs) using Hadamard Method [0.0]
Convolutional neural networks (CNNs) are a potential approach for object recognition and detection.
A new approach based on the Hadamard transformation as an alternative to the convolution operation is demonstrated.
The method is helpful for other computer vision tasks when the kernel size is smaller than the input image size.
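
As a rough sketch of the transform-domain idea, the snippet below applies a 2-D Hadamard transform to an input patch and a zero-padded kernel and multiplies them element-wise. Note that an element-wise product in the Hadamard domain corresponds to dyadic (XOR) convolution rather than ordinary convolution, so this illustrates the mechanism only, not the paper's exact method.

```python
import numpy as np
from scipy.linalg import hadamard

def hadamard_mix(patch, kernel):
    """Element-wise product in the Hadamard domain (inputs n x n, n a power of two)."""
    n = patch.shape[0]
    H = hadamard(n) / np.sqrt(n)   # orthonormal (self-inverse) Hadamard matrix
    tp = H @ patch @ H.T           # 2-D Hadamard transform of the patch
    tk = H @ kernel @ H.T          # transform of the zero-padded kernel
    return H @ (tp * tk) @ H.T     # inverse transform of the product

patch = np.random.rand(8, 8)
kernel = np.zeros((8, 8))
kernel[:3, :3] = np.random.rand(3, 3)  # 3x3 kernel, zero-padded to the patch size
out = hadamard_mix(patch, kernel)
```

The Hadamard matrix contains only +-1 entries, so the transform needs no multiplications, which is the source of the energy savings.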
arXiv Detail & Related papers (2022-09-06T21:36:57Z)
- DNN Training Acceleration via Exploring GPGPU Friendly Sparsity [16.406482603838157]
We propose Approximate Random Dropout, which replaces the conventional random dropout of neurons and synapses with regular, online-generated row-based or tile-based dropout patterns.
We then develop an SGD-based search algorithm that produces the distribution of row-based or tile-based dropout patterns to compensate for the potential accuracy loss.
We also propose the sensitivity-aware dropout method to dynamically drop the input feature maps based on their sensitivity so as to achieve greater forward and backward training acceleration.
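
A minimal sketch of row-based structured dropout, assuming activations shaped (batch, rows, cols); dropping whole rows yields the regular, GPU-friendly sparsity pattern the summary refers to. The function name and rescaling are assumptions.

```python
import torch

def row_dropout(x, p=0.5):
    """Drop entire rows of a (batch, rows, cols) tensor with probability p."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    keep = (torch.rand(x.shape[0], x.shape[1], 1, device=x.device) > p).to(x.dtype)
    return x * keep / (1.0 - p)    # rescale so the expected activation is unchanged
```

Because whole rows are zeroed, the corresponding multiply-accumulate work can be skipped with regular memory accesses, unlike element-wise random dropout.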
arXiv Detail & Related papers (2022-03-11T01:32:03Z)
- Adaptive Latent Space Tuning for Non-Stationary Distributions [62.997667081978825]
We present a method for adaptive tuning of the low-dimensional latent space of deep encoder-decoder style CNNs.
We demonstrate our approach for predicting the properties of a time-varying charged particle beam in a particle accelerator.
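
A hedged sketch of the adaptive-tuning idea: keep a pretrained decoder frozen and optimize only the low-dimensional latent code against incoming data, so the model tracks a drifting distribution cheaply. The adapt_latent interface and optimizer settings are assumptions.

```python
import torch

def adapt_latent(decoder, z0, target, steps=50, lr=1e-2):
    """Optimize only the latent code of a frozen decoder to fit new data.

    Assumes the decoder's parameters are pretrained and frozen; only the
    low-dimensional code z is updated, which is cheap enough to run online."""
    z = z0.clone().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(decoder(z), target)
        loss.backward()
        opt.step()
    return z.detach()              # updated latent; decoder weights untouched
```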
arXiv Detail & Related papers (2021-05-08T03:50:45Z)