Deep Compression of Neural Networks for Fault Detection on Tennessee Eastman Chemical Processes
- URL: http://arxiv.org/abs/2101.06993v1
- Date: Mon, 18 Jan 2021 10:53:12 GMT
- Title: Deep Compression of Neural Networks for Fault Detection on Tennessee Eastman Chemical Processes
- Authors: Mingxuan Li, Yuanxun Shao
- Abstract summary: Three deep compression techniques are applied to reduce the computational burden.
The best result comes from applying all three techniques together, which reduces model size by 91.5% while retaining accuracy above 94%.
- Score: 2.297079626504224
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Artificial neural networks have achieved state-of-the-art performance in
fault detection on the Tennessee Eastman process, but they often require enormous
memory to store their massive parameters. To enable online real-time fault
detection, three deep compression techniques (pruning, clustering, and
quantization) are applied to reduce the computational burden. We extensively
study 7 different combinations of compression techniques; all methods achieve
model compression rates over 64% while maintaining high fault detection
accuracy. The best result comes from applying all three techniques, which
reduces model size by 91.5% while retaining accuracy above 94%. This leads to
a smaller storage requirement in production environments and makes real-world
deployment smoother.
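The pipeline combines three standard deep-compression building blocks. Below is a minimal NumPy sketch of how magnitude pruning, weight clustering, and uniform quantization compose on a single layer's weight matrix; the sparsity level, cluster count, and bit width are illustrative assumptions, not the paper's settings.
```python
# A minimal sketch of the three compression steps on one dense layer's
# weights. Sparsity, cluster count, and bit width are assumptions.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(256, 128)).astype(np.float32)

# 1) Magnitude pruning: zero out the smallest-|w| weights.
sparsity = 0.8
threshold = np.quantile(np.abs(W), sparsity)
W = np.where(np.abs(W) < threshold, 0.0, W)

# 2) Weight clustering: replace surviving weights with K shared
#    centroids (1-D k-means), so only small indices need storing.
K = 16
nz = W[W != 0]
centroids = np.quantile(nz, np.linspace(0.0, 1.0, K))  # initialization
for _ in range(10):  # a few Lloyd iterations
    assign = np.argmin(np.abs(nz[:, None] - centroids[None, :]), axis=1)
    for k in range(K):
        if np.any(assign == k):
            centroids[k] = nz[assign == k].mean()
idx = np.argmin(np.abs(W[..., None] - centroids), axis=-1)
W = np.where(W == 0, 0.0, centroids[idx]).astype(np.float32)

# 3) Uniform 8-bit quantization: store int8 codes plus one scale.
scale = np.abs(W).max() / 127.0
q = np.round(W / scale).astype(np.int8)   # stored form
W_hat = q.astype(np.float32) * scale      # used at inference

print(f"zeros: {(W_hat == 0).mean():.1%}, distinct stored values: {np.unique(q).size}")
```
In this form the layer is stored as mostly-zero int8 codes drawn from a small codebook plus a single scale, the kind of representation that yields size reductions in the range reported above.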
Related papers
- Theoretical Guarantees for Low-Rank Compression of Deep Neural Networks [5.582683296425384]
Deep neural networks have achieved state-of-the-art performance across numerous applications.
Low-rank approximation techniques offer a promising solution by reducing the size and complexity of these networks.
We develop an analytical framework for data-driven post-training low-rank compression.
arXiv Detail & Related papers (2025-02-04T23:10:13Z)
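For intuition, low-rank compression in its simplest post-training form replaces a trained weight matrix with a truncated-SVD factorization. The sketch below, with an assumed layer shape and target rank, shows only the mechanics, not the paper's analytical framework; random weights are used for illustration, and trained layers typically have fast-decaying singular values, so the error at a given rank is far lower in practice.
```python
# A minimal sketch of post-training low-rank compression: swap a dense
# weight matrix W for a rank-r factorization from a truncated SVD.
# The layer shape and target rank are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(512, 256)).astype(np.float32)

r = 32                                # target rank
U, s, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :r] * s[:r]                  # (512, r): absorb singular values
B = Vt[:r]                            # (r, 256)

# At inference the layer computes (x @ A) @ B instead of x @ W.
params_before, params_after = W.size, A.size + B.size
rel_err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
print(f"params {params_before} -> {params_after} "
      f"({params_after / params_before:.1%}), relative error {rel_err:.3f}")
```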
- Edge AI: Evaluation of Model Compression Techniques for Convolutional Neural Networks [0.0]
This work evaluates compression techniques on ConvNeXt models in image classification tasks using the CIFAR-10 dataset.
Results show significant reductions in model size, with up to 75% reduction achieved using structured pruning techniques.
Dynamic quantization achieves a reduction of up to 95% in the number of parameters.
arXiv Detail & Related papers (2024-09-02T11:48:19Z)
- Towards efficient deep autoencoders for multivariate time series anomaly detection [0.8681331155356999]
We propose a novel compression method for deep autoencoders that involves three key factors.
First, pruning reduces the number of weights, while preventing catastrophic drops in accuracy by means of a fast search process.
Second, linear and non-linear quantization reduces model complexity by reducing the number of bits for every single weight.
arXiv Detail & Related papers (2024-03-04T19:22:09Z)
- CompactifAI: Extreme Compression of Large Language Models using Quantum-Inspired Tensor Networks [1.5199992713356987]
This paper introduces CompactifAI, an innovative compression approach using quantum-inspired tensor networks.
Our method is versatile and can be implemented with - or on top of - other compression techniques.
As a benchmark, we demonstrate that combining CompactifAI with quantization reduces the memory size of LlaMA 7B by 93%.
arXiv Detail & Related papers (2024-01-25T11:45:21Z)
- Advancing The Rate-Distortion-Computation Frontier For Neural Image Compression [6.167676495563641]
A rate-distortion-computation study shows that neither floating-point operations (FLOPs) nor runtime alone is sufficient to accurately rank neural compression methods.
We identify a novel neural compression architecture that yields state-of-the-art RD performance with rate savings of 23.1% over BPG.
arXiv Detail & Related papers (2023-09-26T19:47:31Z)
- Learning Accurate Performance Predictors for Ultrafast Automated Model Compression [86.22294249097203]
We propose an ultrafast automated model compression framework called SeerNet for flexible network deployment.
Our method achieves competitive accuracy-complexity trade-offs with significant reduction of the search cost.
arXiv Detail & Related papers (2023-04-13T10:52:49Z)
- Pushing the Limits of Asynchronous Graph-based Object Detection with Event Cameras [62.70541164894224]
We introduce several architecture choices which allow us to scale the depth and complexity of such models while maintaining low computation.
Our method runs 3.7 times faster than a dense graph neural network, taking only 8.4 ms per forward pass.
arXiv Detail & Related papers (2022-11-22T15:14:20Z)
- Towards Compact CNNs via Collaborative Compression [166.86915086497433]
We propose a Collaborative Compression scheme, which jointly applies channel pruning and tensor decomposition to compress CNN models.
We achieve 52.9% FLOPs reduction by removing 48.4% parameters on ResNet-50 with only a Top-1 accuracy drop of 0.56% on ImageNet 2012.
arXiv Detail & Related papers (2021-05-24T12:07:38Z)
- ScaleCom: Scalable Sparsified Gradient Compression for Communication-Efficient Distributed Training [74.43625662170284]
Large-scale distributed training of Deep Neural Networks (DNNs) on state-of-the-art platforms is expected to be severely communication constrained.
We propose a new compression technique that leverages similarity in the gradient distribution amongst learners to provide significantly improved scalability.
We experimentally demonstrate that ScaleCom has small overheads, directly reduces gradient traffic and provides high compression rates (65-400X) and excellent scalability (up to 64 learners and 8-12X larger batch sizes over standard training) without significant accuracy loss.
arXiv Detail & Related papers (2021-04-21T02:22:10Z)
- An Efficient Statistical-based Gradient Compression Technique for Distributed Training Systems [77.88178159830905]
Sparsity-Inducing Distribution-based Compression (SIDCo) is a threshold-based sparsification scheme that enjoys similar threshold estimation quality to deep gradient compression (DGC).
Our evaluation shows SIDCo speeds up training by up to 41.7%, 7.6%, and 1.9% compared to the no-compression baseline, Top-k, and DGC compressors, respectively.
arXiv Detail & Related papers (2021-01-26T13:06:00Z)
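The two gradient-compression entries above share one primitive: transmit only the largest-magnitude gradient entries. The sketch below illustrates the threshold-based variant, where the threshold comes from a fitted magnitude distribution rather than an exact top-k sort (the rough idea behind SIDCo); the exponential model is an illustrative assumption, not SIDCo's actual estimator.
```python
# A minimal sketch of threshold-based gradient sparsification. The
# exponential model for |g| is an illustrative assumption.
import numpy as np

def sparsify(grad: np.ndarray, keep_ratio: float = 0.01):
    """Keep roughly the largest-magnitude entries without a full sort."""
    mags = np.abs(grad)
    # If |g| ~ Exponential(scale), then P(|g| > t) = exp(-t / scale);
    # solve for the threshold t that keeps about keep_ratio of entries.
    scale = mags.mean()
    t = -scale * np.log(keep_ratio)
    mask = mags > t
    return grad[mask], np.flatnonzero(mask)  # values + indices to transmit

rng = np.random.default_rng(2)
g = rng.laplace(scale=1e-3, size=1_000_000).astype(np.float32)
values, indices = sparsify(g, keep_ratio=0.01)
print(f"transmitted {values.size / g.size:.2%} of {g.size:,} gradient entries")
```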
- ALF: Autoencoder-based Low-rank Filter-sharing for Efficient Convolutional Neural Networks [63.91384986073851]
We propose the autoencoder-based low-rank filter-sharing technique (ALF).
ALF shows a reduction of 70% in network parameters, 61% in operations, and 41% in execution time, with minimal loss in accuracy.
arXiv Detail & Related papers (2020-07-27T09:01:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.