Artificial neural networks have achieved state-of-the-art performance in fault detection on the Tennessee Eastman process, but they often require enormous memory to store their massive parameters. In order to implement online real-time fault detection, three deep compression techniques (pruning, clustering, and quantization) are applied to reduce the computational burden. We have extensively studied 7 different combinations of compression techniques; all methods achieve high model compression rates over 64% while maintaining high fault detection accuracy. The best result is obtained by applying all three techniques, which reduces the model sizes by 91.5% while retaining a high accuracy over 94%. This result leads to a smaller storage requirement in production environments and makes real-world deployment smoother.
Index Terms—Deep Compression, Neural Networks, Fault Detection
I. INTRODUCTION

Artificial Neural Network (ANN) is a powerful technique that can classify input data into predefined categories.
ANN has been successfully deployed in an enormous number of real-life applications with superior results, including computer vision [1], natural language processing [2], and autonomous driving [3].
For a classification problem, input data is processed through a series of weight matrices, bias vectors, and nonlinear activation functions.
Followed by a classifier, the ANN eventually calculates a likelihood score for each category.
During training, the ANN maximizes the likelihood of the true category (or minimizes the negative of this likelihood) for training data sets via variants of stochastic gradient descent methods.
For inference, we run the forward pass and classify testing data into the category with the highest probability score.
Even though ANN has shown great success in many fields, the enormous model size caused by a large number of weights can sometimes result in an overwhelming computational burden (e.g., requiring huge memory and being very power demanding).
In certain application scenarios with memory or energy consumption limitations, such as mobile platforms or browser-based systems, a smaller ANN with high accuracy is not only necessary but also in high demand.
As the complexity of production increases dramatically due to more advanced technologies, deep learning has become an increasingly popular candidate for fault detection applications due to its potential in handling complex systems with a large number of measurements and many failure types.
To achieve the best accuracy, instead of using a single ANN to classify the fault type, the work in [5] uses a different ANN for each fault type and achieves 97.73% average accuracy.
However, this method requires one deep learning model for each fault type, which in total demands large storage space and computational effort.
Thus, ANN pruning is an ideal candidate for reducing the computational burden, speeding up online inference, and maintaining the classification accuracy.
In this paper, deep compression of artificial neural networks (ANN) is used for fault detection on the Tennessee Eastman chemical process to enable faster online inference with high accuracy.
II. ARTIFICIAL NEURAL NETWORK

An ANN usually has four different types of layers: an input layer, hidden layers, a softmax layer, and an output layer.
The input layer is a weight matrix designed to pass the input data into the ANN through a linear mapping.
In the hidden layers, the input information is transformed through a matrix multiplication and an activation function:
$h_1 = \sigma(W_1 x + b_1), \quad h_i = \sigma(W_i h_{i-1} + b_i), \quad i \in \{2, 3, \ldots, k\},$
where $x \in \mathbb{R}^{n_x}$ is the vector of input data, $h_i \in \mathbb{R}^{n_{h_i}}$ is the $i$-th hidden layer representation, $W_i \in \mathbb{R}^{n_{h_i} \times n_{h_{i-1}}}$ and $b_i \in \mathbb{R}^{n_{h_i}}$ are the weights and bias connecting the $i$-th and $(i-1)$-th hidden representations, $k$ is the number of hidden layers, and $\sigma$ is a nonlinear activation function that introduces nonlinearity into the neural network.
Specifically, the ReLU activation function, $\sigma(x) = \max(0, x)$, is used in this paper.
The last fully-connected hidden layer does not apply the activation function.
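To make the forward pass above concrete, a minimal NumPy sketch is given below; the hidden-layer widths and random weights are illustrative placeholders rather than the architecture used later in the paper.

```python
import numpy as np

def relu(x):
    # ReLU activation: sigma(x) = max(0, x).
    return np.maximum(0.0, x)

def forward(x, weights, biases):
    # Forward pass through k fully-connected layers; the last layer
    # omits the activation, as described above.
    h = x
    for i, (W, b) in enumerate(zip(weights, biases)):
        z = W @ h + b
        h = z if i == len(weights) - 1 else relu(z)
    return h

# Illustrative sizes: 52 inputs, two hidden layers, 2 output scores.
rng = np.random.default_rng(0)
dims = [52, 64, 32, 2]
weights = [0.1 * rng.standard_normal((dims[i + 1], dims[i])) for i in range(len(dims) - 1)]
biases = [np.zeros(dims[i + 1]) for i in range(len(dims) - 1)]
scores = forward(rng.standard_normal(52), weights, biases)
```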
Then, a softmax layer calculates the score/probability of each class via $\frac{e^{f_{y_i}}}{\sum_j e^{f_j}}$, where $f_{y_i}$ is the score of the correct category and $f_j$ is the score of the $j$-th category.
Finally, the loss is calculated based on the true label and regularization.
The objective of the network is to minimize the loss and maximize the accuracy on both training and testing datasets.
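A minimal sketch of the softmax score and the resulting loss is shown below; the L2 form of the regularization term is an assumption, since the exact regularizer is not specified in this excerpt.

```python
import numpy as np

def softmax(f):
    # Numerically stable softmax over the class scores f.
    e = np.exp(f - np.max(f))
    return e / e.sum()

def loss_fn(f, y_true, weight_matrices, reg=1e-4):
    # Negative log-likelihood of the true category plus an (assumed) L2 penalty.
    data_loss = -np.log(softmax(f)[y_true])
    reg_loss = reg * sum(np.sum(W * W) for W in weight_matrices)
    return data_loss + reg_loss

# Illustrative scores for one sample whose true label is 0 ("no fault").
loss = loss_fn(np.array([2.0, -1.0]), y_true=0, weight_matrices=[np.ones((2, 2))])
```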
Fig. 1. ANN Structure

III. DEEP COMPRESSION

Neural networks usually require enormous memory to store their massive parameters. In this section, three deep compression techniques are discussed to reduce the model size and keep high accuracy.
A. Pruning

Network pruning utilizes the connectivity between neurons: small weights under a predefined threshold are removed to reduce the model size.
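A minimal sketch of this magnitude-based pruning step is given below; the threshold value is illustrative, since the paper does not state the threshold used.

```python
import numpy as np

def prune_weights(W, threshold=0.05):
    # Remove (zero out) connections whose magnitude falls below the threshold;
    # the mask can be reused to keep pruned weights at zero during retraining.
    mask = np.abs(W) >= threshold
    return W * mask, mask

W = 0.1 * np.random.default_rng(1).standard_normal((32, 52))
W_pruned, mask = prune_weights(W)
sparsity = 1.0 - mask.mean()  # fraction of weights removed
```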
B. Clustering

Clustering is a technique that groups weights with close values.
First, weights are divided into a fixed number of groups.
Second, values in the same group are reset to the same value, and thus all weights in the same group can reference the same address, reducing the model size.
The gradient of each weight is calculated through back-propagation, and the gradient of each group is the sum over all elements in that group.
Finally, weights are updated through the cluster groups.
Clustering essentially minimizes the number of references needed for a model, and thus reduces its storage requirement.
Fig. 3. An illustration of the gradient update for clustered weights: the top left figure is the original weight matrix colored according to clustered groups; the top middle figure is the corresponding group indices; the bottom left figure is the gradient of the weight matrix; the bottom middle figure is the grouped gradients according to the centroids; the bottom right figure shows the sum of gradients of each group; and the top right figure represents the final gradient update of the clusters.
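A minimal sketch of weight clustering and the grouped gradient update of Fig. 3 is given below; k-means and the number of clusters are assumptions, since the excerpt does not specify the grouping algorithm.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_weights(W, n_clusters=16):
    # Group weights with close values; each weight is replaced by its group
    # centroid, so only the centroids and per-weight group indices are stored.
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(W.reshape(-1, 1))
    centroids = km.cluster_centers_.flatten()
    indices = km.labels_.reshape(W.shape)
    return centroids[indices], indices, centroids

def update_centroids(centroids, indices, grad_W, lr=0.01):
    # The gradient of each group is the sum of the gradients of its members;
    # the shared centroid is then updated with that summed gradient.
    grad_c = np.array([grad_W[indices == g].sum() for g in range(len(centroids))])
    return centroids - lr * grad_c
```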
C. Quantization

In the quantization process, the precision used to store weights is degraded in exchange for lower space consumption.
In Figure 4, the weights in the left matrix are stored with one decimal place, while the right matrix reduces the storage requirement by rounding off the decimal value.
Fig. 4. The left figure is the weight matrix before quantization; the right figure is the weight matrix after quantization.
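A minimal sketch of the rounding step illustrated in Figure 4 is given below; rounding to whole numbers is used here for illustration, whereas practical deployments would typically map weights to low-bit integers.

```python
import numpy as np

def quantize_weights(W, decimals=0):
    # Store weights at reduced precision by rounding off decimal places.
    return np.round(W, decimals=decimals)

W = np.array([[0.3, -1.2], [2.7, 0.049]])
W_q = quantize_weights(W)  # [[0., -1.], [3., 0.]]
```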
IV. FAULT DETECTION FOR THE TENNESSEE EASTMAN CHEMICAL PROCESS

The Tennessee Eastman process is a benchmark for fault detection [5]. Based on a baseline ANN structure, various combinations of deep compression techniques are applied to reduce the model size and achieve a high accuracy.

A. Tennessee Eastman Process Description
As summarized in Table I, the Tennessee Eastman process provides 52 measurements and 21 different fault types (including "no fault" as faultNumber 0) in its dataset.
Moreover, Chiang [6] and Zhang [7] pointed out that due to the absence of observable changes in the mean, variance, and higher-order variances in the data, we do not have enough information to detect Faults 3, 9, and 15.
Therefore, this paper follows the common practice of excluding the three faults above from consideration, resulting in 18 classification types (including "no fault" as faultNumber 0).
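A minimal sketch of this filtering step is given below, assuming the dataset has been exported to a CSV file with a faultNumber column; the file name is hypothetical.

```python
import pandas as pd

# Hypothetical CSV export of the Tennessee Eastman dataset.
df = pd.read_csv("te_process.csv")

# Drop the undetectable Faults 3, 9, and 15, keeping 18 classes
# (including "no fault" as faultNumber 0).
df = df[~df["faultNumber"].isin({3, 9, 15})]
```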
B. Baseline ANN Models

The input layer dimension is 52, which corresponds to the 52 measurements.
The output layer has a dimension of 2, which corresponds to the 2 possible outcomes (fault or no fault).
The resulting 18 models serve as the baseline for comparison with the compressed models.
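A minimal Keras sketch of one such per-fault binary detector is given below; the hidden-layer widths, optimizer, and loss are assumptions, since this excerpt only fixes the input dimension (52) and the output dimension (2).

```python
import tensorflow as tf

# Hypothetical baseline detector: 52 measurements in, 2 classes out (fault / no fault).
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(52,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# One such model would be trained per fault type, giving the 18 baseline models.
```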
C. Deep Compression Results

In this section, all 7 different combinations of pruning, clustering, and quantization are applied to the 18 ANNs.
First, each of the three compression techniques is applied individually to examine its performance.
Then, each pair of compression techniques is applied to the original 18 ANNs.
Finally, all three techniques are applied to the original ANNs to compare the compression rate and accuracy change.
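As a rough sketch of how the three steps compose on a single weight matrix, the snippet below chains pruning, clustering, and quantization; the threshold, cluster count, and precision are illustrative, not the settings used in the experiments.

```python
import numpy as np
from sklearn.cluster import KMeans

def compress(W, threshold=0.05, n_clusters=16, decimals=1):
    # 1) Prune: drop small-magnitude weights.
    mask = np.abs(W) >= threshold
    W = W * mask
    # 2) Cluster: replace the remaining weights by shared centroids.
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(W.reshape(-1, 1))
    W = km.cluster_centers_.flatten()[km.labels_].reshape(W.shape)
    # 3) Quantize: store the shared values at reduced precision, re-applying
    #    the mask so pruned entries stay exactly zero.
    return np.round(W, decimals=decimals) * mask

W_compressed = compress(0.1 * np.random.default_rng(2).standard_normal((32, 52)))
```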
The compression results of the ANNs obtained by applying each of the three techniques individually are shown in Tables II, III, and IV.
As shown in Table II, when only pruning is applied, the average compressed rate is 64.0%, and the average accuracy change is -0.5% for all 18 different types of fault labels.
Next, a comprehensive study of applying two compression techniques is conducted.
Table V shows that when utilizing both clustering and pruning, the average compressed rate increases to 87.3%, and the average accuracy change is only -1.9%.
Moreover, the compression result is consistent among all fault types.
The variances of the compressed rate and the accuracy change are 0.16 and 1.72, respectively, which shows that there are few fluctuations in the compression rate and accuracy change; thus, the average compression rate and average accuracy change are reliably representative.
Table VI shows the compression result with both pruning and quantization.
The average compression rate is 88.1%, which is about the same as the result in Table V. However, the average accuracy change drops to -5.3%, which is the largest drop among methods that apply two compression techniques.
Table VII presents the compression result with both clustering and quantization.
The average accuracy change is -1.8%, which is close to the average accuracy change in Table V. The compression rate is 82.2%, which is 5.1% smaller than the result in Table V. In general, applying two compression techniques achieves better results than applying only a single technique.
TABLE VII
COMPRESSION RESULTS WITH CLUSTERING AND QUANTIZATION

Finally, we examine the performance of applying all three compression techniques.
Table VIII shows that the average compressed rate increases to 91.5% while the average accuracy change remains relatively small at -1.8%.
The variances of the compressed rates and accuracy changes are only 0.14 and 1.83, respectively, which shows the consistency of the performance among all fault types.
Compared to the results of applying two techniques in Tables V, VI, and VII, this method achieves the best compressed rate with no extra loss in accuracy, which indicates that applying all three techniques is better than only applying two techniques for fault diagnosis in the TE process.
Compared to the results of applying only one compression technique in Tables II, III, and IV, although the average accuracy change decreases slightly, applying all three techniques still achieves accuracies higher than 94% for all fault types.
At the same time, the compressed rate of 91.5% is significantly higher than 76%, the best result when applying only one technique.
The results of all 7 different combinations of compression techniques are summarized in Figure 5.
It is clear that applying all three techniques achieves the highest compressed rate while maintaining a high average accuracy above 94%.
TABLE VIII
COMPRESSION STATISTICS WITH PRUNING, CLUSTERING, AND QUANTIZATION

Fig. 5. The left plot shows compressed rates of all fault types over 7 different combinations of compression techniques, where P = pruning, C = clustering, and Q = quantization. The right plot shows the average accuracy over all fault types with the 7 different combinations of compression techniques. Results of the different compression methods are colored consistently in the two plots.
V. CONCLUSION

This paper studies deep compression techniques for fault diagnosis on the Tennessee Eastman process. In response to the demand for fast online detection in industrial processes, three compression techniques (pruning, clustering, and quantization) are applied to reduce model size and computational complexity. We have examined a comprehensive list of 7 different combinations of compression techniques. All methods achieve high model compression rates over 64% while maintaining high fault detection accuracy. The best candidate for fault detection on the Tennessee Eastman chemical process is applying all three techniques, which reduces the model sizes by 91.5% while retaining a high accuracy over 94%. This result leads to a smaller storage requirement in production environments and makes real-world deployment smoother.
REFERENCES

[1] Dan Ciresan, Ueli Meier, and Jürgen Schmidhuber. Multi-column deep neural networks for image classification. In Proceedings of the 25th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2012), pages 3642–3649, 2012.
[2] Richard Socher, Alex Perelygin, Jean Wu, Jason Chuang, Christopher D. Manning, Andrew Ng, and Christopher Potts. Recursive deep models for semantic compositionality over a sentiment treebank. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pages 1631–1642, Seattle, Washington, USA, October 2013. Association for Computational Linguistics.
[3] C. Chen, A. Seff, A. Kornhauser, and J. Xiao. DeepDriving: Learning affordance for direct perception in autonomous driving. In 2015 IEEE International Conference on Computer Vision (ICCV), pages 2722–2730, 2015.
[4] J. J. Downs and E. F. Vogel. A plant-wide industrial process control problem. Computers & Chemical Engineering, 17(3):245–255, 1993. Industrial challenge problems in process control.
[5] Seongmin Heo and Jay H. Lee. Fault detection and classification using artificial neural networks. IFAC-PapersOnLine, 51(18):470–475, 2018. 10th IFAC Symposium on Advanced Control of Chemical Processes ADCHEM 2018.
[6] Leo Chiang, E. Russell, and Richard Braatz. Fault detection and diagnosis in industrial systems. Measurement Science and Technology, 12, October 2001.
[7] Yingwei Zhang. Enhanced statistical analysis of nonlinear processes using KPCA, KICA and SVM.