Crypto Pump and Dump Detection via Deep Learning Techniques
Viswanath Chadalapaka (vchad@usc.edu)
Kyle Chang (krchang@usc.edu)
Gireesh Mahajan (gmahajan@usc.edu)
Anuj Vasil (avasil@usc.edu)

arXiv:2205.04646v1 [cs.LG] 10 May 2022
Abstract

Despite the fact that cryptocurrencies themselves have experienced an astonishing rate of adoption over the last decade, cryptocurrency fraud detection is a heavily under-researched problem area. Of all fraudulent activity regarding cryptocurrencies, pump and dump schemes are some of the most common. Though some studies have been done on these kinds of scams in the stock market, the lack of labelled stock data and the volatility unique to the cryptocurrency space constrain the applicability of stock-market studies to this problem domain. Furthermore, the only work done in this space thus far has been either statistical in nature or concerned with classical machine learning models such as random forests. We propose the novel application of two existing neural network architectures to this problem domain and show that deep learning solutions can significantly outperform all other existing pump and dump detection methods for cryptocurrencies.
1 Introduction

In 2021 alone, cryptocurrencies (or “crypto”) experienced over $14 trillion worth of trading volume, representing nearly a 700% increase over the previous year.
As crypto and the use of these exchanges enter the mainstream, the conversation surrounding their regulatory hurdles has intensified (Cumming et al., 2019).
Of those hurdles, the detection of fraudulent activities at scale is one of the most pressing, given the rapid growth of the space.
Despite these concerns, the amount of regulation in crypto pales in comparison to more mainstream flows of money, such as stocks.
This allows fraud to plague the crypto space at a level unheard of in the stock market.
Among all types of fraud in the crypto space, pump and dump (P&D) schemes are some of the most popular, and some of the easiest to execute (Twomey and Mann, 2020), generally resulting from the concerted effort of just a single online P&D planning group (Hamrick et al., 2021).
Despite that, no studies have been done to date on the application of deep learning to P&D detection in crypto.
Though some studies involving the application of deep learning have been done in the context of traditional securities such as stocks (Leangarun et al., 2018), these studies not only lacked the amount of data now freely available for crypto (via the blockchain), but also have no guarantee of applicability to the much more volatile world of crypto.
This paper presents two novel applications of existing deep learning methods to detect P&D schemes in crypto – specifically, for small, volatile cryptocurrencies also known as altcoins.
Our work focuses on taking advantage of the bulk of freely available data by using deep learning to drive performance gains, since previous works in this space have thus far only applied either classical machine learning techniques such as random forests (La Morgia et al., 2020), or more basic techniques such as statistical analyses (Kamps and Kleinberg, 2018), to the problem.
All of our work is reproducible, and a link to our code can be found in the footnote below.
2 Related Work

2.1 Application of Classical Machine Learning and Statistical Models to Crypto P&D Detection
To date, only classical machine learning models such as random forests (La Morgia et al., 2020) and statistical models (Kamps and Kleinberg, 2018) have been applied in this space.
1 https://github.com/Derposoft/crypto_pump_and_dump_with_deep_learning
Figure 1: An example of an organized P&D scheme on the REAP token.
Red stars indicate visual signs of the accumulation phase in the form of subtle buying pressure, while the blue star indicates the pump and dump phases of the scheme.
This data was manually obtained directly from the KuCoin exchange (https://www.kucoin.com/trade/REAP-USDT), although the same information can be found on other exchanges as well.
Publicly-available videos on YouTube can confirm that this fluctuation in the REAP token’s price was a result of a planned P&D operation (Coffeezilla, 2021).
As these are the only works that have been done in this niche, we hope that our own work in this paper will help build a better foundation in this domain going forward.
2.2 Anomaly Detection

Anomaly detection in a general setting unrelated to crypto is a well-researched field.
More specifically, studies involving the application of deep learning to time-series anomaly detection problems have been done involving multiple network architectures: LSTMs (Malhotra et al., 2016), convolutional networks (Kwon et al., 2018), and various combinations of the two (Kim and Cho, 2018); and more recently, multiple variations of attention-based methods such as RNN attention (Brown et al., 2018) or the Anomaly Transformer (Xu et al., 2021) have also been explored.
As of writing this paper, deep learning architectures are at the forefront of time-series anomaly detection due to their strength in making predictions using spatiotemporal relationships, which are key to strong anomaly detection models (Kim and Cho, 2018).
In this work, we implement, modify, and tune some of these latest architectures in an attempt to adapt them to the crypto domain.
3 Pump and Dump Schemes

This opens the opportunity for various P&D groups to openly plan P&D events over online messaging platforms such as Telegram or Discord (La Morgia et al., 2020).
Typically, a P&D scheme consists of three phases: the accumulation phase, the promotion (or “pump”) phase, and the distribution (or “dump”) phase.
During the accumulation phase, the group organizing the scheme slowly accumulates a significant position in the asset of interest.
Once ready, the promotion phase begins: excitement is drummed up via social media promotion for the asset, and bullish sentiment is falsified through fraudulent reports. This causes retail investors to rally behind and invest in the asset, thereby driving the price up.
Finally, during the distribution phase, the perpetrator of the scheme liquidates their position in the asset over a very short period of time.
Since the position being liquidated generally represents a significant portion of the asset itself, this inevitably causes a crash in the price of the original asset – leaving most of the retail investors who bought during the promotion phase at a significant loss on their positions over an extremely short time.
The graph in Fig. 1 is an example of a real P&D scheme on the REAP token which occurred near the start of April 2021.
In this case, the entirety of the pump and dump phases lasted only a few minutes, and the accumulating party dumped all of their coins as soon as the value of the token shot up.
Investors who bought during the peak of the pump would have been down nearly 90% in the span of just a few minutes.
Successful P&D detection algorithms have the potential to alert investors of P&D schemes before they occur, and could even enable regulatory bodies to police this fraudulent activity in a more cost- and time-effective manner.
Ultimately, this could help to establish crypto as a much more legitimate trading option, as well as save investors and exchanges millions from potential scams and fraudulent activity.
4 Method

We propose the novel application of two architectures that have scored well on standard anomaly detection datasets to the now-burgeoning financial data available in the cryptocurrency space: the C-LSTM model (Kim and Cho, 2018) and the Anomaly Transformer model (Xu et al., 2021).
Since P&D schemes often have multiple phases that generally occur over vastly different lengths of time (e.g., the accumulation phase may last for up to a month, whereas the pump or dump phases can last for as little as a minute), both of the models that we have chosen have the ability to capture both longer-term anomalies, otherwise known as “trend” anomalies, and much shorter-term anomalies, otherwise known as “point” anomalies.
This capability is important for the detection of P&D schemes, since models that are only capable of one or the other could potentially be fooled by the volatility inherent in crypto markets.
4.1 Models

4.1.1 C-LSTM

Our first model is the C-LSTM, originally introduced by (Kim and Cho, 2018) for anomaly detection by treating data as spatiotemporal in nature.
The model consists of a series of convolutional/ReLU/pooling layers to encode the input sequence, followed by a set of LSTM layers, with decoding done via a set of feedforward layers.
In this model, convolutional layers help to capture spatial information within the dataset, thereby helping to identify point anomalies; on the other hand, the LSTM layers help to capture temporal information, and help to identify trend anomalies.
This simple model has shown success in the detection of varying types of web traffic anomalies (Kim and Cho, 2018), as well as anomalies across a number of stocks in the Chinese stock market (Yang et al., 2020).
The following image is a visual representation of the C-LSTM model.
Our Implementation

Our own model uses the following architecture: 1 set of convolutional/ReLU/pooling layers, with a convolution kernel size of 3 and a stride of 1, and a pooling kernel size of 2 with a stride of 1; 1 LSTM layer with an embedding dimension of 350; 1 feedforward layer which directly projects the last hidden state of the LSTM to a dimension of 1; and a sigmoid layer which constrains the output of our classifier between 0 and 1.
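The description above maps to a compact PyTorch module. The sketch below is ours, not the released implementation: the kernel sizes, strides, and 350-dimensional LSTM come from the text, while the convolutional channel count and the 15-feature input dimension are assumptions for illustration.

```python
import torch
import torch.nn as nn

class CLSTM(nn.Module):
    """Sketch of the C-LSTM classifier described above (kernel sizes and
    hidden dim from the text; conv_channels and n_features are assumed)."""

    def __init__(self, n_features=15, conv_channels=32, hidden_dim=350):
        super().__init__()
        # Convolution + pooling capture spatial (point-anomaly) structure.
        self.encoder = nn.Sequential(
            nn.Conv1d(n_features, conv_channels, kernel_size=3, stride=1),
            nn.ReLU(),
            nn.MaxPool1d(kernel_size=2, stride=1),
        )
        # The LSTM captures temporal (trend-anomaly) structure.
        self.lstm = nn.LSTM(conv_channels, hidden_dim, batch_first=True)
        # Project the last hidden state to a single logit, then squash it.
        self.head = nn.Linear(hidden_dim, 1)

    def forward(self, x):
        # x: (batch, segment_length, n_features)
        z = self.encoder(x.transpose(1, 2)).transpose(1, 2)
        _, (h_n, _) = self.lstm(z)
        return torch.sigmoid(self.head(h_n[-1])).squeeze(-1)

# Example: score a batch of 4 segments, each 15 chunks of 15 features.
probs = CLSTM()(torch.randn(4, 15, 15))  # 4 pump probabilities in [0, 1]
```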
4.1.2 Anomaly Transformer

The second model we explored was the Anomaly Transformer, introduced by (Xu et al., 2021).
Unlike a standard transformer, the Anomaly Transformer uses a custom anomaly attention module to improve its performance in anomaly detection scenarios.
To that end, the model introduces two novelties: an anomaly attention module, and a minimax optimization strategy.
Additionally, on top of this model, we introduced our own, original novelty – an altered optimization strategy and loss function – in order to adapt the model to a supervised setting, since this model was first developed and intended for unsupervised settings.

The Anomaly Transformer model was chosen because it is the current state-of-the-art model for general time-series anomaly detection (Xu et al., 2021), consistently achieving strong results across a variety of standard datasets, including server sensor data (Su et al., 2019), rover data from NASA (Keogh et al., 2021), and the NeurIPS 2021 time-series benchmark (Lai et al., 2022).
The first of the two novelties introduced by the Anomaly Transformer, which makes it particularly good at detecting anomalies, is the replacement of the standard self-attention computation with two internally-computed association values that influence the attention block: the series association and the prior association. These two associations make up the “anomaly attention module” and are described below:
1. Series association at each layer l, denoted Sl, is a simple self-attention computation on the data before multiplying by the value matrix V which is standard to a normal attention mechanism.

2. Prior association at each layer l, denoted Pl, is computed from a learnable Gaussian kernel over the relative temporal distances between points, which concentrates attention on adjacent time points (Xu et al., 2021).
The second novelty, the minimax optimization strategy, consists of two phases: minimize and maximize. During the minimize phase, the series association is moved towards the prior association using the symmetric KL divergence formula, denoted SKL.
During the maximize phase, the series association is moved towards the original input sequence by taking the absolute difference of the two. The series association is also enlarged using the prior association (once again through the symmetric KL divergence formula). Therefore, during the minimize phase, the prior association approximates the series association. Then, during the maximize phase, the series association pays more attention to non-adjacent points, since it is enlarged by the prior association. As a result, a sequence generated by the prior and series associations will lower the attention value at any anomaly.

The total amount of difference between the two associations captured during this process is encapsulated by the Association Discrepancy function, denoted AD. The data is then reconstructed after the minimax optimization by multiplying the series association Sl with the value matrix V that is standard to an attention module.

Combining these novelties means that the reconstructed series, denoted X̂, will greatly differ from the original series if an anomaly is present, making the discrepancy greater and giving the model an easier time identifying anomalies.
After incorporating the reconstruction loss, the losses derived from each of these two phases are summarized through the following equations, where λ > 0 and $L_{tot}$ represents the total loss:

$$\mathrm{SKL}(A, B) = \mathrm{KL}(A \,\|\, B) + \mathrm{KL}(B \,\|\, A)$$

$$\mathrm{AD}(P, S; X) = \frac{1}{L} \sum_{l=1}^{L} \mathrm{SKL}\left(P^{l}_{i,:}, S^{l}_{i,:}\right)$$

$$L_{tot} = \left\lVert X - \hat{X} \right\rVert_{F} - \lambda \left\lVert \mathrm{AD}(P, S; X) \right\rVert$$

$$\hat{X} = S^{l} V$$
Our Implementation

Since our own problem was supervised, we modified the optimization strategy and loss function used, as the Anomaly Transformer was originally built for unsupervised data.
4.2 Metrics

Since cryptocurrency P&D detection is an anomaly detection problem, we have chosen not to collect the accuracy of our model, because accuracy measurements do not reflect the strength of a model in anomaly detection scenarios.
Specifically, since anomaly detection scenarios have a large class imbalance heavily favoring negative labels, the accuracy of any model is almost always over 99.9% even if it just outputs 0 (the negative class).
The precision is defined as the percentage of predicted anomalies which were correctly classified, and the recall is defined as the percentage of actual anomalies that were correctly classified as anomalies.
The F1 score is the harmonic mean of these two values.
Given that both the precision and recall are important in the context of anomaly detection, it stands to reason that the F1 score, which combines the two values, is a good metric with which to compare two different models.
The following equations are used to calculate the precision, recall, and F1 scores respectively:

$$\mathrm{Precision} = \frac{N_{TP}}{N_{TP} + N_{FP}}$$

$$\mathrm{Recall} = \frac{N_{TP}}{N_{TP} + N_{FN}}$$

$$F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$

where the terms $N_{TP}$, $N_{FP}$, and $N_{FN}$ refer to the number of true positives, false positives, and false negatives respectively.
After training each model, we choose the metrics corresponding to the epoch in which the highest validation F1 score was obtained instead of the metrics from the final epoch.
5 Experiments

5.1 Dataset

The dataset used in this paper consists of manually-labelled, raw transaction data from the Binance cryptocurrency exchange, first introduced by (La Morgia et al., 2020).
The transactions of various cryptocurrencies that experienced known occurrences of P&D were collected.
To generate the dataset, the authors first joined several cryptocurrency P&D Telegram groups that were well-known for planning and executing on P&D schemes.
Then, over a period of two years, the researchers collected the timestamps for official “pump signals” which were announced in each of these groups by group administrators. They then collected raw transaction data for the pumped cryptocurrency for up to 1 week preceding and succeeding the pump, depending on what was available to access.
In this fashion, 343 P&D occurrences’ worth of data was collected.
After collecting this raw data from the Binance API, the authors further preprocessed the data by aggregating transactions into 5-second, 15-second, and 25-second “chunks”, thereby forming three different aggregated datasets.
Each of these aggregated datasets, in turn, contains the following 15 features:
1. Date, HourSin, HourCos, MinuteSin, MinuteCos: The date and hour- and minute-based positional encoding of a given chunk.
2. PumpIndex, Symbol: The 0-based index of the pump, numbering it out of the 343 available pumps, and the ticker symbol of the coin on which the pump took place.
6. StdPrice, AvgPrice, AvgPriceMax: The moving standard deviation, average percent change, and average maximum percent change in the price of the asset.
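The sinusoidal time features above are presumably standard cyclical encodings of the hour and minute; the following is a minimal sketch under that assumption, not the dataset's actual preprocessing code:

```python
import numpy as np

def cyclical_time_features(hour: int, minute: int) -> dict:
    """Hypothetical reconstruction of HourSin/HourCos and MinuteSin/MinuteCos
    as standard cyclical encodings, so that 23:59 sits next to 00:00."""
    return {
        "HourSin": np.sin(2 * np.pi * hour / 24),
        "HourCos": np.cos(2 * np.pi * hour / 24),
        "MinuteSin": np.sin(2 * np.pi * minute / 60),
        "MinuteCos": np.cos(2 * np.pi * minute / 60),
    }
```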
5.2 Implementation Details

Given an input data sequence X = X1, ..., XN, our first step splits the data into separate train and validation sets using an 80:20 ratio without shuffling.
Following this split, train data undergoes the following preprocessing steps. First, data is broken into M contiguous subsequences X = Y1, ..., YM, where Yi corresponds to the ith pump in the dataset as determined by the PumpIndex feature.
Separating the data out by pump at this stage ensures that during training time, models aren’t fed information from 2 different pumps at the same time, which may hinder their training process.
Of these pumps, all pump sequences Yi with fewer than 100 chunks are discarded, since pumps that only have a few chunks do not make for exemplary training samples. This leaves us with a training subset of m < M pump sequences.
Then, we further prepare the data for each of these pumps by splitting them into segments of size s via taking a sliding window over the chunks of each pump Yi and adding reflection padding to the start of size s − 1 so that the resulting count of windows is equal to the original count of chunks.
Together, the chunks in each window are considered to be one “segment.”
In this scheme, segments are inputs to our models.
The models then predict the probability of a pump occurring during the last chunk of the segment.
At this point, the segments from all pumps can once again be safely shuffled and batched together without fear of information from one pump “leaking” into the training data for another pump.
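The windowing step above can be sketched as follows, where segment_pump is a hypothetical helper; the NumPy approach and names are our illustration, not necessarily how the released code works:

```python
import numpy as np

def segment_pump(chunks: np.ndarray, s: int = 15) -> np.ndarray:
    """Split one pump's chunks (shape [N, n_features]) into N overlapping
    segments of length s. Reflection padding of s - 1 rows at the start
    makes the window count equal the original chunk count, and window i
    always ends at chunk i (the chunk whose label is being predicted)."""
    padded = np.concatenate([chunks[1:s][::-1], chunks], axis=0)
    return np.stack([padded[i:i + s] for i in range(len(chunks))])

# Example: a pump with 120 chunks of 15 features -> 120 segments of (15, 15).
segments = segment_pump(np.random.rand(120, 15), s=15)
assert segments.shape == (120, 15, 15)
```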
We choose to segment our data for several reasons.
First, segmenting and predicting in this way avoids the possibility for models to make predictions based on future values – in other words, in our setup, models can only predict whether or not a pump will occur based on currently-known information.
This ensures that models trained under this scheme have the capability to be deployed in real-world, real-time anomaly detection scenarios on real exchanges.
Having a fixed segment length as opposed to the variable lengths of pump data also simplifies data loading.
Finally, segmenting our data into chunks makes it straightforward to perform undersampling on our data, which has been shown to boost model performance in anomaly detection scenarios by addressing the class imbalance – even in cases where the reduction does not bring the class ratio all the way down to 50:50 (Hasanin and Khoshgoftaar, 2018).
Undersampling also naturally speeds up training, since it reduces the number of training samples.
We therefore apply undersampling to our train data by choosing to keep only a random subset of proportion u of all segments that do not contain any anomaly labels.
Applying undersampling only to the training data avoids unfairly skewing metrics in favor of our models.
Via hyperparameter tuning, we found that a segment length value of s = 15 resulted in the best validation performance as measured by F1 score.
An undersampling proportion of u = 0.05 was used for all experiments except for the C-LSTM on the 15-second chunked dataset, in which u = 0.1 was used.
For the C-LSTM model, batch sizes of 1200, 600, and 600 were used on the 5-second, 15-second, and 25-second chunked datasets respectively.
Likewise, the C-LSTM model uses precision-recall thresholds of 0.5, 0.4, and 0.65 on each of the three datasets respectively, while the Anomaly Transformer model uses a threshold of 0.48 on all datasets.
In order to keep results consistent across all models, baseline results were also recomputed following the same train-validation split as our own models.
5.3 Baselines

The baseline results that we use to compare our models against are those of the random forest model employed by (La Morgia et al., 2020).
For more context, the results of the statistical model introduced by (Kamps and Kleinberg, 2018) are also included in our results table, since they were originally used as a baseline for the random forest model.
The metric that we will use to compare our models to the baseline results will be the aforementioned F1 score of each experiment.
5.4 Results

Both deep learning models were able to beat all previous classical and statistical approaches by a statistically significant margin (p < 0.025), except for the C-LSTM model on the 15-second chunked dataset, which matched previous results.
This demonstrates the effectiveness of deep learning in this previously unexplored area.
State-of-the-art results across all models are bolded in our table.
On average, we found that predictions using the 5-second chunked dataset are much less accurate than those on the 15-second and 25-second chunked datasets, which suggests that predicting anomalies using smaller chunk sizes corresponds to a harder problem in general.
Variance is measured after our efforts to maintain reproducibility via setting seeds.
The random forest (RF) model’s lack of variance is due to our setting seeds in a similar manner to (La Morgia et al., 2020).
Since the Kamps model is statistical in nature, it does not have variance across runs.
A comparison of the two deep learning models that we have employed shows that the transformer was able to beat the LSTM in all but the 25-second chunk size.
This is unsurprising for two reasons: firstly, transformers are state-of-the-art for time-series anomaly detection problems to begin with, and secondly, a basic transformer design without the use of the Anomaly Transformer novelties already outperforms the random forest model results – albeit just barely.
We believe that the theoretical backing for the relative performance of the LSTM and Anomaly Transformer lies in the fact that for 5-second and 15-second chunk sizes, longer sequences are required to capture the same amount of information on average when compared to the larger 25-second chunks.
Consequently, this implies a drop in the amount of temporal information present in the input for smaller chunk sizes, given a fixed segment length s = 15 as was used in this paper.
We hypothesize that this difference is responsible for the LSTM’s faster drop in efficacy at smaller chunk sizes.
However, more experiments must be done to reveal our models’ sensitivity to the aggregation chunk size of the dataset before confirming this.
6 Conclusion

This paper studies the application of deep learning models to the crypto fraud detection problem space.
We propose two separate models that can each reach state-of-the-art performance on the data available: the C-LSTM and the Anomaly Transformer. Our results consequently show that both LSTMs and Transformers have the ability to outperform both classical ML models and statistical models on this dataset with relatively little effort.
Future work includes fine-tuning these models to better account for the volatility generally found in crypto, and exploring the potential for these models to detect pumps ahead of time, as opposed to when they begin.
Furthermore, the dataset provided to the community by (La Morgia et al., 2020), while incredibly useful, has been preprocessed in ways that optimize it for their random forest model; for instance, an LSTM could make use of the actual price of an asset (as opposed to the average percent change in price) in ways that a random forest model cannot.
Therefore, the release of the raw dataset, if possible, would push the capabilities of deep learning models in this domain even further.
References

Pankaj Malhotra, Anusha Ramakrishnan, Gaurangi Anand, Lovekesh Vig, Puneet Agarwal, and Gautam Shroff. 2016. LSTM-based encoder-decoder for multi-sensor anomaly detection. arXiv preprint arXiv:1607.00148.

Ya Su, Youjian Zhao, Chenhao Niu, Rong Liu, Wei Sun, and Dan Pei. 2019. Robust anomaly detection for multivariate time series through stochastic recurrent neural network. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD ’19, pages 2828–2837, New York, NY, USA. Association for Computing Machinery.

David Twomey and Andrew Mann. 2020. Fraud and manipulation within cryptocurrency markets. In Corruption and Fraud in Financial Markets: Malpractice, Misconduct and Manipulation, page 624.

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. CoRR, abs/1706.03762.

Jiehui Xu, Haixu Wu, Jianmin Wang, and Mingsheng Long. 2021. Anomaly Transformer: Time series anomaly detection with association discrepancy. CoRR, abs/2110.02642.

Wenjie Yang, Ruofan Wang, and Bofan Wang. 2020. Detection of anomaly stock price based on time series deep learning models. In 2020 Management Science Informatization and Economic Innovation Development Conference (MSIEID), pages 110–114.