Predicting the duration of traffic incidents is a challenging task due to the
stochastic nature of events. The ability to accurately predict how long
accidents will last can provide significant benefits to both end-users in their
route choice and traffic operation managers in handling of non-recurrent
traffic congestion. This paper presents a novel bi-level machine learning
framework enhanced with outlier removal and intra-extra joint optimisation for
predicting the incident duration on three heterogeneous data sets collected for
both arterial roads and motorways from Sydney, Australia and San-Francisco,
U.S.A. Firstly, we use incident data logs to develop a binary classification
prediction approach, which allows us to classify traffic incidents as
short-term or long-term. We find the optimal threshold between short-term
versus long-term traffic incident duration, targeting both class balance and
prediction performance while also comparing the binary versus multi-class
classification approaches. Secondly, for more granularity of the incident
duration prediction to the minute level, we propose a new Intra-Extra Joint
Optimisation algorithm (IEO-ML) which extends multiple baseline ML models
tested against several regression scenarios across the data sets. Final results
indicate that: a) 40-45 min is the best split threshold for identifying short
versus long-term incidents and that these incidents should be modelled
separately, b) our proposed IEO-ML approach significantly outperforms baseline
ML models in $66\%$ of all cases showcasing its great potential for accurate
incident duration prediction. Lastly, we evaluate the feature importance and
show that time, location, incident type, incident reporting source and weather
are among the top 10 critical factors which influence how long incidents will
last.
Incident duration prediction using a bi-level machine learning framework with outlier removal and intra-extra joint optimisation
Artur Grigorev, Adriana-Simona Mihaita, Seunghyeon Lee and Fang Chen
University of Technology Sydney, 61 Broadway Str, Sydney, Australia
Keywords: incident duration prediction; arterial road versus motorways incident management; classification; regression; machine learning; extreme-boosted decision-trees; light gradient boosting modelling; intra-extra joint optimisation
Note: This document represents a pre-print version of the paper (before peer-review, version from 29 Jun 2021) submitted to the journal “Transportation Research Part C: Emerging Technologies”.
During the peer-review the paper has been significantly extended (from 26 to 35 pages) and accepted for publication on 6 May 2022.
1. INTRODUCTION
1.1. Context
Traffic congestion is a significant concern for many cities around the world.
Congestion arises due to various factors, including increased population, workforce concentration in central areas, or the lack of efficient public transport modes.
Two forms of congestion are typically predominant:
a) recurrent traffic congestion during peak hours when traffic demand exceeds the road capacity, and
b) non-recurrent traffic congestion caused by unplanned events such as car accidents, breakdowns, weather, public manifestations etc.
Accurately predicting the total duration shortly after an incident took place could save operational costs and end-user time (through affecting the route planning).
Most prior studies related to this topic concentrated on testing different machine learning models on specific road types like freeways or highways and focused primarily on different phases of the incident duration such as clearance time, recovery time, and the total incident duration [26].
There is currently a lack of an advanced approach that can be applied on all road types, for all accident types and across various countries with different driving behaviour.
1.2. Challenges and contribution
The accuracy of predicting the incident duration is often determined more by the modelling methodology, the feature construction, and the result interpretation than by the model in use.
The majority of prior works studied the prediction of incident duration on specific types of roads (freeways or motorways) [44]-[10]-[17]-[45], where the data accuracy is higher than on arterial roads. As of 2018, very few applied the prediction strategies to normal arterial roads, due to the high modelling complexity and location mismatching; most traffic incident duration studies focus on only one type of road network (freeways, highways, etc.). This is revealed by a recent state-of-the-art paper [26], which emphasises the difficulty of solving this problem for arterial roads and the lack of studies in this field.
Our study proposes a framework capable of predicting the incident duration regardless of the road network or its complexity.
Secondly, the majority of studies in the literature have concentrated on applying state-of-the-art machine learning models, mostly for classifying the incident severity [35] or the incident duration [26].
However, very few have treated the problem of outliers or imbalanced data classes.
Our study addresses both of these issues by proposing a varying threshold procedure that can facilitate binary duration classification threshold selection by considering both class balance and model performance.
We, on the contrary, test multiple different thresholds for three different data sets.
Furthermore, we propose our own optimisation approach, which we denote intra-extra joint optimisation (IEO), together with an outlier removal procedure (ORM) and advanced machine learning modelling.
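The varying-threshold idea can be illustrated with a minimal sketch. It scores only the class balance of each candidate split between short-term and long-term incidents; the procedure described above also considers model performance, which is left out here. The durations, the candidate grid and the balance score (share of the minority class) are illustrative assumptions, not values or definitions taken from the paper.

```python
def class_balance(durations, threshold):
    """Share of the minority class for a short/long split at `threshold` minutes."""
    short = sum(1 for d in durations if d <= threshold)
    long_ = len(durations) - short
    return min(short, long_) / len(durations)


def scan_thresholds(durations, candidates):
    """Score every candidate threshold; 0.5 means perfectly balanced classes."""
    return {t: class_balance(durations, t) for t in candidates}


# Hypothetical incident durations in minutes (not taken from the paper's data).
durations = [5, 12, 18, 25, 31, 38, 42, 47, 55, 63, 90, 120, 240, 719]

# Candidate thresholds from 20 to 90 minutes in 10-minute steps (assumed grid).
scores = scan_thresholds(durations, candidates=range(20, 91, 10))

# Pick the most balanced split; ties resolve to the smallest threshold.
best = max(scores, key=scores.get)
```

In practice the balance score would be combined with a classification metric computed for each threshold, so that a balanced but poorly separable split is not preferred.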
Thirdly, we further solve the incident duration regression problem and run different regression scenarios to test the extrapolation performance of ML models on various incident data sets.
We utilise thresholds selected during the classification threshold evaluation procedure to analyse the extrapolation performance by training ML models and making predictions on several duration subsets.
This allows us to find the best ML model and the best extrapolation approach for the regression problem on each duration subset (e.g. short-term incidents) of each data set.
For the regression problem, we also detect the most influential factors that affect the incident duration that traffic centres need to prioritise in order to predict incident duration with higher accuracy.
The end goal is to improve the extrapolation ability of machine learning models on the task of incident duration prediction and find the best modelling approaches for short-term and long-term incidents.
Unfortunately, we show that the performance of ML models is highly affected by the data set and the chosen methodology: data quality, the available features, and the additional parameter tuning and optimisation techniques applied in this work.
Paper contributions: to the best of our knowledge, this is the first research study addressing these challenges and proposing a bi-level prediction framework using a large palette of machine learning models applied to both incident duration classification and regression, with the scope of predicting the incident duration on different road types across two different cities (Sydney, Australia, and San-Francisco, U.S.A.).
a) a binary versus multi-class classification method in order to find the optimal threshold to identify short versus long-term incidents,
b) a novel IEO-ML algorithm which integrates baseline ML models with outlier removal and intra-extra joint optimisation techniques across the validation cycle,
c) a detailed analysis on best scenarios to train and test the models across all data sets, and
d) a feature importance selection from the best performing model to showcase the most important factors affecting how long incidents will last on urban roads.
Grigorev et al.: Preprint submitted to Elsevier
Incident duration prediction using Machine Learning
Overall, this research lays the foundation stone of bi-level predictive methodologies regarding the traffic incident duration and can provide accurate information for both the end-user route choice modelling as well as for the operational centres which need to optimise their operations under non-recurrent traffic congestion.
Moreover, this work contributes to our ongoing objective to build a real-time platform for predicting traffic congestion and to evaluate the incident impact during peak hours (see our previous works published in [33]-[37]-[32]).
The paper is organised as follows: Section 1 discusses related works, Section 2 presents the data sources available for this study, Section 3 showcases the methodology, Section 4 presents the numerical results for binary and multi-class classification tasks, Section 5 presents the numerical results of the regression part of the framework, Section 6 details the feature importance evaluation and Section 7 is reserved for conclusions and future perspectives.
1.3. Related works
Incident data interpretation: The definition of traffic incident duration phases is provided in the Highway Capacity Manual [2], and it consists of the following time-intervals:
1) incident detection time which is the time interval between the incident occurrence and its reporting,
2) incident response time standing for the time interval between the incident reporting and the arrival of the first investigator at the location of the accident,
In our work, we use the term incident duration for the time lapse between the detection of an incident and the clearance of the incident, as officially reported in traffic logs provided by local traffic authorities.
However, different phases of traffic incident duration (e.g. clearance, recovery time) can be modelled individually upon availability; this type of research is rare because of the complexity of data collection for traffic incidents and the small number of recorded traffic incidents in real-life data sets [26, 2].
When it comes to the data interpretation in the literature, the incident duration distribution has been modelled as log-normal [39] and more recently as log-logistic [11, 38].
In a recent study [15], incident clearance time and the total impact duration were modelled using Weibull, log-normal and log-logistic distributions and compared using the Akaike information criterion (AIC); the findings revealed that the log-logistic distribution outperformed the other distributions.
As the choice of distribution is highly specific to each data set, and this study uses three different data sets, we further compare several distribution modelling choices using the AIC.
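To make the AIC comparison concrete, the sketch below fits two of the candidate distributions by maximum likelihood and computes AIC = 2k - 2 ln L, where k is the number of fitted parameters. It is a simplified illustration under stated assumptions: only the log-normal (closed-form fit) and the Weibull (shape found by bisection) are covered, the log-logistic is omitted, the durations are hypothetical, and all values must be strictly positive.

```python
import math


def lognormal_aic(x):
    """AIC of a log-normal fit (closed-form maximum likelihood, 2 parameters)."""
    n = len(x)
    logs = [math.log(v) for v in x]
    mu = sum(logs) / n
    sigma2 = sum((l - mu) ** 2 for l in logs) / n
    loglik = -sum(logs) - 0.5 * n * math.log(2 * math.pi * sigma2) - 0.5 * n
    return 2 * 2 - 2 * loglik


def weibull_aic(x, lo=0.01, hi=50.0, iters=100):
    """AIC of a Weibull fit (2 parameters); the shape k is found by bisection."""
    n = len(x)
    top = max(x)
    z = [v / top for v in x]  # rescaling avoids overflow and leaves the shape unchanged
    mean_log = sum(math.log(v) for v in z) / n

    def score(k):
        # Derivative condition of the profile likelihood; increasing in k.
        w = [v ** k for v in z]
        return sum(wi * math.log(v) for wi, v in zip(w, z)) / sum(w) - 1.0 / k - mean_log

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            hi = mid
        else:
            lo = mid
    k = 0.5 * (lo + hi)
    lam = top * (sum(v ** k for v in z) / n) ** (1.0 / k)  # scale parameter
    loglik = sum(math.log(k / lam) + (k - 1) * math.log(v / lam) - (v / lam) ** k for v in x)
    return 2 * 2 - 2 * loglik


# Hypothetical incident durations in minutes (strictly positive).
durations = [5.0, 12.0, 30.0, 44.0, 60.0, 95.0, 240.0, 719.0]
aic = {"log-normal": lognormal_aic(durations), "Weibull": weibull_aic(durations)}
preferred = min(aic, key=aic.get)  # the lower AIC indicates the better trade-off
```

Because both candidates have two parameters here, the comparison reduces to comparing the maximised log-likelihoods; the penalty term matters once distributions with different numbers of parameters are included.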
Machine Learning for incident duration prediction: While several statistical modelling techniques have been applied previously, more recently, new approaches in machine learning (ML) modelling have emerged as a more advanced way of predicting the incident duration due to their capacity to easily account for new data sources, as well as for removing the linearity assumptions between features and the predicted class [18].
The recently proposed Gradient-Boosted Decision Trees (GBDTs) have been shown to provide superior prediction performance when compared to Random Forests, SVMs and ANNs [31].
However, it is known that GBDT can easily over-fit when the prediction target has a long-tail distribution, as is the case of the traffic incident duration distribution [31].
XGBoost [7] is another decision-tree enhancement method that has gained popularity recently in the machine learning community due to its tree boosting capability, loss function regularisation and adaptive learning rate.
It was employed in several international competitions, winning 17 out of the 29 Kaggle competitions singled out on the 2015 Kaggle blog; it was also employed by every team in the top-10 in the 2015 KDDCup [3] for solving various problems such as store sales prediction, web text classification, hazard risk prediction, and product categorisation.
XGBoost’s popularity is also due to its scalability (it can run on a single machine, as well as on distributed and paralleled clusters), its capacity to handle sparse data and the ability to handle instance weights in approximate tree learning (see the recent paper published by [7] where authors proposed an end-to-end tree boosting system with cache-aware and sparsity learning features).
While each of these methods has its advantages and disadvantages, building a fast and reliable prediction framework that could be applied for real-time operations represents a true challenge.
One of the recent research studies [22] presented a two-step approach for traffic incident duration prediction.
A cost-sensitive Bayesian network was used to perform binary classification of traffic incidents by choosing a threshold of 30 minutes and then performing regression for each class using kNN.
While the approach is functional, one major drawback for the classification problem is the manual choice of the class split threshold, which can lead to severe class imbalance. To overcome this issue, we perform both a fixed and a varying threshold set-up to find the best class balance for our classification models. Moreover, we propose a comparison with a multi-class classification approach and discuss the benefits and drawbacks of using classifiers for such problems. We also enhance more advanced regression models with outlier removal procedures to provide a more precise prediction of the incident duration in minutes.
Overall, the cost sensitivity of incorrect classification can be further extended to the cost-based regression metrics.
We propose our enhanced ML models with a proposed intra and extra joint optimisation technique and outlier removal procedure to have even more precise predictions.
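The two-step structure discussed above (classify an incident as short or long, then regress within the predicted class) can be sketched as follows. This is not the method of [22] nor the IEO-ML algorithm: a plain k-nearest-neighbour vote stands in for the classifier, a k-nearest-neighbour average stands in for the regressor, and the features (hour of day, affected lanes), the data and the 45-minute threshold are hypothetical.

```python
def knn(train, query, k):
    """Return the k rows (features, target) closest to `query` (squared Euclidean distance)."""
    return sorted(train, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], query)))[:k]


def fit_bilevel(rows, threshold, k=3):
    """rows: list of (features, duration_minutes). Returns a predict(features) function."""
    labelled = [(f, int(d > threshold)) for f, d in rows]
    by_class = {0: [r for r in rows if r[1] <= threshold],
                1: [r for r in rows if r[1] > threshold]}

    def predict(features):
        # Level 1: a majority vote among the k nearest incidents decides short (0) or long (1).
        votes = [label for _, label in knn(labelled, features, k)]
        label = int(sum(votes) * 2 > len(votes))
        # Level 2: average duration of the nearest incidents inside the predicted class only.
        neighbours = knn(by_class[label], features, k)
        return label, sum(d for _, d in neighbours) / len(neighbours)

    return predict


# Hypothetical records: (hour of day, affected lanes) -> duration in minutes.
rows = [((8, 1), 20), ((9, 1), 25), ((8, 2), 30),
        ((17, 3), 90), ((18, 3), 120), ((17, 2), 75)]
predict = fit_bilevel(rows, threshold=45, k=3)
short_case = predict((8, 1))
long_case = predict((18, 3))
```

Training one regressor per class keeps long-tail incidents from dominating the predictions for short incidents, which is the motivation for modelling the two groups separately.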
In a recent study applying machine learning to the classification of driving state, multiple hyper-optimised ML models were tested, and the entire feature space was visualised using t-SNE [43].
Random Forest provided the highest prediction accuracy, but more advanced tree-based models exist that utilise gradient boosting (e.g. gradient boosted decision trees), which we use in our research.
By performing a feature importance analysis, we can recommend traffic management facilities to record the most critical data and omit redundant data related to traffic incidents.
Also, we can increase the precision of specific observations (e.g. weather condition), which were found to play a significant role in some research studies: for example, during summer and autumn seasons, response team preparation time was higher on freeways in Washington, USA in 2009 [19], with no noticeable effect on clearance and response team travel time.
Peak hours were the most influencing feature on response team preparation delay, which was found to be linked to response procedures (the goal of the response team was to resolve incidents during peak hours as soon as possible).
A research study using Beijing traffic incidents data from 2008 [27] found the importance of "peak hour" value for the response team travel time and clearance time, but not for the intervention team preparation time.
Our study conducts a feature importance ranking based on the best performing ML models we have proposed and provides a detailed overview of their impact.
Different approaches to feature importance estimation use tree-based models (e.g. Random Forest, LightGBM, XGBoost).
For example, one can use produced decision trees from the tree-ensemble model [9].
A data-driven approach was used to perform information fusion from different sources [1], which involved the use of Gini-index extracted from Random Forests as a method to estimate feature importance.
Nevertheless, a single randomised model can show noticeable variance in data mapping when the connection between the features and the target variable is weak, which makes the feature importance values dependent on the model's random seed.
SHapley Additive exPlanations (SHAP) [30] provide a more advanced approach for feature importance estimation because they fuse estimates from multiple models trained across many different subsets of the data set (selected along both the feature and the index dimensions).
These studies motivated the utilisation of SHAP values for our feature importance ranking across three different data sets, all with different features and incident information.
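As an illustration of the quantity that SHAP values approximate, the sketch below computes exact Shapley values for a single prediction: each feature's attribution is its marginal contribution averaged over all coalitions of the other features. It enumerates every coalition, which is only feasible for a handful of features, whereas SHAP implementations rely on faster approximations. The toy duration model, the instance and the all-zero baseline are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, instance, baseline):
    """Exact Shapley values of `model` at `instance`; absent features take `baseline` values."""
    n = len(instance)

    def value(coalition):
        x = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return model(x)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                total += weight * (value(members | {i}) - value(members))
        phi.append(total)
    return phi


def toy_duration_model(x):
    """Hypothetical duration model (minutes) over peak hour, affected lanes and rain."""
    peak, lanes, rain = x
    return 30 + 20 * peak + 10 * lanes + 15 * peak * rain


instance = [1, 2, 1]   # peak hour, two lanes affected, raining
baseline = [0, 0, 0]   # reference incident used for "absent" features
phi = shapley_values(toy_duration_model, instance, baseline)
```

The attributions sum to the difference between the prediction for the instance and the prediction for the baseline, and the interaction term between peak hour and rain is shared equally between those two features.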
On the future application of our research: In comparison with other work, the research proposed in our paper not only comes with a significant prediction capability for all types of incident data sets with various features, but can also be extended to solve the route scheduling problem within traffic simulation modelling, which will incorporate the adaptation of agents to occurring traffic incidents.
Apart from analysing the effects of traffic control measures [21], it is possible to analyse the effect of additional information such as the predicted incident duration, which can be performed both for scheduling and online rescheduling of dynamic agent re-routing.
Furthermore, simulation can be performed with and without such information to estimate the possible benefits of the incident duration prediction modelling within the traffic system.
Also, using an online rescheduling procedure requires the simulation to be performed at the level of dynamic agents within a micro-simulation model, which could benefit from new re-routing schemes when traffic disruptions occur along the route.
2. DATA SOURCES
In order to test the efficiency of the proposed bi-level framework, we have used three different data sets from two different countries: Australia and U.S.A.
The three data sets represent incident logs from an arterial road suburb in Sydney, a motorway in Sydney, Australia, and a road area from San Francisco, U.S.A.
The three data sets are represented in Fig 1 and are detailed as follows.
Figure 1: Data profiling for all data sets in our study: Victoria Rd (A) - a) network mapping, d) ecdf g) distribution plot; M7 motorway (M) - b) network mapping, e) ecdf h) distribution plot; San Francisco (SF) - c) network mapping, f) ecdf i) distribution plot.
Victoria Rd - arterial network, Sydney: The first data set (dataset AR) contains one-year incident logs from the Victoria arterial road from Sydney, Australia (in 2017) (see Table 1 for a summary of features).
It contains information on 5,134 traffic incidents with different incident types (e.g. hazards, breakdowns, accidents) and subtypes (e.g. work zone, accident with truck).
Our current study focuses on 574 “Accidents” since these induce the longest clearance time in the current subnetwork according to the traffic management centre (TMC).
Traffic “Accidents” have a mean duration of 44.59 minutes and a maximum of 719 minutes.
Weather data represented as average daily temperature (in Celsius) and precipitation rate (in millimetres) are obtained from the Observatory Hill station in Northern Sydney, which is the closest station to the analysis area.
Table 1: Traffic incident features for Sydney arterial roads (AR) and M7 motorway (M).
Variable | Description
Location | In GDA Lambert coordinates
Hour of day |
Peak Hour | Value is 1 if hour belongs to the {7 … 9} or {16 … 18} hour interval
Day of the week | Weekday numbers from Monday to Friday
Weekend | Value is 1 for Saturday and 0 for Sunday
Month of the Year |
Incident Subtype | Field indicating cause of incident
Affected lanes | Number of lanes affected by the accident
Direction | Affected traffic direction (e, w, n, s, e-w, n-s, both)
Incident Source | Source of the incident report
Unplanned | Value is 1 if incident is planned, 0 otherwise
Average Temperature | Average temperature for the time of the incident
Rainfall | Rainfall for the time of the incident
Public holidays | Value is 1 if day is a public holiday
Sector ID | Defined by TMC
TZName | Traffic zone name as defined by the Bureau of Transport Statistics
Section ID | Road section on which the incident occurred
Section Speed | Section speed limit
Section Lanes | Number of section lanes
Section class | As defined by TMC
Street ID | As defined by TMC
Intersection ID | As defined by TMC
Distance from CBD | Distance between the traffic incident and the city CBD
Section Capacity | Maximum flow capacity of the section
The data set also records the official area where the accident occurred (as defined by the Bureau of Transport and Statistics), and supplementary information such as section capacity, section speed limit, and the number of lanes.
These features are available for all road sections in the Victoria sub-network, and they were extracted from the official traffic simulation model of the Victoria network, developed in Aimsun and previously used by the authors for conducting an incident impact analysis and traffic prediction [41].
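Several of the temporal features listed in Table 1 can be derived directly from the incident timestamp. The sketch below shows one possible derivation; the function name, the dictionary keys, the numbering of weekdays from 1 (Monday) and the reading of the peak-hour intervals as whole clock hours 7 to 9 and 16 to 18 are assumptions made for illustration.

```python
from datetime import datetime


def temporal_features(timestamp):
    """Derive time-related incident features from the reported start time."""
    hour = timestamp.hour
    return {
        "hour_of_day": hour,
        # Peak hour flag: 1 if the hour falls in 7..9 or 16..18, 0 otherwise.
        "peak_hour": int(7 <= hour <= 9 or 16 <= hour <= 18),
        # datetime.weekday() returns 0 for Monday; shift to 1-based numbering.
        "day_of_week": timestamp.weekday() + 1,
        "month": timestamp.month,
    }


# Hypothetical incident reported on Monday 6 March 2017 at 08:30.
features = temporal_features(datetime(2017, 3, 6, 8, 30))
```

Encoding the peak period as a binary flag in addition to the raw hour lets tree-based models split on the congestion regime directly rather than learning both interval boundaries.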
M7 motorway, Sydney: The second data set is a motorway data set (data set M), consisting of 7,194 traffic accidents along the M7 motorway in Sydney, Australia, during the same year 2017.
The mean duration of motorway accidents is 47.2 minutes, with a maximum duration of 598 minutes.
This data set also includes weather data (average daily temperature and precipitation).
This set of features is similar to the arterial roads data set AR without the geometric features of the lanes (section lanes, section class), intersection ID, distance from the central business district (CBD); this is due to the complexity of mapping of a traffic incident to a correct location along the motorway.
We make the observation that for both Data set AR and M, the traffic flow information of the affected road sections was omitted for this study since we found previously no significant improvement to the prediction accuracy [33].
San-Francisco road network: The last data set is from San-Francisco, U.S.A. (data set SF) and includes information on accidents from all types of roads in the city.
It is part of a more considerable initiative entitled "A Countrywide Traffic Accident Dataset", recently released in 2021, which contains 4.2 million accident reports collected for almost 4.5 years since March 2016 [34].
The SF data set contains 49 features describing the accidents, as detailed in [34] (because the feature table is large, we refer the reader to the cited paper rather than duplicating it here).
This study focuses on predicting the duration of the “accident” type, as it is the most severe one.
We extract and use 8,754 accident records related to the San-Francisco area.
As observed from Fig. 1c), a significant part of the accidents occurred along the “US-101” highway and the “John F. Foran” Freeway.
Accidents have a mean duration of 100 minutes and a max duration of 2,715 minutes.
Data sets profiling: Each data set undergoes a profiling procedure by investigating the empirical cumulative distribution functions (ECDF, as plotted in Fig. 1d), e), f)) and their equivalent log-space distribution plots (as represented in Fig. 1g), h), i)).
The ECDF plots show thresholds in the data behaviour (marked in red) for each data set, which indicate a change of behaviour around specific incident durations (see for example Fig. 1d) versus Fig. 1f), where the first inflection point is around 40 min for data set AR versus 100 min for data set SF).
Findings reveal significant anomalies representative of each data set.
For example, data set AR contains a reduced number of traffic accidents with small incident duration (zero or less than 4 min), while data set M contains an increased number of accidents with zero or one-minute duration. Data set SF, despite not presenting any short-term incident duration below 17 minutes, contains a large number of incidents of 29 and 360 minutes, which raises the question of whether these are outliers in the data set or simply reveal a road network behaviour in terms of incident management in the area; this might also indicate that data set SF will behave differently under the prediction framework and that different processing techniques need to be applied to it.
We also observe that the incident duration is long-tail distributed, which is likely to pose difficulties for prediction algorithms due to the presence of extreme values (either small or large).
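The profiling step above relies on the empirical cumulative distribution function of the incident durations. A minimal sketch of how such an ECDF could be computed is given below; the function name `ecdf` and the sample durations are illustrative assumptions and are not taken from the data sets.

```python
from bisect import bisect_right


def ecdf(durations):
    """Return a function F where F(t) is the share of incidents with duration <= t."""
    ordered = sorted(durations)
    n = len(ordered)
    if n == 0:
        raise ValueError("durations must not be empty")

    def F(t):
        # bisect_right counts how many sorted values are <= t
        return bisect_right(ordered, t) / n

    return F


# Hypothetical durations in minutes (illustrative only, not from data sets AR, M or SF)
sample_durations = [0, 3, 12, 25, 38, 40, 41, 55, 90, 240]
F = ecdf(sample_durations)
share_up_to_40 = F(40)  # share of incidents lasting at most 40 minutes
```

Evaluating `F` on a grid of durations gives the curve on which inflection points such as the ones discussed above can be looked for.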
3. METHODOLOGY
Figure 2: The proposed bi-level modelling framework for traffic incident duration prediction.
Clearing accidents in a short time represents a high priority task for traffic management centres (TMC) worldwide.
For example, in New South Wales, Australia, the target clearance time for traffic incidents is 45 minutes, but this limit might differ in other countries.
Therefore, in the rest of this paper, we will refer to this threshold as the “incident clearance threshold ($T_c$)”, to any incidents cleared before this threshold (e.g. < 45 min) as “short-term” traffic incidents, and to incidents which lasted longer than the clearance threshold (e.g. >= 45 min) as “long-term” traffic incidents.
A unique threshold will be derived for each dataset and will be discussed further in this paper.
The methodology of this paper has its origins in our previous work applied only for arterial roads [33], which we further extend and improve via the joint optimisation and outlier detection enhancements of the prediction framework.
The methodology we propose for the incident duration prediction problem is a bi-level prediction framework combining classification and regression modelling, as represented in Fig. 2.
This approach has been constructed by considering the real-time operational goals of TMC and providing short duration prediction into the life-cycle of the incident management.
Based on the initial traffic incident information, the first step is the deployment of a fast classification method which only predicts whether the accident will be short-term (subset A) or long-term (subset B); see the incoming data set in Fig. 2, where the data is split in two parts based on the incident clearance threshold $T_c$.
Next, we test various duration thresholds and select the optimal threshold $T_c$, which provides a good class balance and classification performance for each dataset.
Once the …
[Figure 2 diagram. Inputs: features (hour of day, incident subtype, incident reporting source, affected lanes, …). Classification level: LDO removal, ML classification model, varying threshold, best duration split threshold. Regression level: LDO removal, ML regression model, regression scenarios over subsets A, B and All (train/test), best ML model for each scenario, best LDO removal threshold. A = short-term incidents, B = long-term incidents, All = all traffic incidents.]
Due to the main challenge of this task, we further propose an outlier removal approach (ORM), detailed in Section 3.6, and our innovative Intra/Extra Joint Optimisation modelling coupled with several machine learning models trained via hyper-parameter tuning (we denote this approach as IEO-ML; it is further detailed in Section 3.7).
The boosted regression framework is finally applied under several regression scenarios (see Section 3.5), which are constructed to evaluate the capability of the framework to predict under all possible situations.
For example, when only subset A (short-term incidents) is available but the TMC would like to predict long-term incidents (subset B), we denote this as Scenario A-to-B (training the models on subset A and making predictions on subset B); all scenarios are constructed on the assumption that the framework needs to be robust enough to predict any type of incident duration under any data shortage or lack of information.
In the following subsection, we further provide the mathematical and theoretical modelling of each of the steps described above.
3.1. Classification and regression definitions
Using all available data sets and the incident information, we first denote the matrix of traffic incident features as:
$$X = [x_{ij}]_{i \in 1..N_i,\; j \in 1..N_f} \tag{1}$$
where $N_i$ is the total number of traffic incident records used in our modelling and $N_f$ is the total number of features characterising the incident (severity, number of lanes, type, neighbourhood, etc.) according to each specific data set (see examples provided in Table 1).
For the incident duration classification problem, we denote the incident duration classification vector as:
$$Y^{c} = [y_i^{c}]_{i \in 1..N_i} \tag{2}$$
where $y_i \in \mathbb{N}$ is the duration of traffic incident $i$ (in minutes), $Y^{c}$ is the vector of binary values taking values in {0, 1}, and $Y^{m}$ is the vector of integer values for the multi-class classification problem definition, taking values in {0, 1, 2}.
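More specifically, in the first stage we create a binary classification model with the purpose of identifying short versus long-term incident durations, split by the incident clearance threshold $T_c$. For the multi-class definition, the duration range is split by two thresholds:
$$y_i^{m} = \begin{cases} 0 & \text{if } y_i < T_c^{1} \\ 1 & \text{if } y_i \in [T_c^{1}, T_c^{2}] \\ 2 & \text{if } y_i > T_c^{2} \end{cases}$$
where $y_i$ is the duration of incident $i$, $y_i^{m}$ is its class label, and $T_c^{1}$ and $T_c^{2}$ take several values as further detailed in Section 4.3.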
The binary classification approach is implemented with a computation time constraint for operational purposes (more details on the computation time comparison can be found in Appendix B).
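The binary and multi-class label definitions can be sketched as follows. This is a minimal sketch, assuming that a duration equal to the threshold counts as short-term and that the middle class includes both of its boundaries; the function names and the example thresholds are illustrative and do not come from the paper.

```python
def binary_label(duration, t_c):
    """0 = short-term (duration <= t_c), 1 = long-term (duration > t_c)."""
    return 0 if duration <= t_c else 1


def multiclass_label(duration, t_c1, t_c2):
    """0 below t_c1, 1 inside [t_c1, t_c2], 2 above t_c2 (boundary handling is an assumption)."""
    if t_c1 > t_c2:
        raise ValueError("t_c1 must not exceed t_c2")
    if duration < t_c1:
        return 0
    if duration <= t_c2:
        return 1
    return 2


# Hypothetical durations in minutes and hypothetical thresholds
durations = [5, 20, 45, 46, 60, 180]
binary = [binary_label(d, 45) for d in durations]
multi = [multiclass_label(d, 20, 60) for d in durations]
```

Keeping the threshold as an argument makes it straightforward to repeat the labelling for every candidate threshold value.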
The regression problem is further structured with a more fine-grained incident duration prediction in mind.
The main objective motivating the regression modelling is to obtain more precise information on the duration of incidents which fall into a wide class ranging, for example, between 0 and 30 minutes (for these cases, the traffic centres require precision at the minute level, as a 5-min accident has different handling procedures than a more severe accident of 30 min).
The regression models go through an extensive cross-validation procedure with hyper-parameter tuning, and outlier removal is tested using a joint optimisation approach, as further detailed in Sections 3.3, 3.6 and 3.7.
3.2. Selection of baseline machine learning models
We have tested and deployed several ML models for both the classification and regression problems in this work, which serve as baseline models against which we compare our proposed optimisation approach.
These are listed as follows:
a) gradient boosting decision trees - GBDT [13], which rely on training a sequence of models, where each model is added sequentially to reduce the residuals of the prior models;
b) extreme gradient boosting decision trees - XGBoost [8], which finds the split values by enumerating all the possible splits on all the features (exhaustive search) and contains a regularisation parameter in the objective function;
c) random forests - RF [5], which apply bootstrap aggregation (bagging, which consists of training models on randomly selected subsets of data) and use the average (or majority vote) of multiple decision trees in order to reduce the sensitivity of a single tree model to noise in the data;
d) k-nearest neighbours - kNN [12], which predicts for a data point the majority vote or the average of the k closest data points from the training set (based on a distance metric);
e) linear regression - LR - a standard predictor using linear equations to model the relation between the features and the regression variable;
f) light gradient boosted machines - LightGBM [20], which apply gradient boosting to tree-based models; they also use Gradient-based One-Side Sampling (GOSS), which keeps the data points with large gradients and randomly samples those with small gradients when finding the split value.
The models have been used for both classification and regression problems (except logistic regression applied to classification only and linear regression to regression problem only).
They are the main base on which we further enhance and develop our outlier and joint optimisation prediction algorithm used in the current bi-level incident duration prediction framework.
3.3. Hyper-parameter tuning through randomised search
Most machine learning algorithms have a set of hyper-parameters related to the internal design of the algorithm that cannot be fitted from the training data.
Both GBDT and XGBoost present dozens of hyper-parameters, out of which the most important ones are max_depth, learning_rate, min_child_weight, gamma, subsample, colsample_bytree and scale_pos_weight [24].
The hyper-parameters are usually tuned through randomised search and cross-validation.
The most extensive search technique is the grid-search, in which several equally spaced points are chosen in the most credible interval for each parameter, and for each point combination, a model is fitted and tested through cross-validation.
In this work, we employ a Randomised-Search [4] which selects a (small) number of hyper-parameter configurations randomly to use through cross-validation.
For example, in Fig. 3 (arterial roads, Sydney), we see that XGBoost is the best performing model starting from 120 iterations, and it is already close to the optimum starting from 175 iterations.
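The randomised search described above can be illustrated with a short standard-library sketch. It assumes a small discrete search space and replaces the cross-validated model score with a toy scoring function; `random_search`, `toy_score` and the candidate values are hypothetical and only mirror the names of the hyper-parameters listed earlier.

```python
import random


def random_search(space, score, n_iter, seed=0):
    """Sample n_iter configurations at random and keep the best-scoring one."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_iter):
        # draw one candidate value per hyper-parameter
        cfg = {name: rng.choice(values) for name, values in space.items()}
        s = score(cfg)
        if s > best_score:
            best_cfg, best_score = cfg, s
    return best_cfg, best_score


# Hypothetical search space (values chosen for illustration only)
space = {
    "max_depth": [3, 5, 7, 9],
    "learning_rate": [0.01, 0.05, 0.1, 0.3],
    "subsample": [0.6, 0.8, 1.0],
}


def toy_score(cfg):
    # Stand-in for a cross-validated score; higher is better
    return -abs(cfg["max_depth"] - 5) - abs(cfg["learning_rate"] - 0.1)


best_cfg, best_score = random_search(space, toy_score, n_iter=50)
```

In contrast to a grid search, the number of evaluated configurations is fixed by `n_iter` and does not grow with the number of hyper-parameters.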
We use the F1-score as the target metric for the classification experiments. F1 is the harmonic mean of Precision and Recall, and is in general a better performance metric under an uneven class distribution than Accuracy, which counts true positives and true negatives together with false positives and false negatives over all records; therefore, for uneven class balances (especially the ones with fewer incident records), one should rely less on the Precision and Accuracy metrics.
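The difference between Accuracy and F1 under class imbalance can be seen on a small example. The sketch below assumes the long-term class is the positive class and uses a made-up sample of ten incidents; the helper names are illustrative.

```python
def confusion(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) counts for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def f1_score(y_true, y_pred, positive=1):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0 when there are no positives at all."""
    tp, fp, fn, _ = confusion(y_true, y_pred, positive)
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2 * tp / denom


def accuracy(y_true, y_pred):
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical imbalanced sample: 9 short-term (0) and 1 long-term (1) incident
y_true = [0] * 9 + [1]
y_pred = [0] * 10  # a classifier that always predicts "short-term"
acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
```

Here the always-short-term classifier reaches a high Accuracy while its F1-score for the long-term class is zero, which is why F1 is the safer target on uneven splits.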
To evaluate the regression models we use the mean absolute percentage error (MAPE), defined as:
$$\mathrm{MAPE} = \frac{1}{n} \sum_{t=1}^{n} \left| \frac{A_t - F_t}{A_t} \right| \tag{10}$$
where $A_t$ are the actual values, $F_t$ the predicted values, and $n$ the number of samples.
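The mean absolute percentage error can be computed as in the sketch below. It assumes the error is reported as a fraction rather than multiplied by 100, and it raises an error for zero actual values, which matters here because the data sets contain incidents with zero-minute duration; the function name and the sample values are illustrative.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned as a fraction (0.1 means 10%)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return sum(abs((a - f) / a) for a, f in zip(actual, predicted)) / len(actual)


# Hypothetical actual and predicted durations in minutes
error = mape([10, 20, 40], [12, 18, 40])
```

How zero-duration records are handled before computing the metric (removal, shifting, or a different metric) is a modelling decision that this sketch leaves open.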
Other metrics have also been calculated, but we keep their presentation concise due to the large number of experiments to show.
3.5. Regression scenarios definition
The main objective of the bi-level framework is that the regression accuracy can benefit from different setups for different data subsets.
For an even better accuracy compared to the classification problems, we are further developing more complex regression models that can provide incident duration prediction at minute-level accuracy.
This is the second step of the bi-level prediction framework to be applied when more precision is needed at the minute level regarding the incident duration length.
Due to the long tail distribution of incident duration and the class imbalance problem previously identified, we need to design and construct various regression models capable of learning from various types of data sets to make accurate predictions.
However, with limited information (small data set size), the prediction results can be skewed.
This is the primary motivation that led to the construction of several scenarios of model training, validation and prediction that can be applied under both complete or incomplete data sets from traffic centres.
By using the classification thresholds identified previously, we split the traffic incident data set into two subsets: subset A (with duration below the threshold $T_c$) and subset B (with duration above the threshold $T_c$), as previously defined at the beginning of Section 3.
We further construct several scenarios of subset combinations for training, validation and testing, detailed below, with the aim of extrapolating the model performance:
Scenario All-All: we use the entire data set and apply several regression models using a 10-fold cross-validation approach and different hyper-parameter search methods.
This approach will show us the general performance across various methods.
Scenario A-to-B: we use subset A (short-term incidents) for training the regression models and evaluate the prediction on subset B (long-term incidents).
In this scenario, we will analyse methods to extrapolate to higher values of the target variable.
Scenario A-to-A: we use subset A for training the regression models and predict on subset A. In this scenario, we will analyse the prediction ability of the methods when long-term incidents (which include the values from the tail of the incident duration distribution) are excluded.
Scenario B-to-A: we use subset B for training the regression models and predict on subset A. In this scenario, we will analyse methods to extrapolate to lower values of the target variable.
Scenario B-to-B: we use subset B for training the regression models and predict on subset B. In this scenario, we will analyse the prediction ability on long-term incidents.
Scenario All-to-A: we use all the data for training the regression models and predict on each fold within subset A. In this scenario, we will analyse the effect of adding long-term incidents data into model training for predicting short-term incidents duration.
Scenario All-to-B: we use all the data for training the regression models and predict on each fold within subset B. In this scenario, we will analyse the effect of adding short-term incidents data into the model training to predict long-term incident duration.
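The scenarios listed above differ only in which subset supplies the training records and which subset supplies the records to predict. A minimal sketch of that bookkeeping is given below, assuming that a duration equal to the threshold belongs to subset A; the function names, the threshold and the durations are illustrative.

```python
def split_subsets(durations, t_c):
    """Return index lists for subset A (duration <= t_c) and subset B (duration > t_c)."""
    a = [i for i, d in enumerate(durations) if d <= t_c]
    b = [i for i, d in enumerate(durations) if d > t_c]
    return a, b


def scenario_indexes(durations, t_c):
    """Map each scenario name to (training pool, prediction pool) index lists."""
    a, b = split_subsets(durations, t_c)
    everything = list(range(len(durations)))
    return {
        "All-All": (everything, everything),
        "A-to-A": (a, a),
        "A-to-B": (a, b),
        "B-to-A": (b, a),
        "B-to-B": (b, b),
        "All-to-A": (everything, a),
        "All-to-B": (everything, b),
    }


# Hypothetical durations in minutes and a hypothetical threshold of 45 minutes
durations = [5, 30, 45, 60, 120]
scenarios = scenario_indexes(durations, t_c=45)
```

When the two pools overlap (for example in A-to-A or All-to-A), the pools still have to be divided into cross-validation folds so that a record is never predicted by a model that was trained on it.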
3.6. Outlier removal methods (ORM)
As previously discussed during the data profiling (Section 2-Fig. 2), the traffic incident logs contain outliers appearing as minor incidents, as rare traffic incidents with very long duration and/or as errors in incident reports.
If the data point is an outlier, it will have a small tree depth (e.g. the data point gets quickly separated from the rest by selecting values in just a few features).
Tree depth is then averaged across all the “isolation” trees and considered as an anomaly score (e.g. if the average tree depth for a point is 1.3, the point is easily separable after a small number of splits).
Local Outlier Factor (LOF) [6] is another outlier removal method, which estimates the anomaly score from the local deviation of density within the k-nearest neighbourhood.
LOF relies on the calculation of a local reachability density (LRD), which represents the inverse of the average reachability distance (RD) of neighbouring data points from the selected data point.
LOF method relies on the fact that outliers belong to the area where the density of data points is low, while regular data points belong to the high-density area.
To summarise, the above outlier removal procedures are applied in conjunction with the proposed optimisation framework and regression models and show a significant improvement in prediction accuracy as further detailed in Section 5.3.
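The LOF computation described in this section (reachability distance, local reachability density, and their ratio) can be written as a brute-force sketch using only the standard library. It assumes Euclidean distance, Python 3.8 or later for `math.dist`, and data without exact duplicates; the function names, the value of k and the sample durations are illustrative.

```python
import math


def lof_scores(points, k):
    """Local Outlier Factor of every point (brute force, Euclidean distance)."""
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < len(points)")
    dist = [[math.dist(p, q) for q in points] for p in points]

    neighbours, k_dist = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        kd = dist[i][order[k - 1]]  # distance to the k-th nearest neighbour
        # points tied at the k-distance are all kept in the neighbourhood
        neighbours.append([j for j in order if dist[i][j] <= kd])
        k_dist.append(kd)

    def lrd(i):
        # reachability distance of i from neighbour j: max(k-distance(j), d(i, j))
        reach = [max(k_dist[j], dist[i][j]) for j in neighbours[i]]
        return 1.0 / (sum(reach) / len(reach))

    lrds = [lrd(i) for i in range(n)]
    # LOF: average ratio between the neighbours' density and the point's own density
    return [sum(lrds[j] / lrds[i] for j in neighbours[i]) / len(neighbours[i])
            for i in range(n)]


def drop_top_outliers(points, k, perc):
    """Remove the share `perc` of points with the highest LOF score."""
    scores = lof_scores(points, k)
    n_drop = int(len(points) * perc)
    ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    dropped = set(ranked[:n_drop])
    return [p for i, p in enumerate(points) if i not in dropped]


# Hypothetical one-dimensional example: durations in minutes, one extreme value
points = [(30,), (32,), (35,), (38,), (40,), (600,)]
scores = lof_scores(points, k=2)
cleaned = drop_top_outliers(points, k=2, perc=0.2)
```

Scores close to 1 indicate a point whose neighbourhood is as dense as that of its neighbours, while scores well above 1 indicate a point in a sparse region; the `perc` argument plays the role of the share of removed samples.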
3.7. Intra/Extra Joint Optimisation for ML regression prediction (IEO-ML)
This section presents our novel enhancements of ML regression models by constructing an intra/extra optimisation technique to jointly optimise the hyper-parameters of the regression models together with previous outlier optimisation methods.
In the rest of the paper, we denote this approach as IEO-ML, where ML is one of the regression models previously described (GBDT, XGBoost, RF, kNN, LR, LGBM).
1) the traffic incident data is prone to errors during the data collection, which are attributed to human factors (e.g. the presence of incidents with 0 and 1-minute durations),
2) the outlier removal performance cannot be assessed directly on a new data set in which outliers are not marked; thus, we assess it through the model performance obtained with outlier removal applied, using joint outlier removal and modelling,
3) both the outlier removal method and the models have hyper-parameters, which together form a single hyper-parameter space,
4) we assume that the outlier removal can be performed either inside (Intra - see Fig. 5) or outside (Extra - see Fig. 4) of the cross-validation cycle, and we evaluate the effect of each approach on the model performance,
5) Intra joint optimisation can provide a more effective outlier removal, since common hyper-parameters will be found for different data subsets, which allows the ORM to adapt to different possible combinations of incidents when the model is deployed and predicts on a newly acquired incident log.
Overall we want to compare and observe the impact of each technique on the accuracy of regression models and detect the best combination of Intra/Extra joint optimisation and various ML regression models.
Our approach explores the following combinations of ML models in a selected working base (decimal or logarithm) with outlier removal and intra/extra joint optimisation; for example, we denote as iLOF-LT-MLmodel a “joint optimisation of any available baseline ML model with LOF in a log-transform base within a cross-validation cycle (an intra optimisation)”.
Figure 4: Extra joint optimisation schema for the EO-ML algorithm.
Figure 5: Intra joint optimisation schema for the IO-ML algorithm.
As an observation, each ORM has its specific hyper-parameters, but all ORMs have one parameter in common: the percentage of removed samples, which we assume to be outliers (ORperc).
[Figures 4 and 5 diagrams. Inputs: all traffic incidents with their features (hour of day, incident subtype, incident reporting source, affected lanes, …), split into a train set and a test set. A random search cycle generates hyper-parameters for the model and the ORM; the ORM and the regression model are then evaluated in a cross-validation cycle. Outputs: cross-validation score, test score, and the best combination of ORM and model hyper-parameters.]
of times which is equal to the number of folds.
Thus, ORperc takes values in {0, 1, …, 5%} for EJO and in {0, 1/5, …, 5/5} for IJO, to ensure a comparable amount of removed samples in both approaches.
Results for all combinations of the proposed approach inside the incident duration prediction framework are further provided in Section 5.3 for the eLOF-ML, iLOF-ML, iIF-ML and eIF-ML models (e.g. eIF-ML is a “joint ML optimisation using IF optimised outside (e) of the cross-validation cycle”).
Data: Traffic incident reports (feature vector $X$, duration vector $Y$)
Input: HPSm: hyper-parameter space for the model; ORM: outlier removal method; HPSor: hyper-parameter space for the ORM; Model: ML regression model ∈ {GBDT, XGBoost, RF, kNN, LR, LGBM}; Iters: number of iterations (number of random search steps for hyper-parameter optimisation); Folds: number of folds for cross-validation; sample: function for random sampling from the hyper-parameter space; FoldIndexes: function to get sample indexes for the training folds and the test fold; extra: boolean variable stating the use of extra joint optimisation; intra: boolean variable stating the use of intra joint optimisation; split: function to split the data set into two parts, the train/test part and the validation part
Output: Predicted duration vector
…, …, …, … = …(…, …); … = []; … = []
for … ← 1..… do // temporary cross-validation prediction vector
    … ← …(…); … ← …(…)
    … = []; … = []; … = 0
    if extra then …
    for … ← 1..… …
Then we perform an n-fold cross-validation procedure, where we split the data set into training and testing parts (preserving the ratio between them at (F-1):1, where F is the number of folds) according to sequentially generated indexes (e.g. in the case of 500 data points and 5 folds, fold 0 uses indexes from 0 to 100 for the testing set and the remaining indexes from 100 to 500 for the training set; fold 1 uses indexes 100-200 for the testing set and indexes 0-100 and 200-500 for the training set, etc.).
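The sequential fold indexing from this example can be sketched as follows. The function name `fold_indexes` is illustrative, index ranges are treated as half-open (0 to 100 means indexes 0 to 99), and letting the last fold absorb any remainder is an assumption of the sketch.

```python
def fold_indexes(n_samples, n_folds, fold):
    """Return (train_idx, test_idx) for contiguous, sequentially generated folds."""
    if not 0 <= fold < n_folds:
        raise ValueError("fold must be in 0..n_folds-1")
    size = n_samples // n_folds
    start = fold * size
    # the last fold absorbs any remainder so that every sample is tested exactly once
    stop = n_samples if fold == n_folds - 1 else start + size
    test_idx = list(range(start, stop))
    train_idx = list(range(0, start)) + list(range(stop, n_samples))
    return train_idx, test_idx


# The example from the text: 500 data points and 5 folds
train0, test0 = fold_indexes(500, 5, fold=0)
train1, test1 = fold_indexes(500, 5, fold=1)
```

With 5 folds the training and testing parts keep the 4:1 ratio in every fold.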
Then, if intra joint optimisation is selected within the cross-validation cycle, we perform outlier removal with sampled hyper-parameters using only the train subset within each train-test split.
Since we select the test folds in order and make predictions on them, the predicted duration vector is composed of the prediction results from these folds.
So, first, we collect the resulting metric together with the hyper-parameters, the actual labels and the predicted labels.
To collect the data we use a hash-array, in which each element can be addressed by name rather than by index as in a conventional array.
Then we perform the sorting procedure, which orders the solutions according to the resulting metric, and we select the best combination of hyper-parameters.
Finally, we obtain the predicted duration vector by filtering the data using the ORM, training the model on the train/test part and making predictions on the validation part.
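The overall procedure described in this section can be illustrated with a compact standard-library sketch. It is not the authors' implementation: a one-dimensional kNN regressor stands in for the ML model, a simple "farthest from the median" filter stands in for the ORM, and all function names, hyper-parameter values and data are hypothetical.

```python
import random


def knn_predict(train_x, train_y, x, k):
    """Average target of the k training points closest to x (one-dimensional feature)."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def remove_outliers(xs, ys, perc):
    """Stand-in ORM: drop the share `perc` of records whose target is farthest from the median."""
    n_drop = int(len(ys) * perc)
    if n_drop == 0:
        return list(xs), list(ys)
    median = sorted(ys)[len(ys) // 2]
    keep = sorted(range(len(ys)), key=lambda i: abs(ys[i] - median))[:len(ys) - n_drop]
    keep.sort()  # restore the original record order
    return [xs[i] for i in keep], [ys[i] for i in keep]


def mape(actual, predicted):
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def ieo_search(xs, ys, mode, iters=20, folds=5, seed=0):
    """Random search over model and ORM hyper-parameters; mode is 'intra' or 'extra'."""
    rng = random.Random(seed)
    results = []
    for _ in range(iters):
        hp = {"k": rng.choice([1, 3, 5]), "perc": rng.choice([0.0, 0.02, 0.05])}
        data_x, data_y = list(xs), list(ys)
        if mode == "extra":
            # Extra: outliers are removed once, before the cross-validation cycle
            data_x, data_y = remove_outliers(data_x, data_y, hp["perc"])
        size = len(data_x) // folds  # records beyond folds*size are only used for training
        actual, predicted = [], []
        for f in range(folds):
            lo, hi = f * size, (f + 1) * size
            tr_x, tr_y = data_x[:lo] + data_x[hi:], data_y[:lo] + data_y[hi:]
            if mode == "intra":
                # Intra: outliers are removed from the training part of each fold only
                tr_x, tr_y = remove_outliers(tr_x, tr_y, hp["perc"])
            for x, y in zip(data_x[lo:hi], data_y[lo:hi]):
                actual.append(y)
                predicted.append(knn_predict(tr_x, tr_y, x, hp["k"]))
        # each solution is stored by name, as in a hash-array
        results.append({"score": mape(actual, predicted), "hp": hp})
    results.sort(key=lambda r: r["score"])  # lowest MAPE first
    return results[0]


def predict_validation(train_x, train_y, val_x, best):
    """Filter the train/test part with the best ORM setting, then predict the validation part."""
    fx, fy = remove_outliers(train_x, train_y, best["hp"]["perc"])
    return [knn_predict(fx, fy, x, best["hp"]["k"]) for x in val_x]


# Hypothetical data: a linear trend in minutes with two injected extreme durations
xs = [float(i) for i in range(60)]
ys = [20.0 + 1.5 * x for x in xs]
ys[7], ys[33] = 900.0, 600.0
train_x = [x for i, x in enumerate(xs) if i % 6 != 5]
train_y = [y for i, y in enumerate(ys) if i % 6 != 5]
val_x = [x for i, x in enumerate(xs) if i % 6 == 5]

best_intra = ieo_search(train_x, train_y, mode="intra")
best_extra = ieo_search(train_x, train_y, mode="extra")
val_pred = predict_validation(train_x, train_y, val_x, best_intra)
```

In this sketch the extra mode also removes the flagged records from the folds that are predicted, whereas the intra mode scores every record; the two cross-validation scores are therefore not measured on identical records, which is worth keeping in mind when comparing them.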
4. Incident classification results
This section details the results of the first layer of the bi-level prediction framework related to the classification prediction findings, either via a standard binary classification with varying threshold analysis or via a multi-class classification enhanced by outlier removal procedures.
4.1. Binary incident classification results using varying split thresholds
The first classification problem that we address is to predict whether an incident duration will be lower or greater than a selected threshold (we classify short-term versus long-term traffic incidents), which can then be used to supply the initial assessment needs of the traffic management centre (TMC) under fast decision times.
For example, an operational clearance threshold for the Sydney TMC is currently set at 45 min based on previous operational field experience; however, choosing a fixed threshold for classification can have a significant impact on the results of any prediction algorithm and is highly dependent on the incident duration distribution chart (as represented in Fig. 1g), h), i)).
Fig. 2 showcases the data split for the binary classification problem, where the threshold $T_c$ (dashed red line) varies according to the two set-ups mentioned above: every 5 minutes ($T_c \in \{20, 25, \dots, 70\}$).
We name as Subset A all incident duration records which are lower than or equal to $T_c$ (if the duration $y_i \le T_c$), and as Subset B all the incident duration records which are higher than $T_c$ (if $y_i > T_c$).
Based on the variation of $T_c$, the size of Subsets A and B will have an impact on the prediction algorithms, and this impact is further quantified.
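The effect of the varying threshold on the size of Subsets A and B can be tabulated with a few lines of code. The sketch below covers only the class-balance side of the threshold choice (the classification performance still has to be measured separately); the function name and the durations are illustrative.

```python
def class_balance(durations, thresholds):
    """Share of records falling in Subset A (duration <= threshold) for each threshold."""
    n = len(durations)
    return {t: sum(1 for d in durations if d <= t) / n for t in thresholds}


thresholds = range(20, 75, 5)  # 20, 25, ..., 70 minutes

# Hypothetical durations in minutes (illustrative only)
durations = [5, 12, 18, 22, 31, 37, 44, 52, 68, 95, 130, 240]
shares = class_balance(durations, thresholds)

# threshold whose Subset A share is closest to an even 50/50 split
most_balanced = min(shares, key=lambda t: abs(shares[t] - 0.5))
```

The resulting shares correspond to the kind of percentage split that is reported for each threshold alongside the classification results.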
Figure 6: Incident duration classification using varying thresholds for a) data set AR, b) data set M, c) data set SF. The red percentage above each set of ML results indicates the percentage split of Subsets A and B for that particular threshold $T_c$.
The results of the binary classification of incident durations using a varying split threshold are detailed in Fig. 6 (for a 5-minute frequency split) across all data sets.
More specifically, Fig. 6 presents the F1 results obtained for each ML model that we have developed (XGBoost, LR, LGBM, GBDT, kNN, RF); other performance metrics such as Accuracy, Precision and Recall have also been calculated and are provided in Appendix A.
For example, Fig 6a) showcases the classification results for data set AR, in which the blue bar represents the F1 score of the XGBoost classifier (F1=0.28) when the data set has been split into Subset A, containing incidents with a duration of less than 20min (32% of all incident records fall in this subset), and Subset B, containing incidents with a duration higher than 20min (the remaining 68% of incident records).
Therefore, the percentage numbers written in red above each ML result represent the percentage of records lower than the threshold chosen for this experiment.
The split at a threshold of 20 min is not ideal given the data imbalance (32% versus 68%) and the low F1 score; therefore, further variations have been tested, which reported an increased F1 = 0.8 for a threshold of 45 min.
a) it will reduce the imbalance between classes (and thus reduce the effects of imbalanced classification, which is vital for modelling when using a small data set);
b) there is only a tiny improvement in F1-score once the threshold exceeds 40min;
c) it will be a reasonable split for short incidents in terms of field operation management.
An interesting finding emerges for thresholds in {20, 25} min: we record poor performance overall across all ML models in all data sets (F1-score less than 0.5), while some models, such as GBDT, did not take effect at all; for this reason, we exclude from consideration any thresholds which provide an F1-score of less than 0.5.
Furthermore, we set our minimum acceptable F1-score to 0.75, and any model performing lower than this threshold will not be considered for further optimisation.
By analysing all sub-figures in Fig 6 which provide both a good F1 score and class balance, we conclude that the optimal thresholds for the binary classification problem are the following:
c) threshold = 45min for the San Francisco network (Fig. 6c: F1 = 0.83, class balance = 55%).
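The selection logic described above (discard thresholds below the minimum acceptable F1-score, then favour class balance) can be sketched as follows. This is only one possible formalisation: the tie-breaking rule, the function name `select_threshold` and the input values are assumptions made for illustration and are not results reported in this paper.

```python
from typing import Dict, Optional, Tuple


def select_threshold(
    results: Dict[int, Tuple[float, float]],
    min_f1: float = 0.75,
) -> Optional[int]:
    """Pick a split threshold from {threshold: (f1, share_of_subset_a)}.

    Assumed rule: keep thresholds whose F1 reaches `min_f1`, then choose the
    one whose Subset A share is closest to 50%; ties go to the higher F1.
    """
    eligible = {t: v for t, v in results.items() if v[0] >= min_f1}
    if not eligible:
        return None
    return min(
        eligible,
        key=lambda t: (abs(eligible[t][1] - 0.5), -eligible[t][0]),
    )


# Hypothetical (F1, Subset A share) pairs per threshold, for illustration only.
example = {
    20: (0.28, 0.32),
    40: (0.78, 0.48),
    45: (0.80, 0.55),
    70: (0.92, 0.85),
}
print("selected threshold:", select_threshold(example), "min")
```

Under this rule a very high threshold such as 70 min is not selected despite its high F1, because almost all records then fall into Subset A.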
The other important finding concerns thresholds above 45min, which show a significant improvement across all models on all performance metrics, with the best result obtained when Subset A incorporates all incidents lower than 70min (which represents the majority of incidents); this is easily explained by the fact that we use almost the entire data set for training the models.
However, the binary classification can be a rough estimate.
If TMCs need a finer prediction than simply "less than 45min" or "45min or higher" (where the latter can last up to several days), then several regression and multi-class classification models are needed to provide more precise predictions.
These will be further detailed in Sections 4.3 and 5.
We will further use the detected optimal thresholds for each data set to perform the split between subset A and B in various scenarios of the incident duration regression problem.
However, in multiple cases (e.g. the 35, 45, 50 and 60-minute thresholds for data set AR, and the 25, 30, 40 and 60-minute thresholds for data set M), XGBoost produces a slightly better result than other tree-based models.
For example, Fig 7a represents the LDO removal from the data set AR for reported incident durations of up to 10min; by removing these outliers, we observe that the F1-score does not fall below the acceptable threshold of 0.75 until 5min (this indicates that removing all accidents reported with a duration of 0 or lower than 5min does not reduce the model performance).
Therefore, we applied an LDO removal for all traffic incidents for this data set with a duration below 5min.
For the data set M, the effect of LDO removal is more significant, as depicted in Fig 7b.
This data set contains a lot of incidents with durations of 0 and 1 minute (which represent almost 15% of the entire data set); by removing these, we observe that the highest F1-score drops down to 0.74 across all ML models, which falls below the acceptable threshold for a good prediction accuracy.
Therefore, we decide to remove only incidents with durations of 0min or 1min from this data set.
Lastly, in the case of the San-Francisco data set, we have a completely different range of outliers, since there are no incidents reported with a duration of less than 17 minutes (see Fig 7c).
There are multiple incidents cleared off at around 29min and 360min (as also represented in Section 2), which can be identified as HDO.
However, by removing these HDO data points from the ML model training (representing almost 38% of all incident records), we observe a drop in the F1 score from 0.85 to 0.76 for XGBoost, while some models dropped to values below 0.7.
Finally, we observe that the outlier removal procedure should be driven by the specificity of the data set and the incident area location rather than by default assumptions on either LDO or HDO.
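The two kinds of checks discussed in this subsection can be sketched with the standard library only. The helper names and the sample durations are hypothetical: `drop_low_duration` mirrors the removal of incidents below a cut-off (e.g. 5min for data set AR), and `duration_spikes` is one simple, assumed way of flagging duration values that are suspiciously frequent (such as the clusters reported around 29min and 360min); it is not the detection procedure used by the authors.

```python
from collections import Counter
from typing import List


def drop_low_duration(durations: List[int], cutoff: int) -> List[int]:
    """Keep only incidents lasting at least `cutoff` minutes."""
    return [d for d in durations if d >= cutoff]


def duration_spikes(durations: List[int], min_share: float = 0.05) -> List[int]:
    """Duration values that each account for at least `min_share` of all records."""
    counts = Counter(durations)
    total = len(durations)
    return sorted(value for value, count in counts.items() if count / total >= min_share)


# Hypothetical durations in minutes (illustrative only).
durations = [0, 0, 1, 1, 1, 7, 15, 29, 29, 29, 44, 120, 360, 360]

print("after removing durations below 5 min:", drop_low_duration(durations, 5))
print("values covering >= 20% of records:", duration_spikes(durations, 0.20))
```

In practice each candidate removal would be followed by re-training the classifiers and checking that the F1-score stays above the acceptable threshold, as described above.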
4.3. Multi-class classification
While binary classification can provide fast insights into the overall incident duration, traffic incidents can be given a more precise duration definition and can be split (based on the histogram profiling) into short-term, mid-term and long-term.
In this case one needs to solve a multi-class classification problem which can contain 3 equally-sized classes (based on duration percentiles of almost 33% from each data set).
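A split into three equally-sized classes can be sketched with a nearest-rank percentile rule. The rule, the function names and the sample durations are assumptions made for illustration; the exact percentile convention used to build the classes is not stated in this section.

```python
from typing import List, Tuple


def tercile_edges(durations: List[float]) -> Tuple[float, float]:
    """Upper edges of class 1 and class 2 (nearest-rank 33rd and 66th percentiles)."""
    ordered = sorted(durations)
    n = len(ordered)
    # ceil(n * k / 3) - 1, written with integer arithmetic to avoid float rounding
    first = ordered[-(-n // 3) - 1]
    second = ordered[-(-2 * n // 3) - 1]
    return first, second


def assign_class(duration: float, edges: Tuple[float, float]) -> int:
    """Return 1 (short-term), 2 (mid-term) or 3 (long-term)."""
    if duration <= edges[0]:
        return 1
    if duration <= edges[1]:
        return 2
    return 3


# Hypothetical durations in minutes (illustrative only).
durations = [5, 10, 15, 25, 30, 40, 60, 120, 700]
edges = tercile_edges(durations)
labels = [assign_class(d, edges) for d in durations]
print("class edges:", edges)
print("labels:", labels)
```

Because the edges come from the data set itself, each data set ends up with different class boundaries, which is what Table 2 reports.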
We use F1-macro to assess the performance of a multi-class classification, defined as the unweighted average of class-wise F1-scores:
$$\text{F1-macro} = \frac{1}{N}\sum_{i=1}^{N} \text{F1}_i \qquad (11)$$
where $i$ is the class index and $N$ is the number of classes.
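As a minimal sketch of Eq. (11), the F1-macro score can be computed directly from true and predicted class labels. The label vectors below are hypothetical, and the convention of returning 0 for a class with no true or predicted members is an assumption.

```python
from typing import List, Sequence


def f1_for_class(y_true: Sequence[int], y_pred: Sequence[int], label: int) -> float:
    """F1 score of one class, treating it as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    denominator = 2 * tp + fp + fn
    # Assumed convention: a class that never appears scores 0.
    return 2 * tp / denominator if denominator else 0.0


def f1_macro(y_true: Sequence[int], y_pred: Sequence[int], labels: List[int]) -> float:
    """Unweighted average of the class-wise F1 scores (Eq. 11)."""
    return sum(f1_for_class(y_true, y_pred, label) for label in labels) / len(labels)


# Hypothetical 3-class labels: 1 = short-term, 2 = mid-term, 3 = long-term.
y_true = [1, 1, 2, 2, 3, 3]
y_pred = [1, 2, 2, 2, 3, 1]
print(f"F1-macro = {f1_macro(y_true, y_pred, [1, 2, 3]):.3f}")
```

Since every class has the same weight, a poorly predicted class lowers the F1-macro score regardless of how many records it contains.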
Table 2 contains the F1-macro scores across all three data sets for a 3-class prediction problem which can be calculated across each data set independently.
For example, class 1 for data set AR in Sydney contains incidents between 0 and 24min, while class 1 for the SF data set contains incidents between 0 and 30min; similarly, class 3 for the SF data set contains substantial incidents which can reach up to 2,715min (45h) (this is considerably larger than 710min or 595min in Australia).
The F1-macro score is aggregated across all classes, and a low value (below 0.5) indicates that we cannot use a 3-class split for the data set AR (F1-macro=0.35) and M (F1-macro=0.46), but we can do so for the data set SF (F1-macro=0.72).
A significant difference between these data sets is the number of records (584 incident records for the data set AR versus 8,754 records for the data set SF), which may affect model performance.
However, each data set’s specificity seems to dictate the best classification approach to be done and further justifies the need for a more refined regression prediction approach.
| Dataset | Class 1 [0-33%] | Class 2 [33-66%] | Class 3 [66-100%] | F1-macro (3-class) | F1 (2-class) |
|---|---|---|---|---|---|
| Data set AR | 0-24 min | 25-44 min | 44-710 min | 0.35 | 0.79 |
| Data set M | 0-24 min | 25-54 min | 54-598 min | 0.46 | 0.74 |
| Data set SF | 0-30 min | 31-71 min | 72-2,715 min | 0.72 | 0.85 |

Table 2 Multi-class classification results for equally-sized 3-class split
5. Incident duration prediction using regression: results
The final objective of the bi-level framework is to predict, with an accuracy at the minute level, the length of a freshly reported incident, regardless of its previous classification as either short, medium or long.
Therefore, the second step of the bi-level prediction framework is to develop more advanced regression models that can adjust to each data set independently and outperform the baseline ML models previously used to solve classification problems.
Due to the long tail distribution of incident duration and the class imbalance problem previously identified, we need to design and construct various regression models capable of learning from various types of data sets to make accurate predictions.
However, with limited information (small data set size), the prediction results can be skewed (this effect of prediction skewing will be further discussed).
This section first presents the regression results obtained across several scenarios of model training, validation and testing, followed by results of our proposed Intra-Extra Optimisation algorithm applied over all baseline ML models.
5.1. Regression scenarios results and comparison
In order to find the best set-up that works for traffic incident prediction in TMCs, we test various regression scenarios (detailed previously in Section 3.5), which show the extrapolation performance for different ML methods.
The outlier removal procedures (LDO, HDO) together with the classification thresholds (which separate short-term and long-term duration of incidents) are selected as described in Section 4.1-Section 4.2.
The primary purpose of this section is to recommend the best scenario set-up for model training and validation when parts of the data set might be hidden.
Table 3, Table 4 and Table 5 present the MAPE results for all 7 scenarios (All-to-All, AtoA, AtoB, BtoB, BtoA, AlltoA, AlltoB) using all the baseline ML models across all three data sets (with the winning regression model for each scenario in the last column).
1) the improvement from using XGBoost shows the lowest MAPE for scenario AtoA of 49.11 and 67.92 respectively (predicting short-term incidents using only short-term training information),
The main difference between LGBM and XGBoost results is that LGBM struggles with extrapolation to lower values, as seen in scenario B-to-A for all data sets: 292.68% vs 77.66% MAPE for data set A, 663.12% vs 180.77% MAPE for data set M, and 166.06% vs 32.62% MAPE for data set SF, for LGBM and XGBoost respectively.
In the SF data set, the LGBM is the best performer reaching a MAPE of 9.34% for the AtoA scenario (which is almost 10 times better than the same scenario for the M data set) and 33.16% MAPE for All-to-All scenario.
This is a significant improvement that reveals what model is adapting to what data set, but most importantly, that each data set reacts differently to the seven scenarios.
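The MAPE values quoted in this section can exceed 100%, which is worth illustrating. The sketch below assumes the standard definition of the mean absolute percentage error; the function name and the sample values are hypothetical, and rejecting non-positive true durations is an assumption made here because the ratio is undefined for a duration of 0.

```python
from typing import Sequence


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(t <= 0 for t in y_true):
        raise ValueError("MAPE is undefined for non-positive true durations")
    total = sum(abs(t - p) / t for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


# Hypothetical durations in minutes (illustrative only).
actual = [10, 20, 40]
close_prediction = [12, 15, 50]
overshooting_prediction = [60, 80, 90]  # e.g. a model trained on long incidents only

print(f"close predictions:       MAPE = {mape(actual, close_prediction):.1f}%")
print(f"overshooting predictions: MAPE = {mape(actual, overshooting_prediction):.1f}%")
```

Because the error is divided by the true duration, over-predicting short incidents is penalised heavily, which is consistent with the very large errors reported when models trained on long incidents are asked to predict short ones.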
Table 3 MAPE results for all 7 scenarios on data set AR
Scenario AtoA uses short-term traffic accidents (below the threshold) for both training and prediction.
XGBoost performs notably well on the AR and M data sets compared with other scenarios; more specifically, it outperforms all other models by 10% on data set AR (MAPE=51.2) and by 30% on data set M (MAPE=68.4).
The comparison of scenarios AtoA and AlltoA shows that adding incidents with a longer duration can severely affect the prediction performance across all data sets, regardless of the size or location of the incident logs.
Table 5 MAPE results for all 7 scenarios on data set SF
performance on data sets AR, M and SF, we need to split the data and use separate models for the short-term incidents as predictions become skewed towards longer incident duration.
Thus, if we predict short-term incidents using only short-term incidents data logs, we obtain a higher accuracy across all datasets.
Scenario AtoB is unique because regression models are trained on Subset A, which contains short-term incident duration logs while they are trying to predict long-term incidents; therefore, the performance is much worse than for AtoA scenario since incidents with long duration are much scarcer and have unique traffic conditions.
The BtoB scenario shows a lower error than AtoB across all three data sets (e.g. BtoB provides 23.69% MAPE and AtoB provides 65.53% MAPE for the best models on data set SF).
Conversely, Scenario BtoA shows very high errors across all methods when extrapolating to lower values.
Adding short-term incidents into the training set of long-term incidents (when we move from BtoA to AlltoA scenario) significantly reduces the error (76.71% MAPE for scenario AlltoA, data set M using XGBoost), but it is still significantly higher than for AtoA scenario (67.92% MAPE for M data set using XGBoost).
Scenario BtoB shows better performance (e.g. MAPE=31.18% for data set M using XGBoost) than using data addition (such as the case of AlltoB, where MAPE=34.21% using best model) or any extrapolation (as in the case of AtoB, where MAPE=68.62% using best model).
By comparing scenarios AtoB and AlltoB we observe a significant performance improvement when adding data for long-term incidents and predicting subset B (from 63.82% to 31.67% MAPE for dataset AR using best model), where error is still higher than for BtoB (25.03%, AR, best model).
Scenario BtoA shows high prediction errors across all scenarios, highlighting a bad extrapolation accuracy when predicting short-term incident durations using long-term traffic incident data.
It means that prediction of the duration of short-term incidents should be performed separately from long-term incidents.
Thus, we cannot use long-term incidents to predict the duration of short-term incidents, and vice versa, if we want to maximise model performance with a limited data set; the second reason lies mainly in the different traffic behaviour associated with severe accidents, which can last for several hours and are harder to clear off: their duration can only be predicted from similar previous events.
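The seven scenarios compared above differ only in which subset is used for training and which subset is predicted. The sketch below encodes that mapping as it can be read from the scenario names and from the descriptions in this section (the formal definitions are in Section 3.5 and are not reproduced here). The helper names and the sample durations are hypothetical, and real records would carry feature columns in addition to the duration.

```python
from typing import Dict, List, Tuple

# scenario name -> (subset used for training, subset being predicted)
SCENARIOS: Dict[str, Tuple[str, str]] = {
    "AlltoAll": ("All", "All"),
    "AtoA": ("A", "A"),
    "AtoB": ("A", "B"),
    "BtoB": ("B", "B"),
    "BtoA": ("B", "A"),
    "AlltoA": ("All", "A"),
    "AlltoB": ("All", "B"),
}


def pick(durations: List[float], subset: str, threshold: float) -> List[float]:
    """Select Subset A (<= threshold), Subset B (> threshold) or all records."""
    if subset == "A":
        return [d for d in durations if d <= threshold]
    if subset == "B":
        return [d for d in durations if d > threshold]
    return list(durations)


def scenario_sets(
    train: List[float], test: List[float], scenario: str, threshold: float
) -> Tuple[List[float], List[float]]:
    """Filter a train/test split according to the chosen scenario."""
    train_subset, test_subset = SCENARIOS[scenario]
    return pick(train, train_subset, threshold), pick(test, test_subset, threshold)


# Hypothetical durations in minutes (illustrative only).
train = [5, 20, 40, 50, 90, 300]
test = [10, 45, 60, 120]

for name in SCENARIOS:
    tr, te = scenario_sets(train, test, name, threshold=45)
    print(f"{name:>8}: train on {len(tr)} records, predict {len(te)} records")
```

Scenarios such as AtoB and BtoA force the model to extrapolate outside the duration range seen during training, which is where the largest errors are reported.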
5.2. Outcomes and recommendations
Scenario modelling shows that the baseline ML models do not improve when facing incident duration extrapolation or data addition (e.g. AtoA versus AlltoA, BtoB versus AlltoB); these two training set-ups badly affect the model performance when extrapolating in any direction.
Therefore, it is essential for the bi-level framework and traffic incident duration prediction to use separate models for short-term and long-term traffic incidents.
that requires more advanced investigations.
This aspect was the one that motivated our research to further improve and build a better ML framework for any type of incoming data set, and the results of this novel IEO-ML framework are further detailed in the following section.
5.3. Regression results for proposed IEO-ML model
In this section, we employ our proposed Intra-extra joint optimisation approach previously presented in Section 3.7 and we further present the results of the All-to-All regression scenario, with a log-transformation of incident duration and several outlier removal techniques such as the LocalOutlierFactor (LOF) and the IsolationForest (IF), previously described in Section 3.6.
Table 6 MAPE results for All-to-All scenario of data set A, using different ORM approaches and incident duration transformation, via the proposed IEO-ML approach.
For the data set A (Table 6), we observe a significant impact of using the log-transformation of the incident duration vector via the resulting MAPE (see Unprocessed versus Log columns).
When comparing results across all models, both regular and reinforced by our IEO approach (column comparison - see Best results), we observe that XGBoost is the best performing baseline model for this data set, reaching a MAPE of 59.4.
Furthermore, when comparing results across regular ML models versus our proposed IEO-ML enhancements (row comparison), then the extra optimisation approaches seem to outperform the intra optimisation approaches (see iIF-Log versus eIF-Log and eLOF-Log versus iLOF-Log columns).
The last column indicates the best approach across all proposed IEO approaches; for example, the eLOF-Log-RF model is read as the extra optimisation method applied together with the Local Outlier Factor and Random Forest over the log-scale data transformation.
For this data set A, results indicate a similar performance between using baseline ML models with log-transformation and the enhanced IEO-ML; for example, the joint optimisation provides an improvement (eLOF-Log-LightGBM, eLOF-Log-RF) over the cases when only the baseline ML with the log-transformation was used (e.g. Log-LR, Log-GBDT).
However, the A data set is very small and has a special behaviour when compared to the others as further results revealed.
Table 7 MAPE results for All-to-All scenario of data set M, using different ORM approaches and incident duration transformation, via the proposed IEO-ML approach.
For the data set M (Table 7), when we use Log-transformation, we observe very high MAPE scores (100% and higher), except for XGBoost, which provides a MAPE of 78.6%.
the IEO enhancements as well (column comparison), using XGBoost as a baseline seems to outperform all the other approaches, with the best result being MAPE=77.5 for iIF-Log-XGBoost.
When comparing against the proposed approaches (row comparison), the Intra joint optimisation using Isolation Forest with log-transformation shows the best performance on this data set for four models (iIF-Log-LightGBM, iIF-Log-LR, iIF-Log-kNN, iIF-Log-XGBoost), which can be attributed to the structure of the data set: its outliers can be better analysed using tree-based outlier removal methods rather than the distance-based LOF.
Table 8 MAPE results for All-to-All scenario of data set SF, using different approaches for ORM and incident duration transformation, via the proposed IEO-ML approach.
For the data set SF (Table 8), we observe two competing models, LightGBM and Random Forests, with a prevalence for Random Forests (column comparison - see Best results).
Also, we observe a considerably lower MAPE score for the best performing models, which reached the lowest value of 28.7 across all the data sets used in this study.
When comparing the IEO approaches (row comparison), the intra joint optimisation shows improvement across three models and, more specifically, for the best performing model on this data set, RF.
One consistent finding across all results is that the log-transformation of the incident duration vector should be used at all times for incident duration prediction since it significantly improves prediction accuracy; this is mostly related to the long tail distribution and extreme outliers which can affect the final errors in the model performance evaluation.
Overall, the best performing models are considered to be XGBoost and Random Forests.
Overall, we showed that our proposed intra joint optimisation improves the regression results across multiple data sets (especially data sets M and SF, in 7 out of 12 cases).
The joint optimisation of the model together with the outlier removal method shows a significant improvement in the majority of cases (12 out of 18) across all three data sets.
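The benefit of the log-transformation on long-tailed durations can be illustrated with a toy comparison. The sketch assumes a `log1p`/`expm1` pair so that 0-minute durations remain valid (the exact transform used in this paper is not restated in this section), and it compares two constant predictors on hypothetical durations: the mean in the original scale and the back-transformed mean in the log scale.

```python
import math
from typing import List, Sequence


def to_log(durations: Sequence[float]) -> List[float]:
    """Forward transform applied to the duration target before training."""
    return [math.log1p(d) for d in durations]


def from_log(values: Sequence[float]) -> List[float]:
    """Inverse transform applied to predictions to get minutes back."""
    return [math.expm1(v) for v in values]


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent (true values must be positive)."""
    return 100.0 * sum(abs(t - p) / t for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical long-tailed durations in minutes (illustrative only).
durations = [10, 15, 20, 30, 45, 60, 600]

raw_mean = sum(durations) / len(durations)
log_mean = from_log([sum(to_log(durations)) / len(durations)])[0]

mape_raw = mape(durations, [raw_mean] * len(durations))
mape_log = mape(durations, [log_mean] * len(durations))
print(f"constant predictor, raw scale: {raw_mean:.1f} min, MAPE = {mape_raw:.0f}%")
print(f"constant predictor, log scale: {log_mean:.1f} min, MAPE = {mape_log:.0f}%")
```

The single 600-minute incident pulls the raw-scale mean far above the typical duration, whereas the log-scale estimate stays close to the bulk of the records; the same mechanism applies to regression models fitted on the transformed target.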
6. Feature importance impact and evaluation
Finally, we evaluate the feature importance using a Shapley value calculation in order to estimate the contribution of each feature to the final prediction score.
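To make the idea of a Shapley value concrete, the sketch below computes exact Shapley values by brute force, averaging the marginal contribution of each feature over all feature orderings. This is only practical for a handful of features and is not the algorithm used by SHAP for tree models, which relies on much more efficient computations; the toy value function and feature names are hypothetical.

```python
from itertools import permutations
from typing import Callable, Dict, FrozenSet, List


def shapley_values(
    value_fn: Callable[[FrozenSet[str]], float], features: List[str]
) -> Dict[str, float]:
    """Exact Shapley values: average marginal contribution over all orderings."""
    contributions = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        coalition: FrozenSet[str] = frozenset()
        for feature in order:
            extended = coalition | {feature}
            contributions[feature] += value_fn(extended) - value_fn(coalition)
            coalition = extended
    return {f: total / len(orderings) for f, total in contributions.items()}


def toy_duration(known: FrozenSet[str]) -> float:
    """Hypothetical predicted duration (minutes) given the set of known features."""
    prediction = 30.0  # baseline when no feature is known
    if "hour" in known:
        prediction += 10.0
    if "lanes" in known:
        prediction += 5.0
    if "hour" in known and "lanes" in known:
        prediction += 4.0  # interaction shared between the two features
    return prediction


phi = shapley_values(toy_duration, ["hour", "lanes"])
print(phi)
```

The contributions sum to the difference between the full prediction and the baseline, which is the property that lets each point in a SHAP plot be read as the effect of one feature on the predicted duration.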
Each point related to a feature is shown in Fig 8 and represents the SHAP value score (Oy-axis), coloured by its value (from low to high), while the Ox-axis shows the impact of that feature information on the entire prediction output.
The hour-of-the-day when the incident started is among the top 5 features sorted by importance (ranked in 1st place for data set A, 3rd for M and 4th for SF).
For example, Fig 8a) showcases that as the hour of the day increases (getting closer to midnight) the incident durations are lower, as the congestion is lower and rescue teams arrive faster at the accident location; the opposite holds on the motorways, as Fig 8b) reflects that rescue teams have a harder time reaching the incident location in the evening, which is mostly explained by the large distance of the motorway from the local incident management centre.
The Ox-axis on SHAP plots represents the impact on model output (e.g. the effect on the predicted duration value).
Figure 8: Feature importance for All-to-All regression using XGBoost for a) Arterial roads, Sydney, Australia, b) M7 motorway, Sydney, Australia, c) San-Francisco, USA.
Even though the average temperature is considered significant, its effect on the regression model output is very small: [−5; +5] for data set AR, [−5; +5] for data set M, and [−25; +25] for data set SF.
The distance from CBD (DistanceCBD) is important in the data set A, as it can point at some problematic areas, therefore causing a higher incident duration.
The number of affected lanes is also an important feature for incident duration prediction on arterial roads in Sydney.
The model outputs for the M7 motorway revealed that the predicted duration is highly dependent on the sector ID (similar to the traffic zones in the data set A), which may be linked to the nature of the location or to the distance from incident management agencies.
The average daily temperature also affects predictions (3rd place in A, 7th in M and 6th in SF).
Weather factors (rainfall) are found to play a significant role in the M and SF data sets (humidity and barometric pressure may be predictors of rainfall).
The length of the affected road segment (Distance in SF) may also be an essential feature which is not found in Sydney data sets.
Overall, the specificity of each data set is reflected once again not only in the models that may be more successful than others but also in the way that the same model can provide various feature importance due to each country, their unique landscape and different way of dealing with the disruptions.
7. CONCLUSIONS
This paper proposed a novel bi-level framework for predicting the incident durations via a unique combination of baseline machine learning models (for both classification and regression), together with an outlier removal procedure and a novel intra-extra joint optimisation technique.
The accuracy and importance of the proposed approach have been demonstrated on three different data sets from two countries (Australia and the United States of America) under several scenarios for testing and validation.
Major contributions: Firstly, regarding the classification prediction of incidents into short versus long-term: we found that the optimal duration classification thresholds are similar among the three different data sets: 40min for data set AR, 45min for M, 45min for SF.
Sydney TIMS also found 45 minutes to be the threshold for incident removal performance evaluation via their on-the-field expertise; this confirms that our threshold split is consistent with realistic operational rescue times.
Secondly, the best performing and robust models in the classification and regression experiments were the tree-based models (XGBoost, RandomForest, etc).
Fourthly, our proposed IEO-ML approach outperformed baseline ML models in 12 out of 18 cases (66%), showcasing its strong value for the incident duration prediction problem.
Finally, when evaluating the feature importance we showed that features related to time, location, type of the accident, reporting source and weather are among the top 10 critical features in all three data sets.
By improving the precision of the most important features and removing non-important features from the incident reports, TIMS can significantly improve the quality of data acquisition.
Future research can be related to the usage of traffic simulation with information on predicted traffic incident duration included in the decision making process during route planning.
For example, the vehicle can consider that a traffic incident is short-term and assume that it will be cleared before arriving at the incident location and therefore reduce its travel time by not planning a route around the incident site.
Furthermore, the cost of prediction error and the benefit of traffic accident duration estimation can be estimated from the simulation model, where occasional traffic accidents happen within traffic flow.
The findings indicate that RF and kNN seem to be the slowest models to train, whereas LGBM, XGBoost and LR are faster from a computational-time point of view.