A Supervised Machine Learning Model For Imputing Missing Boarding Stops
In Smart Card Data
- URL: http://arxiv.org/abs/2003.05285v2
- Date: Thu, 9 Sep 2021 07:15:33 GMT
- Authors: Nadav Shalit, Michael Fire and Eran Ben-Elia
- Abstract summary: We develop a supervised machine learning method to impute missing boarding stops based on ordinal classification.
Results are based on a case study in the city of Beer Sheva, Israel, consisting of one month of smart card data.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Public transport has become an essential part of urban existence with
increased population densities and environmental awareness. Large quantities of
data are currently generated, allowing for more robust methods to understand
travel behavior by harvesting smart card usage. However, public transport
datasets suffer from data integrity problems; boarding stop information may be
missing due to imperfect acquisition processes or inadequate reporting. We
developed a supervised machine learning method to impute missing boarding stops
based on ordinal classification using GTFS timetable, smart card, and
geospatial datasets. A new metric, Pareto Accuracy, is suggested to evaluate
algorithms where classes have an ordinal nature. Results are based on a case
study in the city of Beer Sheva, Israel, consisting of one month of smart card
data. We show that our proposed method is robust to irregular travelers and
significantly outperforms well-known imputation methods without the need to
mine any additional datasets. Validation on data from another Israeli city
using transfer learning shows that the presented model is general and context-free.
The implications for transportation planning and travel behavior research are
further discussed.
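The abstract names Pareto Accuracy as a metric for ordinal classes but does not define it. As a rough illustration of why ordinal classes call for a different measure than plain accuracy, the sketch below computes a tolerance-based accuracy: the share of predictions that land within a given number of classes of the true one. This is an assumed reading, not the paper's definition; the function name `tolerance_accuracy`, the `tolerance` parameter, and the sample values are all hypothetical.

```python
from typing import Sequence


def tolerance_accuracy(
    y_true: Sequence[int], y_pred: Sequence[int], tolerance: int = 0
) -> float:
    """Share of predictions whose ordinal error is at most `tolerance` classes.

    Assumption: classes are integers whose differences are meaningful,
    e.g. a stop's position along a route. Not taken from the paper.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("inputs must be non-empty")
    # Count predictions that fall within the allowed ordinal distance.
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= tolerance)
    return hits / len(y_true)


# Illustrative stop positions along a route (made-up values).
actual = [0, 2, 5, 3]
predicted = [0, 3, 5, 6]

exact = tolerance_accuracy(actual, predicted, tolerance=0)
within_one = tolerance_accuracy(actual, predicted, tolerance=1)
print(exact, within_one)
```

With `tolerance=0` the measure reduces to ordinary accuracy; raising the tolerance credits near misses, which matters when predicting the adjacent stop is far less costly than predicting a distant one.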
Related papers
- A Survey of Learning on Small Data: Generalization, Optimization, and Challenge [101.27154181792567]
Learning on small data that approximates the generalization ability of big data is one of the ultimate purposes of AI.
This survey follows the active sampling theory under a PAC framework to analyze the generalization error and label complexity of learning on small data.
Multiple data applications that may benefit from efficient small data representation are surveyed.
(2022-07-29T02:34:19Z)
- Learning Citywide Patterns of Life from Trajectory Monitoring [0.0]
We learn patterns of life by monitoring a data stream for anomalies and explicitly extracting normal patterns over time.
We mine patterns-of-interest from the Porto taxi dataset, including both major public holidays and newly-discovered transportation anomalies.
We anticipate that the capability to incrementally learn normal and abnormal road transportation behavior will be useful in many domains, including smart cities, autonomous vehicles, and urban planning and management.
(2022-06-30T15:28:15Z)
- Meta-Learning over Time for Destination Prediction Tasks [53.12827614887103]
A need to understand and predict vehicles' behavior underlies both public and private goals in the transportation domain.
Recent studies have found, at best, only marginal improvements in predictive performance from incorporating temporal information.
We propose an approach based on hypernetworks, in which a neural network learns to change its own weights in response to an input.
(2022-06-29T17:58:12Z)
- Predicting Seriousness of Injury in a Traffic Accident: A New Imbalanced Dataset and Benchmark [62.997667081978825]
The paper introduces a new dataset to assess the performance of machine learning algorithms in the prediction of the seriousness of injury in a traffic accident.
The dataset is created by aggregating publicly available datasets from the UK Department for Transport.
(2022-05-20T21:15:26Z)
- Feel Old Yet? Updating Mode of Transportation Distributions from Travel Surveys using Data Fusion with Mobile Phone Data [0.0]
Transport systems typically rely on traditional data sources providing outdated mode-of-travel data.
We propose a method that leverages mobile phone data as a cost-effective rich source of geospatial information.
Our analysis revealed significant changes in transportation patterns between 2012 and 2020 in Santiago, Chile.
(2022-04-20T14:27:58Z)
- Injecting Knowledge in Data-driven Vehicle Trajectory Predictors [82.91398970736391]
Vehicle trajectory prediction tasks have been commonly tackled from two perspectives: knowledge-driven or data-driven.
In this paper, we propose to learn a "Realistic Residual Block" (RRB) which effectively connects these two perspectives.
Our proposed method outputs realistic predictions by confining the residual range and taking into account its uncertainty.
(2021-03-08T16:03:09Z)
- A Data-Driven Analytical Framework of Estimating Multimodal Travel Demand Patterns using Mobile Device Location Data [5.902556437760098]
This paper presents a data-driven analytical framework to extract multimodal travel demand patterns from smartphone location data.
A jointly trained single-layer model and deep neural network for travel mode imputation is developed.
The framework also incorporates the multimodal transportation network in order to evaluate the closeness of trip routes to the nearby rail, metro, highway and bus lines.
(2020-12-08T22:49:44Z)
- Deploying machine learning to assist digital humanitarians: making image annotation in OpenStreetMap more efficient [72.44260113860061]
We propose an interactive method to support and optimize the work of volunteers in OpenStreetMap.
The proposed method greatly reduces the amount of data that the volunteers of OSM need to verify/correct.
(2020-09-17T10:05:30Z)
- Hidden Footprints: Learning Contextual Walkability from 3D Human Trails [70.01257397390361]
Current datasets only tell you where people are, not where they could be.
We first augment the set of valid, labeled walkable regions by propagating person observations between images, utilizing 3D information to create what we call hidden footprints.
We devise a training strategy designed for such sparse labels, combining a class-balanced classification loss with a contextual adversarial loss.
(2020-08-19T23:19:08Z)
- Leveraging the Self-Transition Probability of Ordinal Pattern Transition Graph for Transportation Mode Classification [0.0]
We propose the use of a feature retained from the Ordinal Pattern Transition Graph, called the probability of self-transition for transportation mode classification.
The proposed feature presents better accuracy results than Permutation Entropy and Statistical Complexity, even when these two are combined.
(2020-07-16T23:25:09Z)
- Station-to-User Transfer Learning: Towards Explainable User Clustering Through Latent Trip Signatures Using Tidal-Regularized Non-Negative Matrix Factorization [4.713006935605146]
This work focuses on mobility data and how it will help improve our understanding of urban mobility patterns.
We propose a Collective Learning Framework through Latent Representation, which augments user-level learning with collective patterns learned from station-level signals.
We provide a qualitative analysis of the station functions and user profiles for the Washington D.C. metro and show how our method supports intra-city mobility exploration.
(2020-04-27T14:13:56Z)