The NetMob25 Dataset: A High-resolution Multi-layered View of Individual Mobility in Greater Paris Region
- URL: http://arxiv.org/abs/2506.05903v1
- Date: Fri, 06 Jun 2025 09:22:21 GMT
- Title: The NetMob25 Dataset: A High-resolution Multi-layered View of Individual Mobility in Greater Paris Region
- Authors: Alexandre Chasse, Anne J. Kouam, Aline C. Viana, Razvan Stanica, Wellington V. Lobato, Geymerson Ramos, Geoffrey Deperle, Abdelmounaim Bouroudi, Suzanne Bussod, Fernando Molano,
- Abstract summary: This paper describes the survey design, collection protocol, processing methodology, and characteristics of the released dataset. The dataset includes three components: (i) an Individuals database describing demographic, socioeconomic, and household characteristics; (ii) a Trips database with over 80,000 annotated displacements including timestamps, transport modes, and trip purposes; and (iii) a Raw GPS Traces database comprising about 500 million high-frequency points.
- Score: 64.30214722988666
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: High-quality mobility data remains scarce despite growing interest from researchers and urban stakeholders in understanding individual-level movement patterns. The NetMob25 Data Challenge addresses this gap by releasing a unique GPS-based mobility dataset derived from the EMG 2023 GNSS-based mobility survey conducted in the Ile-de-France region (Greater Paris area), France. This dataset captures detailed daily mobility over a full week for 3,337 volunteer residents aged 16 to 80, collected between October 2022 and May 2023. Each participant was equipped with a dedicated GPS tracking device configured to record location points every 2-3 seconds and was asked to maintain a digital or paper logbook of their trips. All inferred mobility traces were algorithmically processed and validated through follow-up phone interviews. The dataset includes three components: (i) an Individuals database describing demographic, socioeconomic, and household characteristics; (ii) a Trips database with over 80,000 annotated displacements including timestamps, transport modes, and trip purposes; and (iii) a Raw GPS Traces database comprising about 500 million high-frequency points. A statistical weighting mechanism is provided to support population-level estimates. An extensive anonymization pipeline was applied to the GPS traces to ensure GDPR compliance while preserving analytical value. Access to the dataset requires acceptance of the challenge's Terms and Conditions and signing a Non-Disclosure Agreement. This paper describes the survey design, collection protocol, processing methodology, and characteristics of the released dataset.
Related papers
- Extracting Insights from Large-Scale Telematics Data for ITS Applications: Lessons and Recommendations [0.0]
Transportation planners have previously utilized telematics data in various forms, but its current scale offers significant new opportunities. This paper takes a step towards addressing these needs through four primary objectives. First, a data processing pipeline was built to efficiently analyze 1.4 billion miles (120 million trips) of telematics data collected in Virginia between August 2021 and August 2022. Second, an open data repository of trip and roadway segment level summaries was created. Third, interactive visualization tools were designed to extract insights from these data about trip-taking behavior and the speed profiles of roadways.
arXiv Detail & Related papers (2025-07-18T14:09:40Z)
- Beyond 9-to-5: A Generative Model for Augmenting Mobility Data of Underrepresented Shift Workers [12.610498232333871]
Shift workers comprise 15-20% of the workforce in industrialized societies. Our approach generates complete, behaviorally valid activity patterns for individuals working non-standard hours. By transforming incomplete GPS traces into complete, representative activity patterns, our approach provides transportation planners with a powerful data augmentation tool.
arXiv Detail & Related papers (2025-07-17T02:33:30Z)
- EMT: A Visual Multi-Task Benchmark Dataset for Autonomous Driving [8.97091577113286]
Emirates Multi-Task dataset is designed to support multi-task benchmarking within a unified framework. It comprises over 30,000 frames from a dash-camera perspective and 570,000 annotated bounding boxes, covering approximately 150 kilometers of driving routes.
arXiv Detail & Related papers (2025-02-26T16:06:35Z)
- Enhancing stop location detection for incomplete urban mobility datasets [0.0]
This study investigates the application of classification algorithms to enhance density-based methods for stop identification.
Our approach incorporates multiple features, including individual routine behavior across various time scales and local characteristics of individual GPS points.
arXiv Detail & Related papers (2024-07-16T10:41:08Z)
- Reconsidering utility: unveiling the limitations of synthetic mobility data generation algorithms in real-life scenarios [49.1574468325115]
We evaluate the utility of five state-of-the-art synthesis approaches in terms of real-world applicability.
We focus on so-called trip data that encode fine granular urban movements such as GPS-tracked taxi rides.
One model fails to produce data within reasonable time and another generates too many jumps to meet the requirements for map matching.
arXiv Detail & Related papers (2024-07-03T16:08:05Z)
- Reconstructing human activities via coupling mobile phone data with location-based social networks [20.303827107229445]
We propose a data analysis framework to identify user's activity via coupling the mobile phone data with location-based social networks (LBSN) data.
We reconstruct the activity chains of 1,000,000 active mobile phone users and analyze the temporal and spatial characteristics of each activity type.
arXiv Detail & Related papers (2023-06-06T06:37:14Z)
- Traffic4cast at NeurIPS 2022 -- Predict Dynamics along Graph Edges from Sparse Node Data: Whole City Traffic and ETA from Stationary Vehicle Detectors [25.857884532427292]
Traffic4cast is a competition series that advances machine learning for modeling complex spatial systems over time.
Our dynamic road graph data combine information from road maps, $10^{12}$ probe data points, and stationary vehicle detectors in three cities over the span of two years.
In the core challenge, participants are invited to predict the likelihoods of three congestion classes derived from the speed levels in the GPS data for the entire road graph in three cities 15 min into the future.
For the extended challenge, participants are tasked to predict the average travel times on super-segments 15 min into the future.
arXiv Detail & Related papers (2023-03-14T10:03:37Z)
- Navya3DSeg -- Navya 3D Semantic Segmentation Dataset & split generation for autonomous vehicles [63.20765930558542]
3D semantic data are useful for core perception tasks such as obstacle detection and ego-vehicle localization.
We propose a new dataset, Navya 3D (Navya3DSeg), with a diverse label space corresponding to a large scale production grade operational domain.
It contains 23 labeled sequences and 25 supplementary sequences without labels, designed to explore self-supervised and semi-supervised semantic segmentation benchmarks on point clouds.
arXiv Detail & Related papers (2023-02-16T13:41:19Z)
- Argoverse 2: Next Generation Datasets for Self-Driving Perception and Forecasting [64.7364925689825]
Argoverse 2 (AV2) is a collection of three datasets for perception and forecasting research in the self-driving domain.
The Lidar dataset contains 20,000 sequences of unlabeled lidar point clouds and map-aligned pose.
The Motion Forecasting dataset contains 250,000 scenarios mined for interesting and challenging interactions between the autonomous vehicle and other actors in each local scene.
arXiv Detail & Related papers (2023-01-02T00:36:22Z)
- Pseudo-PFLOW: Development of nationwide synthetic open dataset for people movement based on limited travel survey and open statistical data [4.243926243206826]
People flow data are utilized in diverse fields such as urban and commercial planning and disaster management.
This study developed pseudo-people-flow data covering all of Japan by combining public statistical and travel survey data.
arXiv Detail & Related papers (2022-05-02T05:13:53Z)
- Traffic4cast at NeurIPS 2021 -- Temporal and Spatial Few-Shot Transfer Learning in Gridded Geo-Spatial Processes [61.16854022482186]
The IARAI Traffic4cast competitions at NeurIPS 2019 and 2020 showed that neural networks can successfully predict future traffic conditions 1 hour into the future.
U-Nets proved to be the winning architecture, demonstrating an ability to extract relevant features in this complex real-world geo-spatial process.
The competition now covers ten cities over 2 years, providing data compiled from over $10^{12}$ GPS probe data.
arXiv Detail & Related papers (2022-03-31T14:40:01Z)
- Large Scale Interactive Motion Forecasting for Autonomous Driving: The Waymo Open Motion Dataset [84.3946567650148]
With over 100,000 scenes, each 20 seconds long at 10 Hz, our new dataset contains more than 570 hours of unique data over 1750 km of roadways.
We use a high-accuracy 3D auto-labeling system to generate high quality 3D bounding boxes for each road agent.
We introduce a new set of metrics that provides a comprehensive evaluation of both single agent and joint agent interaction motion forecasting models.
arXiv Detail & Related papers (2021-04-20T17:19:05Z)
- Urban Sensing based on Mobile Phone Data: Approaches, Applications and Challenges [67.71975391801257]
Much concern in mobile data analysis is related to human beings and their behaviours.
This work aims to review the methods and techniques that have been implemented to discover knowledge from mobile phone data.
arXiv Detail & Related papers (2020-08-29T15:14:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.