A Data-Driven Supervised Machine Learning Approach to Estimating Global
Ambient Air Pollution Concentrations With Associated Prediction Intervals
- URL: http://arxiv.org/abs/2402.10248v1
- Date: Thu, 15 Feb 2024 11:09:22 GMT
- Title: A Data-Driven Supervised Machine Learning Approach to Estimating Global
Ambient Air Pollution Concentrations With Associated Prediction Intervals
- Authors: Liam J Berrisford, Hugo Barbosa, Ronaldo Menezes
- Abstract summary: We have developed a scalable, data-driven, supervised machine learning framework.
This model is designed to impute missing temporal and spatial measurements, thereby generating a comprehensive dataset for pollutants including NO$_2$, O$_3$, PM$_{10}$, PM$_{2.5}$, and SO$_2$.
The model's performance across various geographical locations is examined, providing insights and recommendations for strategic placement of future monitoring stations.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Global ambient air pollution, a transboundary challenge, is typically
addressed through interventions relying on data from spatially sparse and
heterogeneously placed monitoring stations. These stations often encounter
temporal data gaps due to issues such as power outages. In response, we have
developed a scalable, data-driven, supervised machine learning framework. This
model is designed to impute missing temporal and spatial measurements, thereby
generating a comprehensive dataset for pollutants including NO$_2$, O$_3$,
PM$_{10}$, PM$_{2.5}$, and SO$_2$. The dataset, with a fine granularity of
0.25$^{\circ}$ at hourly intervals and accompanied by prediction intervals for
each estimate, caters to a wide range of stakeholders relying on outdoor air
pollution data for downstream assessments. This enables more detailed studies.
Additionally, the model's performance across various geographical locations is
examined, providing insights and recommendations for strategic placement of
future monitoring stations to further enhance the model's accuracy.
Related papers
- Generating Fine-Grained Causality in Climate Time Series Data for Forecasting and Anomaly Detection [67.40407388422514]
We design a conceptual fine-grained causal model named TBN Granger Causality.
Second, we propose an end-to-end deep generative model called TacSas, which discovers TBN Granger Causality in a generative manner.
We test TacSas on climate benchmark ERA5 for climate forecasting and the extreme weather benchmark of NOAA for extreme weather alerts.
arXiv Detail & Related papers (2024-08-08T06:47:21Z)
- Urban Air Pollution Forecasting: a Machine Learning Approach leveraging Satellite Observations and Meteorological Forecasts [0.11249583407496218]
Air pollution poses a significant threat to public health and well-being, particularly in urban areas.
This study introduces a series of machine-learning models that integrate data from the Sentinel-5P satellite, meteorological conditions, and topological characteristics to forecast future levels of five major pollutants.
arXiv Detail & Related papers (2024-05-30T10:02:53Z)
- Back to the Future: GNN-based NO$_2$ Forecasting via Future Covariates [49.93577170464313]
We deal with air quality observations in a city-wide network of ground monitoring stations.
We propose a conditioning block that embeds past and future covariates into the current observations.
We find that conditioning on future weather information has a greater impact than considering past traffic conditions.
arXiv Detail & Related papers (2024-04-08T09:13:16Z)
- Using remotely sensed data for air pollution assessment [0.0]
The main objective of this work is to create models capable of inferring pollutant concentrations in locations where no observation data exists.
A machine learning model was developed for predicting concentrations in the Iberian Peninsula in 2019 for five selected pollutants.
All models presented acceptable cross-validation RMSE, except the O$_3$ and PM$_{10}$ models where the mean value was a little higher.
arXiv Detail & Related papers (2024-02-04T14:27:28Z)
- Observation-Guided Meteorological Field Downscaling at Station Scale: A Benchmark and a New Method [66.80344502790231]
We extend meteorological downscaling to arbitrary scattered station scales and establish a new benchmark and dataset.
Inspired by data assimilation techniques, we integrate observational data into the downscaling process, providing multi-scale observational priors.
Our proposed method outperforms other specially designed baseline models on multiple surface variables.
arXiv Detail & Related papers (2024-01-22T14:02:56Z)
- Spatial-temporal Forecasting for Regions without Observations [13.805203053973772]
We study spatial-temporal forecasting for a region of interest without any historical observations.
We propose a model named STSM for the task.
Our key insight is to learn from the locations that resemble those in the region of interest.
arXiv Detail & Related papers (2024-01-19T06:26:05Z)
- A Framework for Scalable Ambient Air Pollution Concentration Estimation [0.0]
Ambient air pollution remains a critical issue in the United Kingdom, where data on air pollution concentrations form the foundation for interventions aimed at improving air quality.
We introduce a data-driven supervised machine learning model framework designed to address temporal and spatial data gaps by filling missing measurements.
This approach provides a comprehensive dataset for England throughout 2018 at a 1 km x 1 km hourly resolution.
arXiv Detail & Related papers (2024-01-16T18:03:07Z)
- Residual Corrective Diffusion Modeling for Km-scale Atmospheric Downscaling [58.456404022536425]
State of the art for physical hazard prediction from weather and climate requires expensive km-scale numerical simulations driven by coarser resolution global inputs.
Here, a generative diffusion architecture is explored for downscaling such global inputs to km-scale, as a cost-effective machine learning alternative.
The model is trained to predict 2km data from a regional weather model over Taiwan, conditioned on a 25km global reanalysis.
arXiv Detail & Related papers (2023-09-24T19:57:22Z)
- Multimodal Dataset from Harsh Sub-Terranean Environment with Aerosol Particles for Frontier Exploration [55.41644538483948]
This paper introduces a multimodal dataset from the harsh and unstructured underground environment with aerosol particles.
It contains synchronized raw data measurements from all onboard sensors in Robot Operating System (ROS) format.
The focus of this paper is not only to capture both temporal and spatial data diversities but also to present the impact of harsh conditions on captured data.
arXiv Detail & Related papers (2023-04-27T20:21:18Z)
- Predicting Future Occupancy Grids in Dynamic Environment with Spatio-Temporal Learning [63.25627328308978]
We propose a spatio-temporal prediction network pipeline to generate future occupancy predictions.
Compared to current SOTA, our approach predicts occupancy for a longer horizon of 3 seconds.
We publicly release our grid occupancy dataset based on nuScenes to support further research.
arXiv Detail & Related papers (2022-05-06T13:45:32Z)
- A deep mixture density network for outlier-corrected interpolation of crowd-sourced weather data [3.1542695050861544]
We present a deep learning approach for Bayesian spatio-temporal modelling of environmental variables with automatic outlier detection.
For our example application, we use the Met Office's Weather Observation Website data, an archive of observations from around 1900 privately run and unofficial weather stations across the British Isles.
arXiv Detail & Related papers (2022-01-25T18:54:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.