Prediction of daily maximum ozone levels using Lasso sparse modeling method
- URL: http://arxiv.org/abs/2010.08909v1
- Date: Sun, 18 Oct 2020 02:58:53 GMT
- Title: Prediction of daily maximum ozone levels using Lasso sparse modeling method
- Authors: Jiaqing Lv, Xiaohong Xu
- Abstract summary: This paper applies modern statistical methods in the prediction of the next-day maximum ozone concentration.
The model uses a large number of candidate features, including the present day's hourly concentration level of various pollutants, as well as the meteorological variables.
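The model, trained on three years of data, demonstrates relatively good prediction accuracy: RMSE = 5.63 ppb and MAE = 4.42 ppb for the next-day maximum $O_3$ concentration, and RMSE = 5.68 ppb and MAE = 4.52 ppb for the next-day maximum 8-hour-mean $O_3$ concentration.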
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper applies modern statistical methods in the prediction of the
next-day maximum ozone concentration, as well as the maximum 8-hour-mean ozone
concentration of the next day. The model uses a large number of candidate
features, including the present day's hourly concentration level of various
pollutants, as well as the meteorological variables of the present day's
observation and the future day's forecast values. In order to solve such an
ultra-high dimensional problem, the least absolute shrinkage and selection
operator (Lasso) was applied. The $L_1$ penalty of this method enables
automatic feature dimension reduction and yields a sparse model. The model,
trained on three years of data, demonstrates relatively good prediction accuracy, with
RMSE = 5.63 ppb and MAE = 4.42 ppb for predicting the next day's maximum $O_3$
concentration, and RMSE = 5.68 ppb and MAE = 4.52 ppb for predicting the next day's
maximum 8-hour-mean $O_3$ concentration. Our modeling approach is also compared
with several other methods recently applied in the field and demonstrates
superiority in the prediction accuracy.
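To make the role of the $L_1$ penalty concrete, the following is a minimal sketch of Lasso fitted by cyclic coordinate descent, evaluated with RMSE and MAE. It is not the authors' implementation: the data are synthetic, the penalty weight `lam` is an arbitrary illustrative value, and all function and variable names (`soft_threshold`, `lasso_coordinate_descent`, `make_data`, and so on) are hypothetical. It assumes roughly centred features and no intercept.

```python
import math
import random

random.seed(0)  # fixed seed so the synthetic data are reproducible


def soft_threshold(rho, lam):
    """Soft-thresholding: closed-form solution of the one-dimensional Lasso problem."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iters=50):
    """Minimise (1/(2n)) * ||y - X w||^2 + lam * ||w||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - X w, since w starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iters):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_wj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_wj
    return w


def rmse(y_true, y_pred):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def make_data(n, true_w, noise_sd):
    """Synthetic stand-in for the feature matrix; NOT the paper's ozone data."""
    p = len(true_w)
    X = [[random.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [sum(xj * wj for xj, wj in zip(row, true_w)) + random.gauss(0.0, noise_sd)
         for row in X]
    return X, y


# 30 candidate features, of which only the first three carry signal.
true_w = [3.0, -2.0, 1.5] + [0.0] * 27
X_train, y_train = make_data(200, true_w, 0.5)
X_test, y_test = make_data(100, true_w, 0.5)

w = lasso_coordinate_descent(X_train, y_train, lam=0.2)
y_pred = [sum(xj * wj for xj, wj in zip(row, w)) for row in X_test]
selected = [j for j, wj in enumerate(w) if wj != 0.0]
rmse_val = rmse(y_test, y_pred)
mae_val = mae(y_test, y_pred)

print("selected feature indices:", selected)
print("test RMSE: %.3f, test MAE: %.3f" % (rmse_val, mae_val))
```

Because the soft-thresholding step returns exactly zero whenever a feature's correlation with the residual is below `lam`, uninformative features drop out of the model entirely. A larger `lam` gives a sparser model; in practice the value is commonly chosen by cross-validation.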
Related papers
- Using remotely sensed data for air pollution assessment [0.0]
The main objective of this work is to create models capable of inferring pollutant concentrations in locations where no observation data exists.
A machine learning model was developed for predicting concentrations in the Iberian Peninsula in 2019 for five selected pollutants.
All models presented acceptable cross-validation RMSE, except the $O_3$ and $PM10$ models where the mean value was a little higher.
arXiv Detail & Related papers (2024-02-04T14:27:28Z)
- ExtremeCast: Boosting Extreme Value Prediction for Global Weather Forecast [57.6987191099507]
We introduce Exloss, a novel loss function that performs asymmetric optimization and highlights extreme values to obtain accurate extreme weather forecast.
We also introduce ExBooster, which captures the uncertainty in prediction outcomes by employing multiple random samples.
Our solution can achieve state-of-the-art performance in extreme weather prediction, while maintaining the overall forecast accuracy comparable to the top medium-range forecast models.
arXiv Detail & Related papers (2024-02-02T10:34:13Z)
- Observation-Guided Meteorological Field Downscaling at Station Scale: A Benchmark and a New Method [66.80344502790231]
We extend meteorological downscaling to arbitrary scattered station scales and establish a new benchmark and dataset.
Inspired by data assimilation techniques, we integrate observational data into the downscaling process, providing multi-scale observational priors.
Our proposed method outperforms other specially designed baseline models on multiple surface variables.
arXiv Detail & Related papers (2024-01-22T14:02:56Z)
- DF2: Distribution-Free Decision-Focused Learning [53.2476224456902]
Decision-focused learning (DFL) has recently emerged as a powerful approach for predict-then-optimize problems.
Existing end-to-end DFL methods are hindered by three significant bottlenecks: model error, sample average approximation error, and distribution-based parameterization of the expected objective.
We present DF2, the first distribution-free decision-focused learning method explicitly designed to address these three bottlenecks.
arXiv Detail & Related papers (2023-08-11T00:44:46Z)
- Fast parameter estimation of Generalized Extreme Value distribution using Neural Networks [9.987055028382876]
Generalized extreme-value distribution is a popular choice for modeling extreme events such as floods, droughts, heatwaves, wildfires, etc.
We propose a computationally efficient, likelihood-free estimation method utilizing a neural network.
We show that the proposed neural network-based method provides Generalized Extreme Value (GEV) distribution parameter estimates with comparable accuracy to the conventional maximum likelihood method.
arXiv Detail & Related papers (2023-05-07T17:40:52Z)
- A comparative study of statistical and machine learning models on near-real-time daily emissions prediction [0.0]
The rapid ascent in carbon dioxide emissions is a major cause of global warming and climate change.
This paper aims to select a suitable model to predict the near-real-time daily emissions of all sectors in China from January 1st, 2020 to September 30th, 2022.
arXiv Detail & Related papers (2023-02-02T15:14:27Z)
- Pangu-Weather: A 3D High-Resolution Model for Fast and Accurate Global Weather Forecast [91.9372563527801]
We present Pangu-Weather, a deep learning based system for fast and accurate global weather forecast.
For the first time, an AI-based method outperforms state-of-the-art numerical weather prediction (NWP) methods in terms of accuracy.
Pangu-Weather supports a wide range of downstream forecast scenarios, including extreme weather forecast and large-member ensemble forecast in real-time.
arXiv Detail & Related papers (2022-11-03T17:19:43Z)
- Multivariate Probabilistic Forecasting of Intraday Electricity Prices using Normalizing Flows [62.997667081978825]
In Germany, the intraday electricity price typically fluctuates around the day-ahead price of the EPEX spot markets in a distinct hourly pattern.
This work proposes a probabilistic modeling approach that models the intraday price difference to the day-ahead contracts.
arXiv Detail & Related papers (2022-05-27T08:38:20Z)
- SLOE: A Faster Method for Statistical Inference in High-Dimensional Logistic Regression [68.66245730450915]
We develop an improved method for debiasing predictions and estimating frequentist uncertainty for practical datasets.
Our main contribution is SLOE, an estimator of the signal strength with convergence guarantees that reduces the computation time of estimation and inference by orders of magnitude.
arXiv Detail & Related papers (2021-03-23T17:48:56Z)
- A Novel CMAQ-CNN Hybrid Model to Forecast Hourly Surface-Ozone Concentrations Fourteen Days in Advance [0.19573380763700707]
Currently available numerical modeling systems for air quality predictions can forecast 24 to 48 hours in advance.
We develop a modeling system based on a convolutional neural network (CNN) model that is not only fast but covers a temporal period of two weeks with a resolution as small as a single hour for 255 stations.
Although the primary purpose of this study is the prediction of hourly ozone concentrations, the system can be extended to various other pollutants.
arXiv Detail & Related papers (2020-08-13T16:02:05Z)
- Estimating Basis Functions in Massive Fields under the Spatial Mixed Effects Model [8.528384027684194]
For massive datasets, fixed rank kriging using the Expectation-Maximization (EM) algorithm for estimation has been proposed as an alternative to the usual but computationally prohibitive kriging method.
We develop an alternative method that utilizes the Spatial Mixed Effects (SME) model, but allows for additional flexibility by estimating the range of the spatial dependence between the observations and the knots via an Alternating Expectation Conditional Maximization (AECM) algorithm.
Experiments show that our methodology improves estimation without sacrificing prediction accuracy while also minimizing the additional computational burden of extra parameter estimation.
arXiv Detail & Related papers (2020-03-12T19:36:40Z)