Related papers: Deep Particulate Matter Forecasting Model Using Correntropy-Induced Loss

URL: http://arxiv.org/abs/2106.03032v1
Date: Sun, 6 Jun 2021 05:17:24 GMT
Title: Deep Particulate Matter Forecasting Model Using Correntropy-Induced Loss
Authors: Jongsu Kim and Changhoon Lee
Abstract summary: The maximum correntropy criterion for regression (MCCR) loss is used in an analysis of the statistical characteristics of air pollution and weather data. The MCCR loss is more appropriate than the conventional mean squared error loss for forecasting extreme values.
Score: 1.7797683504485504
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Forecasting the particulate matter (PM) concentration in South Korea has become urgently necessary owing to its strong negative impact on human life. In most statistical or machine learning methods, independent and identically distributed data, for example, a Gaussian distribution, are assumed; however, time series such as air pollution and weather data do not meet this assumption. In this study, the maximum correntropy criterion for regression (MCCR) loss is used in an analysis of the statistical characteristics of air pollution and weather data. Rigorous seasonality adjustment of the air pollution and weather data was performed because of their complex seasonality patterns and the heavy-tailed distribution of data even after deseasonalization. The MCCR loss was applied to multiple models including conventional statistical models and state-of-the-art machine learning models. The results show that the MCCR loss is more appropriate than the conventional mean squared error loss for forecasting extreme values.
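
For readers unfamiliar with the loss named in the abstract, the sketch below compares mean squared error with a correntropy-induced loss on residuals that include one extreme value. It assumes the form commonly given in the MCCR literature, sigma^2 * (1 - exp(-e^2 / sigma^2)); the abstract does not state the exact formula or the value of sigma used in the paper, so the function names, the default sigma, and the sample numbers are illustrative assumptions only.

```python
import numpy as np


def mse_loss(y_true, y_pred):
    """Conventional mean squared error."""
    err = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(err ** 2))


def mccr_loss(y_true, y_pred, sigma=1.0):
    """Correntropy-induced loss (assumed form, not taken from the paper):
    sigma^2 * (1 - exp(-e^2 / sigma^2)), averaged over samples.

    For small residuals this behaves like the squared error; for large
    residuals each sample contributes at most sigma^2, so the loss is bounded.
    """
    err = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    per_sample = sigma ** 2 * (1.0 - np.exp(-(err ** 2) / sigma ** 2))
    return float(np.mean(per_sample))


if __name__ == "__main__":
    # Hypothetical targets and predictions; the last residual is an extreme value.
    y_true = np.array([0.0, 0.0, 0.0])
    y_pred = np.array([0.1, 0.2, 10.0])
    print("MSE :", mse_loss(y_true, y_pred))
    print("MCCR:", mccr_loss(y_true, y_pred, sigma=1.0))
```

Under this assumed form, a single heavy-tailed residual dominates the MSE but adds no more than sigma^2 to the correntropy-induced loss, which is one common motivation for using it on heavy-tailed data.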

Related papers

An Analysis of Temporal Dropout in Earth Observation Time Series for Regression Tasks [4.707950656037167]
We introduce Monte Carlo Temporal Dropout (MC-TD), a method that explicitly accounts for input-level uncertainty by randomly dropping time-steps during inference. We extend this approach with Monte Carlo Concrete Temporal Dropout (MC-ConcTD), a method that learns the optimal dropout distribution directly. Experiments on three EO time-series datasets demonstrate that MC-ConcTD improves predictive performance and uncertainty calibration compared to existing approaches.
arXiv Detail & Related papers (2025-04-09T14:23:04Z)
RDPI: A Refine Diffusion Probability Generation Method for Spatiotemporal Data Imputation [4.251739849724956]
Spatiotemporal data imputation plays a crucial role in various fields such as traffic flow monitoring, air quality assessment and climate prediction. Data collected by sensors often suffer from temporal incompleteness, and the accumulation and uneven distribution leads to missing data. We propose a novel two-stage refined probability imputation framework based on an initial network and a conditional diffusion model.
arXiv Detail & Related papers (2024-12-17T08:06:00Z)
Using Generative Models to Produce Realistic Populations of the United Kingdom Windstorms [0.0]
This dissertation explores the application of generative models to produce realistic synthetic wind field data. Three models, including standard GANs, WGAN-GP, and U-net diffusion models, were employed to generate wind maps of the UK. The results reveal that while all models are effective in capturing the general spatial characteristics, each model exhibits distinct strengths and weaknesses.
arXiv Detail & Related papers (2024-09-16T19:53:33Z)
Risk and cross validation in ridge regression with correlated samples [72.59731158970894]
We provide training examples for the in- and out-of-sample risks of ridge regression when the data points have arbitrary correlations. We further extend our analysis to the case where the test point has non-trivial correlations with the training set, a setting often encountered in time series forecasting. We validate our theory across a variety of high dimensional data.
arXiv Detail & Related papers (2024-08-08T17:27:29Z)
Conditional diffusion models for downscaling & bias correction of Earth system model precipitation [1.5193424827619018]
We propose a novel machine learning framework for simultaneous bias correction and downscaling. Our approach ensures statistical fidelity, preserves large-scale spatial patterns and outperforms existing methods.
arXiv Detail & Related papers (2024-04-05T11:01:50Z)
A Generative Deep Learning Approach for Crash Severity Modeling with Imbalanced Data [6.169163527464771]
This study proposes a crash data generation method based on Conditional Tabular GAN. A crash severity model is employed to estimate the performance of classification and interpretation. The results indicate that using synthetic data generated by CTGAN-RU for crash severity modeling outperforms original data or synthetic data generated by other resampling methods.
arXiv Detail & Related papers (2024-04-02T16:07:27Z)
Towards Theoretical Understandings of Self-Consuming Generative Models [56.84592466204185]
This paper tackles the emerging challenge of training generative models within a self-consuming loop. We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models. We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed data training on error propagation.
arXiv Detail & Related papers (2024-02-19T02:08:09Z)
ExtremeCast: Boosting Extreme Value Prediction for Global Weather Forecast [57.6987191099507]
We introduce Exloss, a novel loss function that performs asymmetric optimization and highlights extreme values to obtain accurate extreme weather forecasts. We also introduce ExBooster, which captures the uncertainty in prediction outcomes by employing multiple random samples. Our solution can achieve state-of-the-art performance in extreme weather prediction, while maintaining the overall forecast accuracy comparable to the top medium-range forecast models.
arXiv Detail & Related papers (2024-02-02T10:34:13Z)
Residual Corrective Diffusion Modeling for Km-scale Atmospheric Downscaling [58.456404022536425]
State of the art for physical hazard prediction from weather and climate requires expensive km-scale numerical simulations driven by coarser resolution global inputs. Here, a generative diffusion architecture is explored for downscaling such global inputs to km-scale, as a cost-effective machine learning alternative. The model is trained to predict 2km data from a regional weather model over Taiwan, conditioned on a 25km global reanalysis.
arXiv Detail & Related papers (2023-09-24T19:57:22Z)
An Asymmetric Loss with Anomaly Detection LSTM Framework for Power Consumption Prediction [1.6156983514505385]
Power consumption patterns of the residential sector contain fluctuations and anomalies making them challenging to predict. We propose multiple Long Short-Term Memory (LSTM) frameworks with different asymmetric loss functions to impose a higher penalty on underpredictions. Considering the effect of weather and social factors, seasonality splitting is performed on the three considered datasets from France, Germany, and Hungary.
arXiv Detail & Related papers (2023-02-05T17:16:15Z)
DeepVol: Volatility Forecasting from High-Frequency Data with Dilated Causal Convolutions [53.37679435230207]
We propose DeepVol, a model based on Dilated Causal Convolutions that uses high-frequency data to forecast day-ahead volatility. Our empirical results suggest that the proposed deep learning-based approach effectively learns global features from high-frequency data.
arXiv Detail & Related papers (2022-09-23T16:13:47Z)
Churn Reduction via Distillation [54.5952282395487]
We show an equivalence between training with distillation using the base model as the teacher and training with an explicit constraint on the predictive churn. We then show that distillation performs strongly for low churn training against a number of recent baselines.
arXiv Detail & Related papers (2021-06-04T18:03:31Z)
Modeling Atmospheric Data and Identifying Dynamics: Temporal Data-Driven Modeling of Air Pollutants [2.578242050187029]
We present an empirical approach using data-driven techniques to study air quality in Madrid. We find parsimonious systems of ordinary differential equations that model the concentration of pollutants and their changes over time. Our results show that Akaike's Information Criterion can work well in conjunction with best subset regression to find an equilibrium between sparsity and goodness of fit.
arXiv Detail & Related papers (2020-10-13T16:46:07Z)
