Optimized Three Deep Learning Models Based-PSO Hyperparameters for
Beijing PM2.5 Prediction
- URL: http://arxiv.org/abs/2306.07296v1
- Date: Sat, 10 Jun 2023 16:06:44 GMT
- Title: Optimized Three Deep Learning Models Based-PSO Hyperparameters for
Beijing PM2.5 Prediction
- Authors: Andri Pranolo, Yingchi Mao, Aji Prasetya Wibawa, Agung Bella Putra
Utama, Felix Andika Dwiyanto
- Abstract summary: This research attempts to optimize the deep learning architectures of long short-term memory (LSTM), convolutional neural network (CNN), and multilayer perceptron (MLP) for forecasting tasks using particle swarm optimization (PSO).
The Beijing PM2.5 dataset was analyzed to measure the performance of the proposed models.
A recommendation for air pollution management could be generated by using these optimized models.
- Score: 1.3649494534428745
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Deep learning is a machine learning approach that produces excellent
performance in various applications, including natural language processing,
image identification, and forecasting. Deep learning network performance
depends on the hyperparameter settings. This research attempts to optimize the
deep learning architectures of long short-term memory (LSTM), convolutional
neural network (CNN), and multilayer perceptron (MLP) for forecasting tasks
using Particle swarm optimization (PSO), a swarm intelligence-based
metaheuristic optimization methodology: Proposed M-1 (PSO-LSTM), M-2 (PSO-CNN),
and M-3 (PSO-MLP). The Beijing PM2.5 dataset was analyzed to measure the
performance of the proposed models. The target variable, PM2.5, was affected by
dew point, pressure, temperature, cumulated wind speed, hours of snow, and
hours of rain. The deep learning network inputs consist of three different
scenarios: daily, weekly, and monthly. The results show that the proposed M-1
with three hidden layers achieves the best RMSE and MAPE compared with the
proposed M-2, M-3, and all baselines. A recommendation for air pollution
management could be generated by using these optimized models.
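To make the pipeline concrete, the sketch below shows a minimal PSO hyperparameter search wrapped around an LSTM forecaster, in the spirit of the proposed M-1 (PSO-LSTM). It is a hedged illustration, not the authors' code: the search ranges, swarm settings, epoch count, and the use of validation RMSE as the fitness function are assumptions, and the velocity/position update is the standard PSO rule.

```python
# Minimal PSO hyperparameter search for an LSTM forecaster (illustrative sketch).
# Assumes X_train/y_train/X_val/y_val already exist with shape
# (samples, timesteps, features); bounds and swarm settings are not from the paper.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(42)

# Each particle's position encodes hyperparameters: [LSTM units, log10(learning rate)].
LOW = np.array([16.0, -4.0])    # assumed lower bounds: 16 units, lr = 1e-4
HIGH = np.array([256.0, -2.0])  # assumed upper bounds: 256 units, lr = 1e-2

def fitness(pos, X_train, y_train, X_val, y_val):
    """Validation RMSE of an LSTM built from one particle's position (lower is better)."""
    units = int(round(pos[0]))
    lr = 10.0 ** pos[1]
    model = tf.keras.Sequential([
        tf.keras.layers.LSTM(units, input_shape=X_train.shape[1:]),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(lr), loss="mse")
    model.fit(X_train, y_train, epochs=5, batch_size=32, verbose=0)
    pred = model.predict(X_val, verbose=0).ravel()
    return float(np.sqrt(np.mean((pred - y_val) ** 2)))

def pso_search(X_train, y_train, X_val, y_val,
               n_particles=10, n_iter=20, w=0.7, c1=1.5, c2=1.5):
    """Standard PSO over the hyperparameter box defined by LOW/HIGH."""
    pos = rng.uniform(LOW, HIGH, size=(n_particles, len(LOW)))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_f = np.array([fitness(p, X_train, y_train, X_val, y_val) for p in pos])
    gbest = pbest[np.argmin(pbest_f)].copy()
    for _ in range(n_iter):
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        # Velocity update: inertia + cognitive (personal best) + social (global best).
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, LOW, HIGH)
        f = np.array([fitness(p, X_train, y_train, X_val, y_val) for p in pos])
        improved = f < pbest_f
        pbest[improved] = pos[improved]
        pbest_f[improved] = f[improved]
        gbest = pbest[np.argmin(pbest_f)].copy()
    return gbest, float(pbest_f.min())
```

The same loop would optimize M-2 (PSO-CNN) or M-3 (PSO-MLP) by swapping the model-construction step inside `fitness`; the PSO machinery itself is unchanged.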
Related papers
- MiniCPM4: Ultra-Efficient LLMs on End Devices [124.73631357883228]
MiniCPM4 is a highly efficient large language model (LLM) designed explicitly for end-side devices.
We achieve this efficiency through systematic innovation in four key dimensions: model architecture, training data, training algorithms, and inference systems.
MiniCPM4 is available in two versions, with 0.5B and 8B parameters, respectively.
arXiv Detail & Related papers (2025-06-09T16:16:50Z)
- Large Language Model Enhanced Particle Swarm Optimization for Hyperparameter Tuning for Deep Learning Models [2.3949320404005436]
Particle Swarm Optimization and Large Language Models (LLMs) have been individually applied in optimization and deep learning.
Our work addresses this gap by integrating LLMs into PSO to reduce model evaluations and improve convergence.
Our method speeds up search-space exploration by substituting underperforming particle placements with the best suggestions (a hypothetical sketch of this replacement step appears after this list).
arXiv Detail & Related papers (2025-04-19T00:54:59Z)
- DeepFEA: Deep Learning for Prediction of Transient Finite Element Analysis Solutions [2.9784611307466187]
Finite Element Analysis (FEA) is a powerful but computationally intensive method for simulating physical phenomena.
Recent advancements in machine learning have led to surrogate models capable of accelerating FEA.
Motivated by this research gap, this study proposes DeepFEA, a deep learning-based framework.
arXiv Detail & Related papers (2024-12-05T12:46:18Z)
- EPS-MoE: Expert Pipeline Scheduler for Cost-Efficient MoE Inference [49.94169109038806]
This paper introduces EPS-MoE, a novel expert pipeline scheduler for MoE.
Our results demonstrate an average 21% improvement in prefill throughput over existing parallel inference methods.
arXiv Detail & Related papers (2024-10-16T05:17:49Z)
- Scaling Laws for Predicting Downstream Performance in LLMs [75.28559015477137]
This work focuses on the pre-training loss as a more-efficient metric for performance estimation.
We extend the power law analytical function to predict domain-specific pre-training loss based on FLOPs across data sources.
We employ a two-layer neural network to model the non-linear relationship between multiple domain-specific losses and downstream performance.
arXiv Detail & Related papers (2024-10-11T04:57:48Z)
- Bypass Back-propagation: Optimization-based Structural Pruning for Large Language Models via Policy Gradient [57.9629676017527]
We propose an optimization-based structural pruning on Large-Language Models.
We learn the pruning masks in a probabilistic space directly by optimizing the loss of the pruned model.
Our method operates for 2.7 hours with around 35GB memory for the 13B models on a single A100 GPU.
arXiv Detail & Related papers (2024-06-15T09:31:03Z)
- Introducing a Deep Neural Network-based Model Predictive Control Framework for Rapid Controller Implementation [41.38091115195305]
This work presents the experimental implementation of a deep neural network (DNN) based nonlinear MPC for Homogeneous Charge Compression Ignition (HCCI) combustion control.
Using the acados software package to enable the real-time implementation of the MPC on an ARM Cortex A72, the optimization calculations are completed within 1.4 ms.
The IMEP trajectory following of the developed controller was excellent, with a root-mean-square error of 0.133 bar, while also observing process constraints.
arXiv Detail & Related papers (2023-10-12T15:03:50Z)
- Deep Learning for Day Forecasts from Sparse Observations [60.041805328514876]
Deep neural networks offer an alternative paradigm for modeling weather conditions.
MetNet-3 learns from both dense and sparse data sensors and makes predictions up to 24 hours ahead for precipitation, wind, temperature and dew point.
MetNet-3 has high temporal and spatial resolution, up to 2 minutes and 1 km respectively, as well as low operational latency.
arXiv Detail & Related papers (2023-06-06T07:07:54Z)
- Improving Urban Flood Prediction using LSTM-DeepLabv3+ and Bayesian Optimization with Spatiotemporal feature fusion [7.790241122137617]
This study presented a CNN-RNN hybrid feature fusion modelling approach for urban flood prediction.
It integrated the strengths of CNNs in processing spatial features and RNNs in analyzing different dimensions of time sequences.
arXiv Detail & Related papers (2023-04-19T22:00:04Z)
- Development, Optimization, and Deployment of Thermal Forward Vision Systems for Advance Vehicular Applications on Edge Devices [0.3058685580689604]
We have proposed a thermal tiny-YOLO multi-class object detection (TTYMOD) system as a smart forward sensing system using an end-to-end YOLO deep learning framework.
The system is trained on a large-scale public thermal dataset as well as a newly gathered, open-sourced dataset comprising more than 35,000 distinct thermal frames.
The efficacy of the thermally tuned nano network is quantified using various metrics, including mean precision, frames-per-second rate, and average inference time.
arXiv Detail & Related papers (2023-01-18T15:45:33Z)
- Multi-objective hyperparameter optimization with performance uncertainty [62.997667081978825]
This paper presents results on multi-objective hyperparameter optimization with uncertainty on the evaluation of Machine Learning algorithms.
We combine the sampling strategy of Tree-structured Parzen Estimators (TPE) with the metamodel obtained after training a Gaussian Process Regression (GPR) with heterogeneous noise.
Experimental results on three analytical test functions and three ML problems show the improvement over multi-objective TPE and GPR.
arXiv Detail & Related papers (2022-09-09T14:58:43Z)
- Bayesian Neural Network Language Modeling for Speech Recognition [59.681758762712754]
State-of-the-art neural network language models (NNLMs), represented by long short-term memory recurrent neural networks (LSTM-RNNs) and Transformers, are becoming highly complex.
In this paper, an overarching full Bayesian learning framework is proposed to account for the underlying uncertainty in LSTM-RNN and Transformer LMs.
arXiv Detail & Related papers (2022-08-28T17:50:19Z)
- CPM-2: Large-scale Cost-effective Pre-trained Language Models [71.59893315671997]
We present a suite of cost-effective techniques for the use of PLMs to deal with the efficiency issues of pre-training, fine-tuning, and inference.
We introduce knowledge inheritance to accelerate the pre-training process by exploiting existing PLMs instead of training models from scratch.
We implement a new inference toolkit, namely InfMoE, for using large-scale PLMs with limited computational resources.
arXiv Detail & Related papers (2021-06-20T15:43:54Z)
- Particle Swarm Optimized Federated Learning For Industrial IoT and Smart City Services [9.693848515371268]
We propose a Particle Swarm Optimization (PSO)-based technique to optimize the hyperparameter settings of the local Machine Learning models.
We evaluate the performance of our proposed technique using two case studies.
arXiv Detail & Related papers (2020-09-05T16:20:47Z)
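The LLM-enhanced PSO entry above replaces a swarm's worst-performing particles with externally suggested positions. The following is a hypothetical sketch of that replacement step layered on a generic PSO loop; `llm_suggest` is a placeholder for an actual prompt-and-parse pipeline, and the replacement fraction is an assumption, so nothing here reproduces that paper's exact method.

```python
# Hypothetical sketch: swap a PSO swarm's stragglers for suggested positions.
import numpy as np

rng = np.random.default_rng(0)

def llm_suggest(gbest, n, low, high):
    # Placeholder for an LLM call: here we simply perturb the global best,
    # standing in for parsed numeric suggestions near promising regions.
    noise = rng.normal(scale=0.05 * (high - low), size=(n, len(low)))
    return np.clip(gbest + noise, low, high)

def replace_stragglers(pos, losses, gbest, low, high, frac=0.2):
    """Replace the worst-scoring fraction of particles with suggested positions."""
    k = max(1, int(frac * len(pos)))
    worst = np.argsort(losses)[-k:]  # highest loss = worst particles (minimization)
    pos = pos.copy()
    pos[worst] = llm_suggest(gbest, k, low, high)
    return pos
```

Called once per PSO iteration after fitness evaluation, this keeps the swarm size fixed while redirecting wasted evaluations toward promising regions, which is the claimed source of the convergence speed-up.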
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.