Water Quality Data Imputation via A Fast Latent Factorization of Tensors with PID-based Optimizer
- URL: http://arxiv.org/abs/2503.06997v1
- Date: Mon, 10 Mar 2025 07:22:54 GMT
- Title: Water Quality Data Imputation via A Fast Latent Factorization of Tensors with PID-based Optimizer
- Authors: Qian Liu, Lan Wang, Bing Yang, Hao Wu
- Abstract summary: There are numerous missing values in water quality data due to sensor failure. A Latent Factorization of Tensors (LFT) with Stochastic Gradient Descent (SGD) proves to be an efficient imputation method, but a standard SGD-based LFT model suffers from slow convergence. This paper proposes a Fast Latent Factorization of Tensors (FLFT) model to tackle this issue.
- Score: 21.261626027956737
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Water quality data can supply substantial decision support for water resource utilization and pollution prevention. However, water quality data contain numerous missing values due to unavoidable factors such as sensor failure, which leads to biased results in hydrological analysis and undermines accurate support for environmental governance decisions. A Latent Factorization of Tensors (LFT) model with Stochastic Gradient Descent (SGD) has proven to be an efficient imputation method. However, a standard SGD-based LFT model commonly suffers from slow convergence, which impairs its efficiency. To tackle this issue, this paper proposes a Fast Latent Factorization of Tensors (FLFT) model. It builds an adjusted instance error into SGD by leveraging a nonlinear PID controller that incorporates the past, current, and future information of the prediction error to improve the convergence rate. Compared with state-of-the-art models on real-world datasets, the experimental results indicate that the FLFT model achieves a better convergence rate and higher accuracy.
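The idea in the abstract, replacing the raw instance error in each SGD step with a PID-adjusted error, can be illustrated with a minimal sketch. This is not the paper's FLFT implementation. The abstract does not give the nonlinear gain functions, so a plain linear PID is swapped in here. The CP-style three-mode factorization, the function names `pid_lft_impute` and `predict`, and all hyperparameter values are likewise assumptions made for illustration only.

```python
import numpy as np

def pid_lft_impute(entries, shape, rank=3, lr=0.01, reg=0.01,
                   kp=1.0, ki=0.001, kd=0.1, epochs=100, seed=0):
    """Fit a CP-style latent factorization to observed tensor cells.

    entries: list of (i, j, k, value) tuples for the observed cells of a
    3-way tensor (e.g. station x indicator x time). All names and default
    values are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    # One latent factor matrix per tensor mode.
    A = rng.uniform(0.2, 0.8, (shape[0], rank))
    B = rng.uniform(0.2, 0.8, (shape[1], rank))
    C = rng.uniform(0.2, 0.8, (shape[2], rank))
    integral = np.zeros(len(entries))  # accumulated (past) error per cell
    prev_err = np.zeros(len(entries))  # previous-epoch error per cell
    history = []
    for _ in range(epochs):
        sq_sum = 0.0
        for n in rng.permutation(len(entries)):
            i, j, k, y = entries[n]
            a, b, c = A[i].copy(), B[j].copy(), C[k].copy()
            err = y - np.sum(a * b * c)  # current instance error
            integral[n] += err
            # Linear PID stand-in: proportional (current), integral (past),
            # and derivative (trend) terms replace the raw error.
            adj = kp * err + ki * integral[n] + kd * (err - prev_err[n])
            prev_err[n] = err
            # Regularized SGD step driven by the adjusted error.
            A[i] += lr * (adj * b * c - reg * a)
            B[j] += lr * (adj * a * c - reg * b)
            C[k] += lr * (adj * a * b - reg * c)
            sq_sum += err ** 2
        history.append(float(np.sqrt(sq_sum / len(entries))))
    return (A, B, C), history

def predict(factors, i, j, k):
    """Impute one cell from the learned factors."""
    A, B, C = factors
    return float(np.sum(A[i] * B[j] * C[k]))
```

Calling `pid_lft_impute(cells, shape)` returns the factor matrices and a per-epoch training RMSE trace; `predict` then fills any missing cell. Setting `ki=0` and `kd=0` with `kp=1` reduces the update to standard SGD, which gives a baseline for comparing convergence.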
Related papers
- A Causal Convolutional Low-rank Representation Model for Imputation of Water Quality Data [11.584987653534531]
This paper proposes a Causal convolutional Low-rank Representation (CLR) model for imputing missing WQD to improve the completeness of the WQD.
Experimental studies on three real-world water quality datasets demonstrate that the proposed CLR model is superior to some of the existing state-of-the-art imputation models.
arXiv Detail & Related papers (2025-04-21T16:27:16Z)
- Latent Tensor Factorization with Nonlinear PID Control for Missing Data Recovery in Non-Intrusive Load Monitoring [2.94258758663678]
Non-Intrusive Load Monitoring (NILM) has emerged as a key smart grid technology.
This paper proposes a Nonlinear Proportional-integral-derivative (PID)-Incorporated Latent factorization of tensors (NPIL) model with two-fold ideas.
Experimental results on real-world NILM datasets demonstrate that the proposed NPIL model surpasses state-of-the-art models in convergence rate and accuracy when predicting the missing NILM data.
arXiv Detail & Related papers (2025-04-18T05:48:14Z)
- Update hydrological states or meteorological forcings? Comparing data assimilation methods for differentiable hydrologic models [0.923607423080658]
Data assimilation (DA) enables hydrologic models to update their internal states using near-real-time observations for more accurate forecasts.
We developed variational DA methods for differentiable models, including optimizing adjusters for just precipitation data.
Our DA framework does not need systematic training data and could serve as a practical DA scheme for whole river networks.
arXiv Detail & Related papers (2025-02-23T05:08:05Z)
- Gen-DFL: Decision-Focused Generative Learning for Robust Decision Making [48.62706690668867]
Decision-focused generative learning (Gen-DFL) is a novel framework that leverages generative models to adaptively model uncertainty and improve decision quality.
The paper shows, theoretically, that Gen-DFL achieves improved worst-case performance bounds compared to traditional DFL.
arXiv Detail & Related papers (2025-02-08T06:52:11Z)
- Enhancing Training Data Attribution for Large Language Models with Fitting Error Consideration [74.09687562334682]
We introduce a novel training data attribution method called Debias and Denoise Attribution (DDA).
Our method significantly outperforms existing approaches, achieving an averaged AUC of 91.64%.
DDA exhibits strong generality and scalability across various sources and different-scale models like LLaMA2, QWEN2, and Mistral.
arXiv Detail & Related papers (2024-10-02T07:14:26Z)
- Causal Deciphering and Inpainting in Spatio-Temporal Dynamics via Diffusion Model [45.45700202300292]
CaPaint aims to identify causal regions in data and endow the model with causal reasoning ability in a two-stage process.
By using a fine-tuned unconditional Denoising Diffusion Probabilistic Model (DDPM) as the generative prior, we in-fill the masks defined as environmental parts.
Experiments conducted on five real-world ST benchmarks demonstrate that integrating the CaPaint concept allows models to achieve improvements ranging from 4.3% to 77.3%.
arXiv Detail & Related papers (2024-09-29T08:18:50Z)
- SciFlow: Empowering Lightweight Optical Flow Models with Self-Cleaning Iterations [44.92134227376008]
This paper introduces two synergistic techniques, Self-Cleaning Iteration (SCI) and Regression Focal Loss (RFL).
SCI and RFL prove particularly effective in mitigating error propagation, a prevalent issue in optical flow models that employ iterative refinement.
The effectiveness of our proposed SCI and RFL techniques, collectively referred to as SciFlow for brevity, is demonstrated across two distinct lightweight optical flow model architectures in our experiments.
arXiv Detail & Related papers (2024-04-11T21:41:55Z)
- EdgeFD: An Edge-Friendly Drift-Aware Fault Diagnosis System for Industrial IoT [0.0]
We propose the Drift-Aware Weight Consolidation (DAWC) to mitigate the challenges posed by frequent data drift in the industrial Internet of Things (IIoT).
DAWC efficiently manages multiple data drift scenarios, minimizing the need for constant model fine-tuning on edge devices.
We have also developed a comprehensive diagnosis and visualization platform.
arXiv Detail & Related papers (2023-10-07T06:48:07Z)
- An Incomplete Tensor Tucker decomposition based Traffic Speed Prediction Method [0.0]
This work integrates the unique advantages of the proportional-integral-derivative (PID) controller into a Tucker decomposition based LFT model.
Experiments on two major city traffic road speed datasets show that the proposed model achieves significant efficiency gain and highly competitive prediction accuracy.
arXiv Detail & Related papers (2023-04-21T13:59:28Z)
- Towards Long-Term Time-Series Forecasting: Feature, Pattern, and Distribution [57.71199089609161]
Long-term time-series forecasting (LTTF) has become a pressing demand in many applications, such as wind power supply planning.
Transformer models have been adopted to deliver high prediction capacity because of the high computational self-attention mechanism.
We propose an efficient Transformer-based model, named Conformer, which differentiates itself from existing methods for LTTF in three aspects.
arXiv Detail & Related papers (2023-01-05T13:59:29Z)
- A Nonlinear PID-Enhanced Adaptive Latent Factor Analysis Model [6.2303427193075755]
High-dimensional and incomplete (HDI) data holds tremendous interactive information in various industrial applications.
A latent factor (LF) model is remarkably effective in extracting valuable information from HDI data with a stochastic gradient descent (SGD) algorithm.
An SGD-based LFA model suffers from slow convergence since it only considers the current learning error.
arXiv Detail & Related papers (2022-08-04T07:48:19Z)
- Evaluating the Adversarial Robustness for Fourier Neural Operators [78.36413169647408]
Fourier Neural Operator (FNO) was the first to simulate turbulent flow with zero-shot super-resolution.
We generate adversarial examples for FNO based on norm-bounded data input perturbations.
Our results show that the model's robustness degrades rapidly with increasing perturbation levels.
arXiv Detail & Related papers (2022-04-08T19:19:42Z)
- Detection of Anomalies in a Time Series Data using InfluxDB and Python [0.0]
This paper demonstrates data cleaning and preparation for time-series data.
It further proposes cost-sensitive machine learning algorithms as a solution to detect anomalous data points in time-series data.
arXiv Detail & Related papers (2020-12-15T17:27:39Z)