Latent Factorization of Tensors with Threshold Distance Weighted Loss for Traffic Data Estimation
- URL: http://arxiv.org/abs/2506.22441v1
- Date: Wed, 11 Jun 2025 05:36:13 GMT
- Title: Latent Factorization of Tensors with Threshold Distance Weighted Loss for Traffic Data Estimation
- Authors: Lei Yang,
- Abstract summary: In real-world traffic data collection processes, issues such as communication failures often lead to incomplete or corrupted datasets. The latent factorization of tensors (LFT) model has emerged as a widely adopted and effective solution. This paper proposes a threshold distance weighted (TDW) loss-incorporated Latent Factorization of Tensors (TDWLFT) model. The proposed TDWLFT model consistently outperforms state-of-the-art approaches in terms of both accuracy and computational efficiency.
- Score: 4.079031335530995
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Intelligent transportation systems (ITS) rely heavily on complete and high-quality spatiotemporal traffic data to achieve optimal performance. Nevertheless, in real-world traffic data collection processes, issues such as communication failures and sensor malfunctions often lead to incomplete or corrupted datasets, thereby posing significant challenges to the advancement of ITS. Among various methods for imputing missing spatiotemporal traffic data, the latent factorization of tensors (LFT) model has emerged as a widely adopted and effective solution. However, conventional LFT models typically employ the standard L2-norm in their learning objective, which makes them vulnerable to the influence of outliers. To overcome this limitation, this paper proposes a threshold distance weighted (TDW) loss-incorporated Latent Factorization of Tensors (TDWLFT) model. The proposed loss function effectively reduces the model's sensitivity to outliers by assigning differentiated weights to individual samples. Extensive experiments conducted on two traffic speed datasets sourced from diverse urban environments confirm that the proposed TDWLFT model consistently outperforms state-of-the-art approaches in terms of both prediction accuracy and computational efficiency.
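The abstract only states that the TDW loss down-weights samples whose residuals exceed a threshold; the exact weighting scheme is not given. The following is a minimal illustrative sketch of one such threshold-weighted squared loss, where the weight function and the `tau` parameter are assumptions, not the paper's formulation:

```python
import numpy as np

def tdw_loss(pred, target, tau=1.0):
    """Illustrative threshold distance weighted (TDW) loss.

    Residuals with |r| <= tau keep full L2 weight; larger residuals
    are down-weighted proportionally to tau/|r|, so outliers contribute
    less than under a plain L2 objective. The weighting form is a
    hypothetical stand-in for the paper's (unspecified) scheme.
    """
    r = np.asarray(pred, dtype=float) - np.asarray(target, dtype=float)
    d = np.abs(r)
    # np.maximum guards the division; the branch with d <= tau uses weight 1
    w = np.where(d <= tau, 1.0, tau / np.maximum(d, tau))
    return float(np.mean(w * r**2))
```

On a batch containing one large outlier residual, this loss is strictly smaller than the plain mean squared error, which is the qualitative outlier-robustness property the abstract describes.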
Related papers
- Robust Tensor Completion via Gradient Tensor Nuclear L1-L2 Norm for Traffic Data Recovery [14.96194593196997]
We propose a Robust Tensor Completion via Nuclear L1-L2 Norm (RTC-NL) model, which not only exploits both global low-rankness and local consistency without a trade-off parameter, but also effectively handles the dual challenges of missing data and noise in traffic data.
arXiv Detail & Related papers (2025-06-28T02:38:01Z) - Lightweight Task-Oriented Semantic Communication Empowered by Large-Scale AI Models [66.57755931421285]
Large-scale artificial intelligence (LAI) models pose significant challenges for real-time communication scenarios. This paper proposes utilizing knowledge distillation (KD) techniques to extract and condense knowledge from LAI models. We propose a fast distillation method featuring a pre-stored compression mechanism that eliminates the need for repetitive inference.
arXiv Detail & Related papers (2025-06-16T08:42:16Z) - A Semantic-Loss Function Modeling Framework With Task-Oriented Machine Learning Perspectives [26.82506860792313]
The performance of data-driven Earth Observation (EO) applications is heavily influenced by the data collection and transmission processes. Adopting the concepts of Semantic Communication (SC) offers a promising solution by prioritizing the transmission of essential data semantics over raw information. This work proposes a novel data-fitting framework to empirically model the semantic loss using real-world EO datasets and domain-specific insights.
arXiv Detail & Related papers (2025-03-12T23:45:11Z) - A Theoretical Perspective: How to Prevent Model Collapse in Self-consuming Training Loops [55.07063067759609]
High-quality data is essential for training large generative models, yet the vast reservoir of real data available online has become nearly depleted. Models increasingly generate their own data for further training, forming Self-consuming Training Loops (STLs). Some models degrade or even collapse, while others successfully avoid these failures, leaving a significant gap in theoretical understanding.
arXiv Detail & Related papers (2025-02-26T06:18:13Z) - Unveiling the Flaws: Exploring Imperfections in Synthetic Data and Mitigation Strategies for Large Language Models [89.88010750772413]
Synthetic data has been proposed as a solution to address the issue of high-quality data scarcity in the training of large language models (LLMs).
Our work delves into these specific flaws associated with question-answer (Q-A) pairs, a prevalent type of synthetic data, and presents a method based on unlearning techniques to mitigate these flaws.
Our work has yielded key insights into the effective use of synthetic data, aiming to promote more robust and efficient LLM training.
arXiv Detail & Related papers (2024-06-18T08:38:59Z) - Low-rank finetuning for LLMs: A fairness perspective [54.13240282850982]
Low-rank approximation techniques have become the de facto standard for fine-tuning Large Language Models.
This paper investigates the effectiveness of these methods in capturing the shift of fine-tuning datasets from the initial pre-trained data distribution.
We show that low-rank fine-tuning inadvertently preserves undesirable biases and toxic behaviors.
arXiv Detail & Related papers (2024-05-28T20:43:53Z) - An Incomplete Tensor Tucker decomposition based Traffic Speed Prediction Method [0.0]
This work integrates the unique advantages of the proportional-integral-derivative (PID) controller into a Tucker decomposition based LFT model.
Experiments on two major city traffic road speed datasets show that the proposed model achieves significant efficiency gain and highly competitive prediction accuracy.
arXiv Detail & Related papers (2023-04-21T13:59:28Z) - Truncated tensor Schatten p-norm based approach for spatiotemporal traffic data imputation with complicated missing patterns [77.34726150561087]
We introduce four complicated missing patterns, including random missing and three fiber-like missing cases according to the mode-driven fibers.
Despite the nonconvexity of the objective function in our model, we derive the optimal solutions by integrating the alternating direction method of multipliers (ADMM).
arXiv Detail & Related papers (2022-05-19T08:37:56Z) - Physics-Informed Deep Learning for Traffic State Estimation [3.779860024918729]
Traffic state estimation (TSE) reconstructs the traffic variables (e.g., density) on road segments using partially observed data.
This paper introduces a physics-informed deep learning (PIDL) framework to efficiently conduct high-quality TSE with small amounts of observed data.
arXiv Detail & Related papers (2021-01-17T03:28:32Z) - Robust Optimal Transport with Applications in Generative Modeling and Domain Adaptation [120.69747175899421]
Optimal Transport (OT) distances such as Wasserstein have been used in several areas such as GANs and domain adaptation.
We propose a computationally-efficient dual form of the robust OT optimization that is amenable to modern deep learning applications.
Our approach can train state-of-the-art GAN models on noisy datasets corrupted with outlier distributions.
arXiv Detail & Related papers (2020-10-12T17:13:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.