Can We Enhance the Quality of Mobile Crowdsensing Data Without Ground Truth?
- URL: http://arxiv.org/abs/2405.18725v1
- Date: Wed, 29 May 2024 03:16:12 GMT
- Title: Can We Enhance the Quality of Mobile Crowdsensing Data Without Ground Truth?
- Authors: Jiajie Li, Bo Gu, Shimin Gong, Zhou Su, Mohsen Guizani
- Abstract summary: Mobile crowdsensing (MCS) has emerged as a prominent trend across various domains.
This article proposes a prediction- and reputation-based truth discovery framework.
It can separate low-quality data from high-quality data in sensing tasks.
- Score: 45.875832406278214
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Mobile crowdsensing (MCS) has emerged as a prominent trend across various domains. However, ensuring the quality of the sensing data submitted by mobile users (MUs) remains a complex and challenging problem. To address this challenge, an advanced method is required to detect low-quality sensing data and identify malicious MUs that may disrupt the normal operations of an MCS system. Therefore, this article proposes a prediction- and reputation-based truth discovery (PRBTD) framework, which can separate low-quality data from high-quality data in sensing tasks. First, we apply a correlation-focused spatial-temporal transformer network to predict the ground truth of the input sensing data. Then, we extract the sensing errors of the data as features based on the prediction results to calculate the implications among the data. Finally, we design a reputation-based truth discovery (TD) module for identifying low-quality data with their implications. Given sensing data submitted by MUs, PRBTD can eliminate the data with heavy noise and identify malicious MUs with high accuracy. Extensive experimental results demonstrate that PRBTD outperforms the existing methods in terms of identification accuracy and data quality enhancement.
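The three steps in the abstract (predict, derive sensing errors, run reputation-based truth discovery) can be illustrated with a minimal sketch. This is not the PRBTD implementation: the transformer predictor is replaced by a list of predictions supplied by the caller, the implication calculation is omitted, and the function names, the inverse-error reputation rule, and the `ratio` cutoff are all assumptions made for illustration.

```python
from statistics import median


def truth_discovery(reports, predictions, iterations=10, eps=1e-6):
    """Generic iterative truth discovery (illustrative, not PRBTD).

    reports:     {user_id: [reported value per task]}
    predictions: [predicted ground truth per task], assumed to come
                 from some external predictor (hypothetical input).
    """
    n_tasks = len(predictions)
    truths = list(predictions)  # start from the predicted ground truth
    reputation = {}
    for _ in range(iterations):
        # Reputation is the inverse of a user's mean absolute sensing error
        # measured against the current truth estimate.
        for user, values in reports.items():
            err = sum(abs(v - t) for v, t in zip(values, truths)) / n_tasks
            reputation[user] = 1.0 / (err + eps)
        total = sum(reputation.values())
        # The truth estimate is the reputation-weighted average of all reports.
        truths = [
            sum(reputation[u] * reports[u][j] for u in reports) / total
            for j in range(n_tasks)
        ]
    return truths, reputation


def flag_low_quality(reputation, ratio=0.2):
    """Flag users whose reputation is far below the median (assumed rule)."""
    cutoff = ratio * median(reputation.values())
    return sorted(u for u, r in reputation.items() if r < cutoff)


# Toy data: three consistent MUs and one that submits heavy noise.
reports = {
    "mu1": [20.1, 21.0, 19.8],
    "mu2": [19.9, 21.2, 20.1],
    "mu3": [20.0, 20.9, 20.0],
    "mu4": [35.0, 5.0, 40.0],
}
predictions = [20.0, 21.0, 20.0]

truths, reputation = truth_discovery(reports, predictions)
flagged = flag_low_quality(reputation)
print(flagged)
```

On this toy input the noisy user `mu4` receives a reputation far below the others and is the only one flagged. A simple inverse-error rule like this tends to concentrate weight on the single most consistent user, which is one reason practical truth discovery schemes use smoother weighting.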
Related papers
- Surprisingly Fragile: Assessing and Addressing Prompt Instability in Multimodal Foundation Models [1.9001431325800364]
Multimodal foundation models (MFMs) such as OFASys show the potential to unlock analysis of complex data via text prompts alone.
Their performance may suffer in the face of text input that differs even slightly from their training distribution.
This study demonstrates that prompt instability is a major concern for MFMs, leading to a consistent drop in performance across all modalities.
arXiv Detail & Related papers (2024-08-26T19:26:55Z)
- Explainable Differential Privacy-Hyperdimensional Computing for Balancing Privacy and Transparency in Additive Manufacturing Monitoring [5.282482641822561]
Differential Privacy (DP) adds mathematically controlled noise to Machine Learning (ML) models.
This study presents the Differential Privacy-Hyperdimensional Computing (DP-HD) framework to quantify noise effects on accuracy.
Experimental results show DP-HD achieves superior operational efficiency, prediction accuracy, and privacy protection.
arXiv Detail & Related papers (2024-07-09T17:42:26Z)
- Physics-Informed Deep Learning and Partial Transfer Learning for Bearing Fault Diagnosis in the Presence of Highly Missing Data [0.0]
This paper presents the PTPAI method, which uses a physics-informed deep learning-based technique to generate synthetic labeled data.
It addresses imbalanced class problems and partial-set fault diagnosis hurdles.
Experimental outcomes on the CWRU and JNU datasets indicate that the proposed approach effectively addresses these problems.
arXiv Detail & Related papers (2024-06-16T17:36:53Z)
- MAPS: A Noise-Robust Progressive Learning Approach for Source-Free Domain Adaptive Keypoint Detection [76.97324120775475]
Cross-domain keypoint detection methods always require accessing the source data during adaptation.
This paper considers source-free domain adaptive keypoint detection, where only the well-trained source model is provided to the target domain.
arXiv Detail & Related papers (2023-02-09T12:06:08Z)
- Adversarial training with informed data selection [53.19381941131439]
Adversarial training is the most efficient solution to defend the network against these malicious attacks.
This work proposes a data selection strategy to be applied in the mini-batch training.
The simulation results show that a good compromise can be obtained regarding robustness and standard accuracy.
arXiv Detail & Related papers (2023-01-07T12:09:50Z)
- Incorporating Semi-Supervised and Positive-Unlabeled Learning for Boosting Full Reference Image Quality Assessment [73.61888777504377]
Full-reference (FR) image quality assessment (IQA) evaluates the visual quality of a distorted image by measuring its perceptual difference from a pristine-quality reference.
Unlabeled data can be easily collected from an image degradation or restoration process, making it encouraging to exploit unlabeled training data to boost FR-IQA performance.
In this paper, we suggest incorporating semi-supervised and positive-unlabeled (PU) learning to exploit unlabeled data while mitigating the adverse effect of outliers.
arXiv Detail & Related papers (2022-04-19T09:10:06Z)
- Towards Data-Efficient Detection Transformers [77.43470797296906]
We show most detection transformers suffer from significant performance drops on small-size datasets.
We empirically analyze the factors that affect data efficiency, through a step-by-step transition from a data-efficient RCNN variant to the representative DETR.
We introduce a simple yet effective label augmentation method to provide richer supervision and improve data efficiency.
arXiv Detail & Related papers (2022-03-17T17:56:34Z)
- Robust Event Classification Using Imperfect Real-world PMU Data [58.26737360525643]
We study robust event classification using imperfect real-world phasor measurement unit (PMU) data.
We develop a novel machine learning framework for training robust event classifiers.
arXiv Detail & Related papers (2021-10-19T17:41:43Z)
- Data Profiling for Adversarial Training: On the Ruin of Problematic Data [27.11328449349065]
Problems in adversarial training include robustness-accuracy trade-off, robust overfitting, and gradient masking.
We show that these problems share one common cause -- low quality samples in the dataset.
We find that when problematic data is removed, robust overfitting and gradient masking can be largely alleviated.
arXiv Detail & Related papers (2021-02-15T10:17:24Z)
- Deep convolutional generative adversarial networks for traffic data imputation encoding time series as images [7.053891669775769]
We have developed a generative adversarial network (GAN) based traffic sensor data imputation framework (TGAN).
In this study, we have developed a novel time-dependent encoding method called the Gramian Angular Summation Field (GASF).
This study shows that the proposed model can significantly improve the traffic data imputation accuracy in terms of Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) compared to state-of-the-art models on the benchmark dataset.
arXiv Detail & Related papers (2020-05-05T19:14:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.