Can We Enhance the Quality of Mobile Crowdsensing Data Without Ground Truth?
- URL: http://arxiv.org/abs/2405.18725v1
- Date: Wed, 29 May 2024 03:16:12 GMT
- Title: Can We Enhance the Quality of Mobile Crowdsensing Data Without Ground Truth?
- Authors: Jiajie Li, Bo Gu, Shimin Gong, Zhou Su, Mohsen Guizani
- Abstract summary: Mobile crowdsensing (MCS) has emerged as a prominent trend across various domains.
This article proposes a prediction- and reputation-based truth discovery framework.
It can separate low-quality data from high-quality data in sensing tasks.
- Score: 45.875832406278214
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Mobile crowdsensing (MCS) has emerged as a prominent trend across various domains. However, ensuring the quality of the sensing data submitted by mobile users (MUs) remains a complex and challenging problem. To address this challenge, an advanced method is required to detect low-quality sensing data and identify malicious MUs that may disrupt the normal operations of an MCS system. Therefore, this article proposes a prediction- and reputation-based truth discovery (PRBTD) framework, which can separate low-quality data from high-quality data in sensing tasks. First, we apply a correlation-focused spatial-temporal transformer network to predict the ground truth of the input sensing data. Then, we extract the sensing errors of the data as features based on the prediction results to calculate the implications among the data. Finally, we design a reputation-based truth discovery (TD) module for identifying low-quality data with their implications. Given sensing data submitted by MUs, PRBTD can eliminate the data with heavy noise and identify malicious MUs with high accuracy. Extensive experimental results demonstrate that PRBTD outperforms the existing methods in terms of identification accuracy and data quality enhancement.
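The three steps named in the abstract (predict a reference value, derive sensing errors from it, then weight data by reputation) can be illustrated with a minimal Python sketch. This is not the PRBTD algorithm: the spatial-temporal transformer is replaced by a given list of predictions, the "implications among the data" are not modeled, and the exponential reputation rule, the `scale` and `threshold` parameters, and all function names are assumptions made for illustration only.

```python
import math


def sensing_errors(reports, predictions):
    """Absolute error of each user's reports against the predicted reference values."""
    return {
        user: [abs(v - p) for v, p in zip(values, predictions)]
        for user, values in reports.items()
    }


def update_reputation(errors, scale=1.0):
    """Map each user's mean error to a reputation in (0, 1].

    Assumed rule: reputation = exp(-mean_error / scale), so larger
    errors give lower reputation.
    """
    return {
        user: math.exp(-(sum(errs) / len(errs)) / scale)
        for user, errs in errors.items()
    }


def weighted_truth(reports, reputation):
    """Reputation-weighted average of the reports for each task."""
    n_tasks = len(next(iter(reports.values())))
    total = sum(reputation.values())
    return [
        sum(reputation[u] * reports[u][t] for u in reports) / total
        for t in range(n_tasks)
    ]


def flag_low_quality(reputation, threshold=0.5):
    """Users whose reputation falls below the (assumed) threshold."""
    return sorted(u for u, r in reputation.items() if r < threshold)


# Hypothetical data: three mobile users, three sensing tasks.
reports = {
    "mu_a": [20.1, 21.0, 19.8],
    "mu_b": [19.9, 20.8, 20.1],
    "mu_c": [35.0, 5.0, 40.0],  # heavy noise
}
predictions = [20.0, 21.0, 20.0]  # stand-in for the predictor's output

reputation = update_reputation(sensing_errors(reports, predictions))
print(flag_low_quality(reputation))
print(weighted_truth(reports, reputation))
```

In this toy setting the noisy user receives a reputation close to zero, so its reports barely influence the weighted estimate. A real system would iterate the reputation update over many sensing rounds rather than compute it once.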
Related papers
- Physics-Informed Deep Learning and Partial Transfer Learning for Bearing Fault Diagnosis in the Presence of Highly Missing Data [0.0]
This paper presents the PTPAI method, which uses a physics-informed deep learning-based technique to generate synthetic labeled data.
It addresses imbalanced class problems and partial-set fault diagnosis hurdles.
Experimental outcomes on the CWRU and JNU datasets indicate that the proposed approach effectively addresses these problems.
arXiv Detail & Related papers (2024-06-16T17:36:53Z) - Representation Learning for Wearable-Based Applications in the Case of Missing Data [20.37256375888501]
Handling multimodal sensor data in real-world environments is still challenging due to low data quality and limited data annotations.
We investigate representation learning for imputing missing wearable data and compare it with state-of-the-art statistical approaches.
Our study provides insights for the design and development of masking-based self-supervised learning tasks.
arXiv Detail & Related papers (2024-01-08T08:21:37Z) - MAPS: A Noise-Robust Progressive Learning Approach for Source-Free Domain Adaptive Keypoint Detection [76.97324120775475]
Cross-domain keypoint detection methods always require accessing the source data during adaptation.
This paper considers source-free domain adaptive keypoint detection, where only the well-trained source model is provided to the target domain.
arXiv Detail & Related papers (2023-02-09T12:06:08Z) - Adversarial training with informed data selection [53.19381941131439]
Adversarial training is the most efficient solution to defend the network against these malicious attacks.
This work proposes a data selection strategy to be applied in the mini-batch training.
The simulation results show that a good compromise can be obtained regarding robustness and standard accuracy.
arXiv Detail & Related papers (2023-01-07T12:09:50Z) - Augmenting Softmax Information for Selective Classification with Out-of-Distribution Data [7.221206118679026]
We show that existing post-hoc methods perform quite differently compared to when evaluated only on OOD detection.
We propose a novel method for SCOD, Softmax Information Retaining Combination (SIRC), that augments softmax-based confidence scores with feature-agnostic information.
Experiments on a wide variety of ImageNet-scale datasets and convolutional neural network architectures show that SIRC is able to consistently match or outperform the baseline for SCOD.
arXiv Detail & Related papers (2022-07-15T14:39:57Z) - Incorporating Semi-Supervised and Positive-Unlabeled Learning for Boosting Full Reference Image Quality Assessment [73.61888777504377]
Full-reference (FR) image quality assessment (IQA) evaluates the visual quality of a distorted image by measuring its perceptual difference from a pristine-quality reference.
Unlabeled data can be easily collected from an image degradation or restoration process, making it encouraging to exploit unlabeled training data to boost FR-IQA performance.
In this paper, we propose incorporating semi-supervised and positive-unlabeled (PU) learning to exploit unlabeled data while mitigating the adverse effect of outliers.
arXiv Detail & Related papers (2022-04-19T09:10:06Z) - Towards Data-Efficient Detection Transformers [77.43470797296906]
We show most detection transformers suffer from significant performance drops on small-size datasets.
We empirically analyze the factors that affect data efficiency, through a step-by-step transition from a data-efficient RCNN variant to the representative DETR.
We introduce a simple yet effective label augmentation method to provide richer supervision and improve data efficiency.
arXiv Detail & Related papers (2022-03-17T17:56:34Z) - Robust Event Classification Using Imperfect Real-world PMU Data [58.26737360525643]
We study robust event classification using imperfect real-world phasor measurement unit (PMU) data.
We develop a novel machine learning framework for training robust event classifiers.
arXiv Detail & Related papers (2021-10-19T17:41:43Z) - Data Profiling for Adversarial Training: On the Ruin of Problematic Data [27.11328449349065]
Problems in adversarial training include robustness-accuracy trade-off, robust overfitting, and gradient masking.
We show that these problems share one common cause -- low quality samples in the dataset.
We find that when problematic data is removed, robust overfitting and gradient masking can be largely alleviated.
arXiv Detail & Related papers (2021-02-15T10:17:24Z) - Deep convolutional generative adversarial networks for traffic data imputation encoding time series as images [7.053891669775769]
We have developed a generative adversarial network (GAN) based traffic sensor data imputation framework (TGAN).
In this study, we have developed a novel time-dependent encoding method called the Gramian Angular Summation Field (GASF).
This study shows that the proposed model can significantly improve the traffic data imputation accuracy in terms of Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) compared to state-of-the-art models on the benchmark dataset.
arXiv Detail & Related papers (2020-05-05T19:14:02Z)
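The Gramian Angular Summation Field mentioned in the last entry can be sketched in a few lines of Python. The version below follows the commonly used definition (rescale the series to [-1, 1], take the arccosine of each value as an angle, and fill the matrix with the cosine of pairwise angle sums); the exact variant used in the TGAN paper may differ, and the function names `rescale` and `gasf` are chosen here for illustration.

```python
import math


def rescale(series):
    """Linearly rescale a series to the interval [-1, 1]."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must contain at least two distinct values")
    return [(2 * x - hi - lo) / (hi - lo) for x in series]


def gasf(series):
    """Gramian Angular Summation Field: G[i][j] = cos(phi_i + phi_j)."""
    scaled = rescale(series)
    # Clamp before acos to guard against floating-point overshoot.
    phi = [math.acos(max(-1.0, min(1.0, x))) for x in scaled]
    return [[math.cos(a + b) for b in phi] for a in phi]


# Hypothetical traffic readings (e.g. speeds at consecutive time steps).
matrix = gasf([0.0, 5.0, 10.0])
for row in matrix:
    print([round(v, 3) for v in row])
```

The result is a symmetric n-by-n matrix for a series of length n, which is what allows a time series to be treated as an image by a convolutional model.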