Related papers: Can We Enhance the Quality of Mobile Crowdsensing Data Without Ground Truth?

URL: http://arxiv.org/abs/2405.18725v1
Date: Wed, 29 May 2024 03:16:12 GMT
Title: Can We Enhance the Quality of Mobile Crowdsensing Data Without Ground Truth?
Authors: Jiajie Li, Bo Gu, Shimin Gong, Zhou Su, Mohsen Guizani
Abstract summary: Mobile crowdsensing (MCS) has emerged as a prominent trend across various domains. This article proposes a prediction- and reputation-based truth discovery framework. It can separate low-quality data from high-quality data in sensing tasks.
Score: 45.875832406278214
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Mobile crowdsensing (MCS) has emerged as a prominent trend across various domains. However, ensuring the quality of the sensing data submitted by mobile users (MUs) remains a complex and challenging problem. To address this challenge, an advanced method is required to detect low-quality sensing data and identify malicious MUs that may disrupt the normal operations of an MCS system. Therefore, this article proposes a prediction- and reputation-based truth discovery (PRBTD) framework, which can separate low-quality data from high-quality data in sensing tasks. First, we apply a correlation-focused spatial-temporal transformer network to predict the ground truth of the input sensing data. Then, we extract the sensing errors of the data as features based on the prediction results to calculate the implications among the data. Finally, we design a reputation-based truth discovery (TD) module for identifying low-quality data with their implications. Given sensing data submitted by MUs, PRBTD can eliminate the data with heavy noise and identify malicious MUs with high accuracy. Extensive experimental results demonstrate that PRBTD outperforms the existing methods in terms of identification accuracy and data quality enhancement.
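The three steps named in the abstract (predict, extract sensing errors, run reputation-based truth discovery) can be illustrated with a minimal sketch. This is not the authors' PRBTD implementation: the spatial-temporal transformer is swapped for a per-task median across users, the "implications" among data are not modelled, and the function names `to_reputation` and `truth_discovery`, the reputation kernel, and the toy readings are all assumptions made for illustration.

```python
import numpy as np


def to_reputation(dist, eps=1e-8):
    """Map per-user distances to normalised reputations (hypothetical kernel)."""
    scale = np.median(dist) + eps      # scale taken from the typical user
    rep = 1.0 / (1.0 + dist / scale)   # smaller distance gives higher reputation
    return rep / rep.sum()


def truth_discovery(readings, predicted, n_iter=20):
    """readings: (n_users, n_tasks); predicted: (n_tasks,) from any predictor.

    n_iter must be at least 1.
    """
    readings = np.asarray(readings, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    # Sensing-error feature: mean absolute deviation from the predicted truth.
    errors = np.abs(readings - predicted).mean(axis=1)
    reputation = to_reputation(errors)
    for _ in range(n_iter):
        truth = reputation @ readings                  # reputation-weighted estimate
        dist = ((readings - truth) ** 2).mean(axis=1)  # per-user squared deviation
        reputation = to_reputation(dist)
    return truth, reputation


# Toy data (assumed): four honest users with small offsets, one with a large offset.
true_values = np.array([10.0, 20.0, 30.0])
offsets = np.array([0.1, -0.1, 0.2, -0.2, 15.0])
readings = true_values + offsets[:, None]

# Stand-in predictor: per-task median across users, not a transformer.
predicted = np.median(readings, axis=0)

truth, reputation = truth_discovery(readings, predicted)
print("estimated truth:", truth)
print("lowest-reputation user:", int(np.argmin(reputation)))
```

On this toy input the user with the large offset ends up with the lowest reputation, and the weighted estimate stays closer to the true values than a plain average of all readings does.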

Related papers

Benchmarking Fraud Detectors on Private Graph Data [70.4654745317714]
Currently, many types of fraud are managed in part by automated detection algorithms that operate over graphs. We consider the scenario where a data holder wishes to outsource development of fraud detectors to third parties. Third parties submit their fraud detectors to the data holder, who evaluates these algorithms on a private dataset and then publicly communicates the results. We propose a realistic privacy attack on this system that allows an adversary to de-anonymize individuals' data based only on the evaluation results.
arXiv Detail & Related papers (2025-07-30T03:20:15Z)
Surprisingly Fragile: Assessing and Addressing Prompt Instability in Multimodal Foundation Models [1.9001431325800364]
Multimodal foundation models (MFMs) such as OFASys show the potential to unlock analysis of complex data via text prompts alone. Their performance may suffer in the face of text input that differs even slightly from their training distribution. This study demonstrates that prompt instability is a major concern for MFMs, leading to a consistent drop in performance across all modalities.
arXiv Detail & Related papers (2024-08-26T19:26:55Z)
Explainable Differential Privacy-Hyperdimensional Computing for Balancing Privacy and Transparency in Additive Manufacturing Monitoring [5.282482641822561]
Differential Privacy (DP) adds mathematically controlled noise to Machine Learning (ML) models. This study presents the Differential Privacy-Hyperdimensional Computing (DP-HD) framework to quantify noise effects on accuracy. Experimental results show DP-HD achieves superior operational efficiency, prediction accuracy, and privacy protection.
arXiv Detail & Related papers (2024-07-09T17:42:26Z)
Physics-Informed Deep Learning and Partial Transfer Learning for Bearing Fault Diagnosis in the Presence of Highly Missing Data [0.0]
This paper presents the PTPAI method, which uses a physics-informed deep learning-based technique to generate synthetic labeled data. It addresses imbalanced class problems and partial-set fault diagnosis hurdles. Experimental outcomes on the CWRU and JNU datasets indicate that the proposed approach effectively addresses these problems.
arXiv Detail & Related papers (2024-06-16T17:36:53Z)
PUMA: margin-based data pruning [51.12154122266251]
We focus on data pruning, where some training samples are removed based on the distance to the model classification boundary (i.e., margin). We propose PUMA, a new data pruning strategy that computes the margin using DeepFool. We show that PUMA can be used on top of the current state-of-the-art methodology in robustness, and that it significantly improves model performance, unlike existing data pruning strategies.
arXiv Detail & Related papers (2024-05-10T08:02:20Z)
Machine Learning Force Fields with Data Cost Aware Training [94.78998399180519]
Machine learning force fields (MLFF) have been proposed to accelerate molecular dynamics (MD) simulation. Even for the most data-efficient MLFFs, reaching chemical accuracy can require hundreds of frames of force and energy labels. We propose a multi-stage computational framework -- ASTEROID, which lowers the data cost of MLFFs by leveraging a combination of cheap inaccurate data and expensive accurate data.
arXiv Detail & Related papers (2023-06-05T04:34:54Z)
MAPS: A Noise-Robust Progressive Learning Approach for Source-Free Domain Adaptive Keypoint Detection [76.97324120775475]
Cross-domain keypoint detection methods always require accessing the source data during adaptation. This paper considers source-free domain adaptive keypoint detection, where only the well-trained source model is provided to the target domain.
arXiv Detail & Related papers (2023-02-09T12:06:08Z)
Adversarial training with informed data selection [53.19381941131439]
Adversarial training is the most efficient solution to defend the network against these malicious attacks. This work proposes a data selection strategy to be applied in the mini-batch training. The simulation results show that a good compromise can be obtained regarding robustness and standard accuracy.
arXiv Detail & Related papers (2023-01-07T12:09:50Z)
Augmenting Softmax Information for Selective Classification with Out-of-Distribution Data [7.221206118679026]
We show that existing post-hoc methods perform quite differently compared to when evaluated only on OOD detection. We propose a novel method for SCOD, Softmax Information Retaining Combination (SIRC), that augments softmax-based confidence scores with feature-agnostic information. Experiments on a wide variety of ImageNet-scale datasets and convolutional neural network architectures show that SIRC is able to consistently match or outperform the baseline for SCOD.
arXiv Detail & Related papers (2022-07-15T14:39:57Z)
Incorporating Semi-Supervised and Positive-Unlabeled Learning for Boosting Full Reference Image Quality Assessment [73.61888777504377]
Full-reference (FR) image quality assessment (IQA) evaluates the visual quality of a distorted image by measuring its perceptual difference from a pristine-quality reference. Unlabeled data can be easily collected from an image degradation or restoration process, making it encouraging to exploit unlabeled training data to boost FR-IQA performance. In this paper, we suggest incorporating semi-supervised and positive-unlabeled (PU) learning to exploit unlabeled data while mitigating the adverse effect of outliers.
arXiv Detail & Related papers (2022-04-19T09:10:06Z)
Towards Data-Efficient Detection Transformers [77.43470797296906]
We show most detection transformers suffer from significant performance drops on small-size datasets. We empirically analyze the factors that affect data efficiency, through a step-by-step transition from a data-efficient RCNN variant to the representative DETR. We introduce a simple yet effective label augmentation method to provide richer supervision and improve data efficiency.
arXiv Detail & Related papers (2022-03-17T17:56:34Z)
Robust Event Classification Using Imperfect Real-world PMU Data [58.26737360525643]
We study robust event classification using imperfect real-world phasor measurement unit (PMU) data. We develop a novel machine learning framework for training robust event classifiers.
arXiv Detail & Related papers (2021-10-19T17:41:43Z)
Data Profiling for Adversarial Training: On the Ruin of Problematic Data [27.11328449349065]
Problems in adversarial training include robustness-accuracy trade-off, robust overfitting, and gradient masking. We show that these problems share one common cause -- low-quality samples in the dataset. We find that when problematic data is removed, robust overfitting and gradient masking can be largely alleviated.
arXiv Detail & Related papers (2021-02-15T10:17:24Z)
A Unified Plug-and-Play Framework for Effective Data Denoising and Robust Abstention [4.200576272300216]
We propose a unified filtering framework leveraging underlying data density. Our framework can effectively denoise training data and avoid predicting uncertain test data points.
arXiv Detail & Related papers (2020-09-25T04:18:08Z)
Deep convolutional generative adversarial networks for traffic data imputation encoding time series as images [7.053891669775769]
We have developed a generative adversarial network (GAN) based traffic sensor data imputation framework (TGAN). In this study, we have developed a novel time-dependent encoding method called the Gramian Angular Summation Field (GASF). This study shows that the proposed model can significantly improve the traffic data imputation accuracy in terms of Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) compared to state-of-the-art models on the benchmark dataset.
arXiv Detail & Related papers (2020-05-05T19:14:02Z)
On the Role of Dataset Quality and Heterogeneity in Model Confidence [27.657631193015252]
Safety-critical applications require machine learning models that output accurate and calibrated probabilities. Uncalibrated deep networks are known to make over-confident predictions. We study the impact of dataset quality by examining how dataset size and label noise affect model confidence.
arXiv Detail & Related papers (2020-02-23T05:13:12Z)

This list is automatically generated from the titles and abstracts of the papers on this site.