Quantifying Uncertainty in Machine Learning-Based Pervasive Systems: Application to Human Activity Recognition
- URL: http://arxiv.org/abs/2512.09775v1
- Date: Wed, 10 Dec 2025 15:56:05 GMT
- Title: Quantifying Uncertainty in Machine Learning-Based Pervasive Systems: Application to Human Activity Recognition
- Authors: Vladimir Balditsyn, Philippe Lalanda, German Vega, Stéphanie Chollet
- Abstract summary: We propose to quantify uncertainty in machine learning-based systems. To do so, we adapt and jointly utilize a set of selected techniques to evaluate the relevance of model predictions at runtime. The results presented demonstrate the relevance of the approach, and we discuss in detail the assistance provided to domain experts.
- Score: 0.2740273306918099
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The recent convergence of pervasive computing and machine learning has given rise to numerous services, impacting almost all areas of economic and social activity. However, the use of AI techniques precludes certain standard software development practices, which emphasize rigorous testing to ensure the elimination of all bugs and adherence to well-defined specifications. ML models are trained on numerous high-dimensional examples rather than being manually coded. Consequently, the boundaries of their operating range are uncertain, and they cannot guarantee absolute error-free performance. In this paper, we propose to quantify uncertainty in ML-based systems. To achieve this, we adapt and jointly utilize a set of selected techniques to evaluate the relevance of model predictions at runtime. We apply and evaluate these proposals in the highly heterogeneous and evolving domain of Human Activity Recognition (HAR). The results demonstrate the relevance of the approach, and we discuss in detail the assistance provided to domain experts.
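The abstract does not name the specific techniques combined at runtime. As a point of reference, here is a minimal sketch of one common runtime relevance check, softmax predictive entropy with a hand-set threshold; the six-class HAR setup, the threshold value, and all names are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def predictive_entropy(logits: torch.Tensor) -> torch.Tensor:
    """Shannon entropy (nats) of the softmax distribution, per sample."""
    probs = F.softmax(logits, dim=-1)
    return -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)

def flag_uncertain(logits: torch.Tensor, threshold: float = 1.0) -> torch.Tensor:
    """Boolean mask of predictions whose entropy exceeds the threshold."""
    return predictive_entropy(logits) > threshold

# Toy usage: 4 sensor windows, 6 activity classes (illustrative only).
logits = torch.randn(4, 6)
print(predictive_entropy(logits))
print(flag_uncertain(logits))  # True -> prediction flagged as unreliable
```

Flagged windows could then be withheld from downstream services or escalated to a domain expert, in line with the assistance the abstract alludes to.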
Related papers
- Interactive Agents to Overcome Ambiguity in Software Engineering [61.40183840499932]
AI agents are increasingly deployed to automate tasks, often based on ambiguous and underspecified user instructions. Making unwarranted assumptions and failing to ask clarifying questions can lead to suboptimal outcomes. We study the ability of LLM agents to handle ambiguous instructions in interactive code-generation settings by evaluating the performance of proprietary and open-weight models.
arXiv Detail & Related papers (2025-02-18T17:12:26Z)
- No Free Delivery Service: Epistemic limits of passive data collection in complex social systems [11.990018360916553]
I show that, for widely considered inference settings in complex social systems, the train-test paradigm not only lacks justification but is in fact invalid for any risk estimator.
I illustrate these results via the widely used MovieLens benchmark and conclude by discussing their implications for AI in social systems.
arXiv Detail & Related papers (2024-11-20T19:01:03Z)
- Towards Effective Evaluations and Comparisons for LLM Unlearning Methods [97.2995389188179]
This paper seeks to refine the evaluation of machine unlearning for large language models. It addresses two key challenges: the robustness of evaluation metrics and the trade-offs between competing goals.
arXiv Detail & Related papers (2024-06-13T14:41:00Z)
- Expectation Alignment: Handling Reward Misspecification in the Presence of Expectation Mismatch [19.03141646688652]
We use the theory of mind, i.e., the human user's beliefs about the AI agent, as a basis to develop a formal explanatory framework.
We propose a new interactive algorithm that uses the specified reward to infer potential user expectations.
arXiv Detail & Related papers (2024-04-12T19:43:37Z)
- Using Uncertainty Quantification to Characterize and Improve Out-of-Domain Learning for PDEs [44.890946409769924]
Neural Operators (NOs) have emerged as particularly promising for learning the solution operators of PDEs.
We show that ensembling several NOs can identify high-error regions and provide good uncertainty estimates.
We then introduce Operator-ProbConserv, a method that uses these well-calibrated UQ estimates within the ProbConserv framework to update the model.
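The Operator-ProbConserv update itself is not reproduced here; the sketch below covers only the ensembling step the summary describes, with the model list, the call signature, and the 95th-percentile cut-off as illustrative assumptions.

```python
import torch

def ensemble_uq(models, x: torch.Tensor):
    """Mean prediction and per-point variance across an ensemble of operators.

    Regions of high variance are candidates for the high-error regions
    mentioned in the summary.
    """
    with torch.no_grad():
        preds = torch.stack([m(x) for m in models])  # (n_models, batch, ...)
    return preds.mean(dim=0), preds.var(dim=0)

# Usage with any iterable of trained operators sharing an input signature:
# mean, var = ensemble_uq([no_1, no_2, no_3], x_grid)
# high_error_mask = var > var.quantile(0.95)
```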
arXiv Detail & Related papers (2024-03-15T19:21:27Z)
- Weak Supervision Performance Evaluation via Partial Identification [46.73061437177238]
Programmatic Weak Supervision (PWS) enables supervised model training without direct access to ground truth labels.
We present a novel method to address this challenge by framing model evaluation as a partial identification problem.
Our approach derives reliable bounds on key metrics without requiring labeled data, overcoming core limitations in current weak supervision evaluation techniques.
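The paper's derivation is more involved than this, but a toy sketch conveys the partial-identification idea: when ground truth is only partially observed, a metric such as accuracy is bounded rather than point-identified. All names and data below are illustrative.

```python
import numpy as np

def accuracy_bounds(pred, labels, known_mask):
    """Sharp bounds on accuracy when only some ground-truth labels are known.

    Known points contribute their true correctness; each unknown point
    contributes 0 to the lower bound and 1 to the upper bound.
    """
    n = len(pred)
    correct_known = np.sum((pred == labels) & known_mask)
    n_unknown = np.sum(~known_mask)
    return correct_known / n, (correct_known + n_unknown) / n

# Toy usage: 2 of 5 labels are unknown, so accuracy is only bounded.
pred = np.array([1, 0, 1, 1, 0])
labels = np.array([1, 0, 0, 1, 1])  # entries at unknown positions are ignored
known = np.array([True, True, True, False, False])
print(accuracy_bounds(pred, labels, known))  # (0.4, 0.8)
```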
arXiv Detail & Related papers (2023-12-07T07:15:11Z)
- Truthful Meta-Explanations for Local Interpretability of Machine Learning Models [10.342433824178825]
We present a local meta-explanation technique built on the truthfulness metric, a faithfulness-based metric.
We demonstrate the effectiveness of both the technique and the metric by concretely defining all the concepts involved and through experimentation.
arXiv Detail & Related papers (2022-12-07T08:32:04Z)
- Evaluating Machine Unlearning via Epistemic Uncertainty [78.27542864367821]
This work presents an evaluation of Machine Unlearning algorithms based on uncertainty.
To the best of our knowledge, this is the first definition of a general evaluation of this kind.
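The paper defines its measure formally; as a rough proxy in the same spirit, one can compare a classifier's predictive entropy on the forget set before and after unlearning, expecting it to rise if the data's influence was removed. This sketch is not the paper's exact metric.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def mean_entropy(model, loader) -> float:
    """Average predictive entropy of a classifier over a data loader."""
    total, n = 0.0, 0
    for x, _ in loader:
        probs = F.softmax(model(x), dim=-1)
        ent = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
        total += ent.sum().item()
        n += ent.numel()
    return total / n

# Successful unlearning should raise uncertainty on the forget set:
# gap = mean_entropy(model_after, forget_loader) - mean_entropy(model_before, forget_loader)
```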
arXiv Detail & Related papers (2022-08-23T09:37:31Z)
- Uncertainty-Driven Action Quality Assessment [11.958132175629368]
We propose a novel probabilistic model, named Uncertainty-Driven AQA (UD-AQA), to capture the diversity among multiple judge scores. We generate an uncertainty estimate for each prediction, which is used to re-weight the AQA regression loss. Our method achieves competitive results on three benchmarks, including the Olympic-event datasets MTL-AQA and FineDiving and the surgical-skill dataset JIGSAWS.
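UD-AQA's exact formulation is not given in the summary; a standard uncertainty-weighted regression loss in the same spirit is the heteroscedastic Gaussian negative log-likelihood, where a second output head predicts a per-sample log-variance. The two-headed score model below is an assumption for illustration.

```python
import torch

def uncertainty_weighted_mse(pred: torch.Tensor,
                             log_var: torch.Tensor,
                             target: torch.Tensor) -> torch.Tensor:
    """Residuals are down-weighted where predicted uncertainty is high;
    the log-variance term penalises claiming high uncertainty everywhere."""
    precision = torch.exp(-log_var)
    return (precision * (pred - target) ** 2 + log_var).mean()

# Usage sketch: the score head outputs a score and a log-variance per clip.
# loss = uncertainty_weighted_mse(score, log_var, judge_score)
```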
arXiv Detail & Related papers (2022-07-29T07:21:15Z)
- BayesCap: Bayesian Identity Cap for Calibrated Uncertainty in Frozen Neural Networks [50.15201777970128]
We propose BayesCap that learns a Bayesian identity mapping for the frozen model, allowing uncertainty estimation.
BayesCap is a memory-efficient method that can be trained on a small fraction of the original dataset.
We show the efficacy of our method on a wide variety of tasks with a diverse set of architectures.
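A drastically simplified sketch of the idea follows: a small head is trained to reproduce the frozen model's output (the identity mapping) while also emitting a per-output uncertainty. The Gaussian head, layer sizes, and training loss below are assumptions; the paper itself uses a richer output distribution.

```python
import torch
import torch.nn as nn

class IdentityCap(nn.Module):
    """Post-hoc uncertainty head for a frozen model: reconstructs the
    frozen output and predicts a per-dimension log-variance."""
    def __init__(self, dim: int, hidden: int = 64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, dim)       # reconstructed output
        self.log_var = nn.Linear(hidden, dim)  # predictive uncertainty

    def forward(self, frozen_out: torch.Tensor):
        h = self.body(frozen_out)
        return self.mu(h), self.log_var(h)

# Training sketch (frozen model untouched):
# y = frozen_model(x).detach(); mu, lv = cap(y)
# loss = (torch.exp(-lv) * (mu - y) ** 2 + lv).mean()
```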
arXiv Detail & Related papers (2022-07-14T12:50:09Z)
- Logic Constraints to Feature Importances [17.234442722611803]
"Black box" nature of AI models is often a limit for a reliable application in high-stakes fields like diagnostic techniques, autonomous guide, etc.
Recent works have shown that an adequate level of interpretability could enforce the more general concept of model trustworthiness.
The basic idea of this paper is to exploit the human prior knowledge of the features' importance for a specific task, in order to coherently aid the phase of the model's fitting.
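The paper encodes this prior as logic constraints; a simpler soft-penalty sketch in the same spirit compares gradient-based feature importances against a user-supplied prior vector. The saliency measure and the L1 penalty are assumptions, not the paper's formulation.

```python
import torch

def importance_prior_penalty(model, x: torch.Tensor,
                             prior: torch.Tensor) -> torch.Tensor:
    """Penalise disagreement between gradient-based feature importances
    and a prior importance vector (both L1-normalised)."""
    x = x.clone().requires_grad_(True)
    out = model(x).sum()
    grads, = torch.autograd.grad(out, x, create_graph=True)
    imp = grads.abs().mean(dim=0)               # per-feature saliency
    imp = imp / imp.sum().clamp_min(1e-12)
    prior = prior / prior.sum().clamp_min(1e-12)
    return (imp - prior).abs().sum()

# Training sketch: loss = task_loss + lam * importance_prior_penalty(model, x, prior)
```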
arXiv Detail & Related papers (2021-10-13T09:28:38Z)
- Multi Agent System for Machine Learning Under Uncertainty in Cyber Physical Manufacturing System [78.60415450507706]
Recent advancements in predictive machine learning have led to its application in various manufacturing use cases.
Most research has focused on maximising predictive accuracy without addressing the uncertainty associated with it.
In this paper, we determine the sources of uncertainty in machine learning and establish the success criteria for a machine learning system to function well under uncertainty.
arXiv Detail & Related papers (2021-07-28T10:28:05Z)
- Uncertainty-aware Remaining Useful Life predictor [57.74855412811814]
Remaining Useful Life (RUL) estimation is the problem of inferring how long a certain industrial asset can be expected to operate.
In this work, we consider Deep Gaussian Processes (DGPs) as possible solutions to the aforementioned limitations.
The performance of the algorithms is evaluated on the N-CMAPSS dataset from NASA for aircraft engines.
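As a stand-in for the paper's DGPs, the sketch below fits a single-layer GP with scikit-learn on toy synthetic data (not N-CMAPSS); the point is only that each RUL estimate comes with a standard deviation.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Toy degradation data: sensor summaries -> remaining useful life (cycles).
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(100, 3))
y = 200 * (1 - X[:, 0]) + rng.normal(0, 5, size=100)

gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(X, y)

# Each RUL estimate comes with a standard deviation, i.e. a confidence band.
mean, std = gp.predict(X[:5], return_std=True)
print(np.c_[mean, std])
```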
arXiv Detail & Related papers (2021-04-08T08:50:44Z)
- An Uncertainty-based Human-in-the-loop System for Industrial Tool Wear Analysis [68.8204255655161]
We show that uncertainty measures based on Monte-Carlo dropout in the context of a human-in-the-loop system increase the system's transparency and performance.
A simulation study demonstrates that the uncertainty-based human-in-the-loop system increases performance for different levels of human involvement.
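A minimal sketch of this pattern keeps dropout active at inference, averages several stochastic passes, and routes high-spread samples to the human expert; the sample count and threshold are illustrative assumptions.

```python
import torch

@torch.no_grad()
def mc_dropout_predict(model, x: torch.Tensor, n_samples: int = 30):
    """Average several stochastic forward passes with dropout kept active."""
    model.train()  # enables dropout (note: also affects batch-norm layers)
    probs = torch.stack([torch.softmax(model(x), dim=-1)
                         for _ in range(n_samples)])
    return probs.mean(dim=0), probs.std(dim=0)

def route_to_human(std_probs: torch.Tensor, tau: float = 0.15) -> torch.Tensor:
    """Flag samples whose class-probability spread exceeds the threshold."""
    return std_probs.max(dim=-1).values > tau  # True -> human review
```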
arXiv Detail & Related papers (2020-07-14T15:47:37Z)