Related papers: Outlier Detection in Plantar Pressure: Human-Centered Comparison of Statistical Parametric Mapping and Explainable Machine Learning

Outlier Detection in Plantar Pressure: Human-Centered Comparison of Statistical Parametric Mapping and Explainable Machine Learning

URL: http://arxiv.org/abs/2509.21943v2
Date: Mon, 29 Sep 2025 07:37:15 GMT
Title: Outlier Detection in Plantar Pressure: Human-Centered Comparison of Statistical Parametric Mapping and Explainable Machine Learning
Authors: Carlo Dindorf, Jonas Dully, Steven Simon, Dennis Perchthaler, Stephan Becker, Hannah Ehmann, Kjell Heitmann, Bernd Stetter, Christian Diers, Michael Fröhlich,
Abstract summary: Statistical Parametric Mapping (SPM) provides interpretable analyses but is sensitive to alignment and its capacity for robust outlier detection remains unclear.<n>This study compares an SPM approach with an explainable machine learning (ML) approach to establish transparent quality-control pipelines for plantar pressure datasets.
Score: 0.6990756918994179
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Plantar pressure mapping is essential in clinical diagnostics and sports science, yet large heterogeneous datasets often contain outliers from technical errors or procedural inconsistencies. Statistical Parametric Mapping (SPM) provides interpretable analyses but is sensitive to alignment and its capacity for robust outlier detection remains unclear. This study compares an SPM approach with an explainable machine learning (ML) approach to establish transparent quality-control pipelines for plantar pressure datasets. Data from multiple centers were annotated by expert consensus and enriched with synthetic anomalies resulting in 798 valid samples and 2000 outliers. We evaluated (i) a non-parametric, registration-dependent SPM approach and (ii) a convolutional neural network (CNN), explained using SHapley Additive exPlanations (SHAP). Performance was assessed via nested cross-validation; explanation quality via a semantic differential survey with domain experts. The ML model reached high accuracy and outperformed SPM, which misclassified clinically meaningful variations and missed true outliers. Experts perceived both SPM and SHAP explanations as clear, useful, and trustworthy, though SPM was assessed less complex. These findings highlight the complementary potential of SPM and explainable ML as approaches for automated outlier detection in plantar pressure data, and underscore the importance of explainability in translating complex model outputs into interpretable insights that can effectively inform decision-making.

Related papers

Beyond Raw Detection Scores: Markov-Informed Calibration for Boosting Machine-Generated Text Detection [105.14032334647932]
Machine-generated texts (MGTs) pose risks such as disinformation and phishing, highlighting the need for reliable detection.<n> Metric-based methods, which extract statistically distinguishable features of MGTs, are often more practical than complex model-based methods that are prone to overfitting.<n>We propose a Markov-informed score calibration strategy that models two relationships of context detection scores that may aid calibration.
arXiv Detail & Related papers (2026-02-08T16:06:12Z)
UbiQTree: Uncertainty Quantification in XAI with Tree Ensembles [42.37986459997699]
We propose an approach for decomposing uncertainty in SHAP values into aleatoric, epistemic, and entanglement components.<n>We validate the method across three real-world use cases with descriptive statistical analyses.<n>This understanding can guide the development of robust decision-making processes and the refinement of models in high-stakes applications.
arXiv Detail & Related papers (2025-08-13T09:20:33Z)
Clinical NLP with Attention-Based Deep Learning for Multi-Disease Prediction [44.0876796031468]
This paper addresses the challenges posed by the unstructured nature and high-dimensional semantic complexity of electronic health record texts.<n>A deep learning method based on attention mechanisms is proposed to achieve unified modeling for information extraction and multi-label disease prediction.
arXiv Detail & Related papers (2025-07-02T07:45:22Z)
Semi-supervised Clustering Through Representation Learning of Large-scale EHR Data [5.591260685112265]
SCORE is a semi-supervised representation learning framework that captures multi-domain disease profiles through patient embeddings.<n>To handle the computational challenges of large-scale data, it introduces a hybrid Expectation-Maximization (EM) and Gaussian Variational Approximation (GVA) algorithm.<n>Our analysis shows that incorporating unlabeled data enhances accuracy and reduces sensitivity to label scarcity.
arXiv Detail & Related papers (2025-05-27T05:20:17Z)
Hallucination Detection in LLMs with Topological Divergence on Attention Graphs [64.74977204942199]
Hallucination, i.e., generating factually incorrect content, remains a critical challenge for large language models.<n>We introduce TOHA, a TOpology-based HAllucination detector in the RAG setting.
arXiv Detail & Related papers (2025-04-14T10:06:27Z)
SHIP: A Shapelet-based Approach for Interpretable Patient-Ventilator Asynchrony Detection [3.580340090571342]
Patient-ventilator asynchrony (PVA) is a common and critical issue during mechanical ventilation, affecting up to 85% of patients.<n>We propose a shapelet-based approach SHIP for PVA detection, utilizing shapelets - discriminative subsequences in time-series data.<n>Our method addresses dataset imbalances through shapelet-based data augmentation and constructs a shapelet pool to transform the dataset for more effective classification.
arXiv Detail & Related papers (2025-03-09T11:58:03Z)
SIDE: Surrogate Conditional Data Extraction from Diffusion Models [32.18993348942877]
We present textbfSurrogate condItional Data Extraction (SIDE), a framework that constructs data-driven surrogate conditions to enable targeted extraction from any DPM.<n>We show that SIDE can successfully extract training data from so-called safe unconditional models, outperforming baseline attacks even on conditional models.<n>Our work redefines the threat landscape for DPMs, establishing precise conditioning as a fundamental vulnerability and setting a new, stronger benchmark for model privacy evaluation.
arXiv Detail & Related papers (2024-10-03T13:17:06Z)
Machine Learning for ALSFRS-R Score Prediction: Making Sense of the Sensor Data [44.99833362998488]
Amyotrophic Lateral Sclerosis (ALS) is a rapidly progressive neurodegenerative disease that presents individuals with limited treatment options. The present investigation, spearheaded by the iDPP@CLEF 2024 challenge, focuses on utilizing sensor-derived data obtained through an app.
arXiv Detail & Related papers (2024-07-10T19:17:23Z)
Interpretable Causal Inference for Analyzing Wearable, Sensor, and Distributional Data [62.56890808004615]
We develop an interpretable method for distributional data analysis that ensures trustworthy and robust decision-making. We demonstrate ADD MALTS' utility by studying the effectiveness of continuous glucose monitors in mitigating diabetes risks.
arXiv Detail & Related papers (2023-12-17T00:42:42Z)
Unsupervised representation learning with recognition-parametrised probabilistic models [12.865596223775649]
We introduce a new approach to probabilistic unsupervised learning based on the recognition-parametrised model ( RPM) Under the key assumption that observations are conditionally independent given latents, the RPM combines parametric prior observation-conditioned latent distributions with non-parametric observationfactors. The RPM provides a powerful framework to discover meaningful latent structure underlying observational data, a function critical to both animal and artificial intelligence.
arXiv Detail & Related papers (2022-09-13T00:33:21Z)

This list is automatically generated from the titles and abstracts of the papers in this site.