In-Bed Human Pose Estimation from Unseen and Privacy-Preserving Image
Domains
- URL: http://arxiv.org/abs/2111.15124v1
- Date: Tue, 30 Nov 2021 04:56:16 GMT
- Title: In-Bed Human Pose Estimation from Unseen and Privacy-Preserving Image
Domains
- Authors: Ting Cao, Mohammad Ali Armin, Simon Denman, Lars Petersson, David
Ahmedt-Aristizabal
- Abstract summary: In-bed human posture estimation provides important health-related metrics with potential value in medical condition assessments.
We propose a multi-modal conditional variational autoencoder (MC-VAE) capable of reconstructing features from missing modalities seen during training.
We demonstrate that body positions can be effectively recognized from the available modality, achieving on par results with baseline models.
- Score: 22.92165116962952
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Medical applications have benefited from the rapid advancement in computer
vision. For patient monitoring in particular, in-bed human posture estimation
provides important health-related metrics with potential value in medical
condition assessments. Despite great progress in this domain, it remains a
challenging task due to substantial ambiguity during occlusions, and the lack
of large corpora of manually labeled data for model training, particularly with
domains such as thermal infrared imaging which are privacy-preserving, and thus
of great interest. Motivated by the effectiveness of self-supervised methods in
learning features directly from data, we propose a multi-modal conditional
variational autoencoder (MC-VAE) capable of reconstructing features from
missing modalities seen during training. This approach is used with HRNet to
enable single modality inference for in-bed pose estimation. Through extensive
evaluations, we demonstrate that body positions can be effectively recognized
from the available modality, achieving on par results with baseline models that
are highly dependent on having access to multiple modes at inference time. The
proposed framework supports future research towards self-supervised learning
that generates a robust model from a single source, which is expected to
generalize over many unknown distributions in clinical environments.
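The core idea in the abstract, a conditional decoder that reconstructs features of a modality missing at inference time given the available one, can be illustrated with a minimal numpy sketch. All dimensions, names, and the linear encoder/decoder below are illustrative assumptions; the paper's actual model operates on HRNet features with a trained MC-VAE.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed feature sizes (illustrative, not taken from the paper).
D_RGB, D_THERMAL, D_LATENT = 64, 64, 16

# Random weights stand in for a trained encoder/decoder.
W_enc = rng.standard_normal((D_RGB + D_THERMAL, 2 * D_LATENT)) * 0.1
W_dec = rng.standard_normal((D_LATENT + D_THERMAL, D_RGB)) * 0.1

def encode(rgb_feat, thermal_feat):
    """Encoder q(z | rgb, thermal): returns Gaussian mean and log-variance."""
    h = np.concatenate([rgb_feat, thermal_feat]) @ W_enc
    return h[:D_LATENT], h[D_LATENT:]

def decode(z, thermal_feat):
    """Conditional decoder p(rgb | z, thermal): reconstructs the missing modality."""
    return np.concatenate([z, thermal_feat]) @ W_dec

# Training time: both modalities are available, so the encoder can be used
# with the usual reparameterization trick.
thermal = rng.standard_normal(D_THERMAL)
rgb = rng.standard_normal(D_RGB)
mu, logvar = encode(rgb, thermal)
z_train = mu + np.exp(0.5 * logvar) * rng.standard_normal(D_LATENT)

# Inference time: only the privacy-preserving (thermal) modality is available.
# Sample z from the prior and reconstruct a stand-in for the RGB features.
z = rng.standard_normal(D_LATENT)
rgb_reconstructed = decode(z, thermal)

print(rgb_reconstructed.shape)  # (64,)
```

The reconstructed features can then be fed to a downstream pose estimator (HRNet in the paper) in place of the unavailable modality.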
Related papers
- Unsupervised Model Diagnosis [49.36194740479798]
This paper proposes Unsupervised Model Diagnosis (UMO) to produce semantic counterfactual explanations without any user guidance.
Our approach identifies and visualizes changes in semantics, and then matches these changes to attributes from wide-ranging text sources.
arXiv Detail & Related papers (2024-10-08T17:59:03Z)
- Enhancing Apparent Personality Trait Analysis with Cross-Modal Embeddings [0.5461938536945723]
We present a multimodal deep neural network with a Siamese extension for apparent personality trait prediction trained on short video recordings.
Due to the highly centralized target distribution of the analyzed dataset, the changes in the third digit are relevant.
arXiv Detail & Related papers (2024-05-06T20:51:28Z)
- Combating Missing Modalities in Egocentric Videos at Test Time [92.38662956154256]
Real-world applications often face challenges with incomplete modalities due to privacy concerns, efficiency needs, or hardware issues.
We propose a novel approach to address this issue at test time without requiring retraining.
MiDl represents the first self-supervised, online solution for handling missing modalities exclusively at test time.
arXiv Detail & Related papers (2024-04-23T16:01:33Z)
- On the Out of Distribution Robustness of Foundation Models in Medical Image Segmentation [47.95611203419802]
Foundation models for vision and language, pre-trained on extensive sets of natural image and text data, have emerged as a promising approach.
We compare the generalization performance to unseen domains of various pre-trained models after being fine-tuned on the same in-distribution dataset.
We further developed a new Bayesian uncertainty estimation for frozen models and used it as an indicator to characterize the model's performance on out-of-distribution data.
arXiv Detail & Related papers (2023-11-18T14:52:10Z)
- LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets.
We have collected approximately 1.3 million medical images from 55 publicly available datasets.
LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z)
- Safe AI for health and beyond -- Monitoring to transform a health service [51.8524501805308]
We will assess the infrastructure required to monitor the outputs of a machine learning algorithm.
We will present two scenarios with examples of monitoring and updates of models.
arXiv Detail & Related papers (2023-03-02T17:27:45Z)
- On the Robustness of Pretraining and Self-Supervision for a Deep Learning-based Analysis of Diabetic Retinopathy [70.71457102672545]
We compare the impact of different training procedures for diabetic retinopathy grading.
We investigate different aspects such as quantitative performance, statistics of the learned feature representations, interpretability and robustness to image distortions.
Our results indicate that models from ImageNet pretraining report a significant increase in performance, generalization and robustness to image distortions.
arXiv Detail & Related papers (2021-06-25T08:32:45Z)
- Self-Supervised Graph Learning with Hyperbolic Embedding for Temporal Health Event Prediction [13.24834156675212]
We propose a hyperbolic embedding method with information flow to pre-train medical code representations in a hierarchical structure.
We incorporate these pre-trained representations into a graph neural network to detect disease complications.
We present a new hierarchy-enhanced historical prediction proxy task in our self-supervised learning framework to fully utilize EHR data.
arXiv Detail & Related papers (2021-06-09T00:42:44Z)
- Concept-based model explanations for Electronic Health Records [1.6837409766909865]
Testing with Concept Activation Vectors (TCAV) has recently been introduced as a way of providing human-understandable explanations.
We propose an extension of the method to time series data to enable an application of TCAV to sequential predictions in the EHR.
arXiv Detail & Related papers (2020-12-03T22:18:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences.