Turning Silver into Gold: Domain Adaptation with Noisy Labels for
Wearable Cardio-Respiratory Fitness Prediction
- URL: http://arxiv.org/abs/2211.10475v1
- Date: Sun, 20 Nov 2022 14:55:48 GMT
- Title: Turning Silver into Gold: Domain Adaptation with Noisy Labels for
Wearable Cardio-Respiratory Fitness Prediction
- Authors: Yu Wu, Dimitris Spathis, Hong Jia, Ignacio Perez-Pozuelo, Tomas I.
Gonzales, Soren Brage, Nicholas Wareham, Cecilia Mascolo
- Abstract summary: We propose UDAMA, a novel model with two key components: Unsupervised Domain Adaptation and Multi-discriminator Adversarial training.
We validate our framework on the challenging task of predicting lab-measured maximal oxygen consumption.
Our experiments show that the proposed framework achieves the best performance of corr = 0.665 $\pm$ 0.04, paving the way for accurate fitness estimation at scale.
- Score: 16.26599832125242
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning models have shown great promise in various healthcare
applications. However, most models are developed and validated on small-scale
datasets, as collecting high-quality (gold-standard) labels for health
applications is often costly and time-consuming. As a result, these models may
suffer from overfitting and not generalize well to unseen data. At the same
time, an extensive amount of data with imprecise labels (silver-standard) is
starting to be generally available, as collected from inexpensive wearables
like accelerometers and electrocardiography sensors. These currently
underutilized datasets and labels can be leveraged to produce more accurate
clinical models. In this work, we propose UDAMA, a novel model with two key
components: Unsupervised Domain Adaptation and Multi-discriminator Adversarial
training, which leverage noisy data from the source domain (the silver-standard
dataset) to improve gold-standard modeling. We validate our framework on the
challenging task of predicting lab-measured maximal oxygen consumption
(VO$_{2}$max), the benchmark metric of cardio-respiratory fitness, using
free-living wearable sensor data from two cohort studies as inputs. Our
experiments show that the proposed framework achieves the best performance of
corr = 0.665 $\pm$ 0.04, paving the way for accurate fitness estimation at
scale.
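The headline figure above is a correlation between predicted and lab-measured VO$_{2}$max. As a minimal sketch of how such a number can be computed, the snippet below implements Pearson correlation and a mean/spread summary in plain Python. It assumes that "corr" means Pearson's r and that the $\pm$ value is a standard deviation over repeated runs; the abstract states neither. The function names `pearson_corr` and `mean_and_std` are hypothetical and do not come from the paper.

```python
import math
from statistics import mean


def pearson_corr(y_true, y_pred):
    """Pearson correlation between measured and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two equally sized sequences with >= 2 items")
    mean_true, mean_pred = mean(y_true), mean(y_pred)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((t - mean_true) * (p - mean_pred) for t, p in zip(y_true, y_pred))
    var_true = sum((t - mean_true) ** 2 for t in y_true)
    var_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    if var_true == 0 or var_pred == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_true * var_pred)


def mean_and_std(values):
    """Mean and population standard deviation of per-run scores."""
    m = mean(values)
    std = math.sqrt(sum((v - m) ** 2 for v in values) / len(values))
    return m, std
```

With per-run correlations collected in a list, `mean_and_std(run_scores)` yields a pair that could be reported in the "mean $\pm$ spread" form used above.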
Related papers
- How Can We Tame the Long-Tail of Chest X-ray Datasets? [0.0]
Chest X-rays (CXRs) are a medical imaging modality that is used to infer a large number of abnormalities.
A few of them are commonly observed and abundantly represented in CXR datasets.
It is challenging for current models to learn independent discriminatory features for labels that are rare but may be of high significance.
arXiv Detail & Related papers (2023-09-08T12:28:40Z)
- UDAMA: Unsupervised Domain Adaptation through Multi-discriminator
Adversarial Training with Noisy Labels Improves Cardio-fitness Prediction [16.26599832125242]
We introduce UDAMA, a method with two key components: Unsupervised Domain Adaptation and Multi-discriminator Adversarial Training.
In particular, we showcase the practical potential of UDAMA by applying it to cardio-respiratory fitness (CRF) prediction.
Our results show promising performance by alleviating distribution shifts in various label shift settings.
arXiv Detail & Related papers (2023-07-31T13:31:53Z)
- Machine Learning Force Fields with Data Cost Aware Training [94.78998399180519]
Machine learning force fields (MLFF) have been proposed to accelerate molecular dynamics (MD) simulation.
Even for the most data-efficient MLFFs, reaching chemical accuracy can require hundreds of frames of force and energy labels.
We propose a multi-stage computational framework -- ASTEROID, which lowers the data cost of MLFFs by leveraging a combination of cheap inaccurate data and expensive accurate data.
arXiv Detail & Related papers (2023-06-05T04:34:54Z)
- Label-Retrieval-Augmented Diffusion Models for Learning from Noisy
Labels [61.97359362447732]
Learning from noisy labels is an important and long-standing problem in machine learning for real applications.
In this paper, we reformulate the label-noise problem from a generative-model perspective.
Our model achieves new state-of-the-art (SOTA) results on all the standard real-world benchmark datasets.
arXiv Detail & Related papers (2023-05-31T03:01:36Z)
- DAGAD: Data Augmentation for Graph Anomaly Detection [57.92471847260541]
This paper devises a novel Data Augmentation-based Graph Anomaly Detection (DAGAD) framework for attributed graphs.
A series of experiments on three datasets shows that DAGAD outperforms ten state-of-the-art baseline detectors on various commonly used metrics.
arXiv Detail & Related papers (2022-10-18T11:28:21Z)
- Generalizing electrocardiogram delineation: training convolutional
neural networks with synthetic data augmentation [63.51064808536065]
Existing databases for ECG delineation are small, being insufficient in size and in the array of pathological conditions they represent.
This article has two main contributions. First, a pseudo-synthetic data generation algorithm was developed, based on probabilistically composing ECG traces given "pools" of fundamental segments, as cropped from the original databases, and a set of rules for their arrangement into coherent synthetic traces.
Second, two novel segmentation-based loss functions have been developed, which attempt to enforce the prediction of an exact number of independent structures and to produce closer segmentation boundaries by focusing on a reduced number of samples.
arXiv Detail & Related papers (2021-11-25T10:11:41Z)
- Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for
Thoracic Disease Identification [83.6017225363714]
Deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance.
For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming.
In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
arXiv Detail & Related papers (2021-02-26T02:29:30Z)
- Personalized Step Counting Using Wearable Sensors: A Domain Adapted LSTM
Network Approach [0.0]
The tri-axial accelerometers inside physical activity (PA) monitors can be exploited to improve step count accuracy across devices and individuals.
Open-source raw sensor data was used to construct a long short term memory (LSTM) deep neural network to model step count.
A small amount of subject-specific data was domain adapted to produce personalized models with high individualized step count accuracy.
arXiv Detail & Related papers (2020-12-11T19:52:43Z)
- Unsupervised Pre-trained Models from Healthy ADLs Improve Parkinson's
Disease Classification of Gait Patterns [3.5939555573102857]
We show how to extract features relevant to accelerometer gait data for Parkinson's disease classification.
Our pre-trained source model consists of a convolutional autoencoder, and the target classification model is a simple multi-layer perceptron model.
We explore two different pre-trained source models, trained using different activity groups, and analyze the influence the choice of pre-trained model has over the task of Parkinson's disease classification.
arXiv Detail & Related papers (2020-05-06T04:08:19Z)
- Self-Training with Improved Regularization for Sample-Efficient Chest
X-Ray Classification [80.00316465793702]
We present a deep learning framework that enables robust modeling in challenging scenarios.
Our results show that, using 85% less labeled data, we can build predictive models that match the performance of classifiers trained in a large-scale data setting.
arXiv Detail & Related papers (2020-05-03T02:36:00Z)
- Teacher-Student Domain Adaptation for Biosensor Models [0.0]
We present an approach to domain adaptation, addressing the case where data from the source domain is abundant, labelled data from the target domain is limited or non-existent, and a small amount of paired source-target data is available.
The method is designed for developing deep learning models that detect the presence of medical conditions based on data from consumer-grade portable biosensors.
arXiv Detail & Related papers (2020-03-17T19:09:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.