A highly scalable repository of waveform and vital signs data from
bedside monitoring devices
- URL: http://arxiv.org/abs/2106.03965v1
- Date: Mon, 7 Jun 2021 20:59:58 GMT
- Title: A highly scalable repository of waveform and vital signs data from
bedside monitoring devices
- Authors: Sanjay Malunjkar, Susan Weber, Somalee Datta
- Abstract summary: Machine learning is driving the appetite of the research community for various types of signal data such as patient vitals.
Health care systems are ill suited for massive processing of large volumes of data.
We have developed a solution that siphons off patient vital data on a nightly basis from on-premises bio-medical systems to a cloud storage location as a permanent archive.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The advent of cost-effective cloud computing over the past decade and
the ever-growing accumulation of high-fidelity clinical data in a modern hospital
setting are leading to new opportunities for translational medicine. Machine
learning is driving the appetite of the research community for various types of
signal data such as patient vitals. Health care systems, however, are ill
suited for massive processing of large volumes of data. In addition, due to the
sheer magnitude of the data being collected, it is not feasible to retain all
of the data in health care systems in perpetuity. This gold mine of information
is purged periodically, and invaluable future research opportunities are lost
as a result. We have developed a highly scalable solution that: a) siphons
off patient vital data on a nightly basis from on-premises bio-medical systems
to a cloud storage location as a permanent archive, b) reconstructs the
database in the cloud, c) generates waveforms, alarms and numeric data in a
research-ready format, and d) uploads the processed data to a storage location
in the cloud ready for research.
The data is de-identified and catalogued such that it can be joined with
Electronic Medical Records (EMR) and other ancillary data types such as
electroencephalogram (EEG), radiology, video monitoring, etc. This technique
removes the research burden from health care systems. This highly scalable
solution is used to process high density patient monitoring data aggregated by
the Philips Patient Information Center iX (PIC iX) hospital surveillance system
for archival storage in the Philips Data Warehouse Connect enterprise-level
database. The solution is part of a broader infrastructure that supports a
secure, high-performance clinical data science platform.
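The four steps (a) to (d) described in the abstract can be pictured with a minimal Python sketch. Everything here is an assumption made for illustration: the `VitalSample` fields, the function names, the in-memory `store` dictionary standing in for cloud storage, and the keyed-hash pseudonym. The abstract does not say how the authors implement archiving, database reconstruction, or de-identification, and a keyed hash alone is not a complete de-identification procedure.

```python
import hashlib
import hmac
from dataclasses import dataclass

# Hypothetical key for this illustration only; not taken from the paper.
SECRET_KEY = b"example-key"


@dataclass(frozen=True)
class VitalSample:
    patient_id: str  # identifier as exported by the monitor (assumed field)
    signal: str      # e.g. "HR" or "SpO2" (assumed labels)
    timestamp: int   # seconds since epoch
    value: float


def pseudonymize(patient_id: str) -> str:
    """Map an identifier to a stable pseudonym using HMAC-SHA256."""
    digest = hmac.new(SECRET_KEY, patient_id.encode(), hashlib.sha256)
    return digest.hexdigest()[:16]


def archive(samples, store):
    """(a) Copy the nightly export into the permanent archive."""
    store["raw"] = list(samples)


def reconstruct(store):
    """(b) Rebuild a queryable structure: one time-ordered series per key."""
    table = {}
    for sample in store["raw"]:
        table.setdefault((sample.patient_id, sample.signal), []).append(sample)
    for series in table.values():
        series.sort(key=lambda s: s.timestamp)
    return table


def to_research_format(table):
    """(c) Emit research-ready records that carry only a pseudonym."""
    records = []
    for (patient_id, signal), series in sorted(table.items()):
        records.append({
            "research_id": pseudonymize(patient_id),
            "signal": signal,
            "timestamps": [s.timestamp for s in series],
            "values": [s.value for s in series],
        })
    return records


def publish(records, store):
    """(d) Upload the processed records to the research location."""
    store["research"] = records


def run_nightly(samples, store):
    """Run steps (a) through (d) in order and return the published records."""
    archive(samples, store)
    publish(to_research_format(reconstruct(store)), store)
    return store["research"]
```

Because the pseudonym is stable for a given key, records produced this way could be joined with another data set that was pseudonymized with the same key, which is one way to picture the cataloguing for joins with EMR data that the abstract mentions.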
Related papers
- DAMMI: Daily Activities in a Psychologically Annotated Multi-Modal IoT dataset [10.771838327042609]
The DAMMI dataset is designed to support researchers in the field.
It includes daily activity data of an elderly individual collected via home-installed sensors, smartphone data, and a wristband over 146 days.
The data collection spans significant events such as the COVID-19 pandemic, New Year's holidays, and the religious month of Ramadan.
arXiv Detail & Related papers (2024-10-05T13:26:54Z)
- Clairvoyance: A Pipeline Toolkit for Medical Time Series [95.22483029602921]
Time-series learning is the bread and butter of data-driven clinical decision support.
Clairvoyance proposes a unified, end-to-end, autoML-friendly pipeline that serves as a software toolkit.
Clairvoyance is the first to demonstrate viability of a comprehensive and automatable pipeline for clinical time-series ML.
arXiv Detail & Related papers (2023-10-28T12:08:03Z)
- Building Flexible, Scalable, and Machine Learning-ready Multimodal Oncology Datasets [17.774341783844026]
This work proposes the Multimodal Integration of Oncology Data System (MINDS).
MINDS is a flexible, scalable, and cost-effective metadata framework for efficiently fusing disparate data from public sources.
By harmonizing multimodal data, MINDS aims to empower researchers with greater analytical ability.
arXiv Detail & Related papers (2023-09-30T15:44:39Z)
- Disease Insight through Digital Biomarkers Developed by Remotely Collected Wearables and Smartphone Data [3.9411499615751113]
RADAR-base is a modern remote data collection platform built around Confluent's Apache Kafka.
It supports study design and set-up, and provides active (e.g., PROMs) and passive (e.g., phone sensors, wearable devices, and IoT) remote data collection capabilities.
The platform has successfully collected longitudinal data for various cohorts in a number of disease areas including Multiple Sclerosis, Depression, Epilepsy, ADHD, Alzheimer's disease, Autism and Lung diseases.
arXiv Detail & Related papers (2023-08-03T22:44:48Z)
- When Accuracy Meets Privacy: Two-Stage Federated Transfer Learning Framework in Classification of Medical Images on Limited Data: A COVID-19 Case Study [77.34726150561087]
The COVID-19 pandemic has spread rapidly and caused a shortage of global medical resources.
CNNs have been widely used and validated for analyzing medical images.
arXiv Detail & Related papers (2022-03-24T02:09:41Z)
- A Methodology for a Scalable, Collaborative, and Resource-Efficient Platform to Facilitate Healthcare AI Research [0.0]
We present a system to accelerate data acquisition, dataset development and analysis, and AI model development.
This system can ingest 15,000 patient records per hour, where each record represents thousands of measurements, text notes, and high resolution data.
arXiv Detail & Related papers (2021-12-13T18:39:10Z)
- Label scarcity in biomedicine: Data-rich latent factor discovery enhances phenotype prediction [102.23901690661916]
Low-dimensional embedding spaces can be derived from the UK Biobank population dataset to enhance data-scarce prediction of health indicators, lifestyle and demographic characteristics.
Performance gains from semi-supervised approaches will probably become an important ingredient for various medical data science applications.
arXiv Detail & Related papers (2021-10-12T16:25:50Z)
- Synthetic Data: Opening the data floodgates to enable faster, more directed development of machine learning methods [96.92041573661407]
Many ground-breaking advancements in machine learning can be attributed to the availability of a large volume of rich data.
Many large-scale datasets are highly sensitive, such as healthcare data, and are not widely available to the machine learning community.
Generating synthetic data with privacy guarantees provides one such solution.
arXiv Detail & Related papers (2020-12-08T17:26:10Z)
- Trajectories, bifurcations and pseudotime in large clinical datasets: applications to myocardial infarction and diabetes data [94.37521840642141]
We suggest a semi-supervised methodology for the analysis of large clinical datasets, characterized by mixed data types and missing values.
The methodology is based on application of elastic principal graphs which can address simultaneously the tasks of dimensionality reduction, data visualization, clustering, feature selection and quantifying the geodesic distances (pseudotime) in partially ordered sequences of observations.
arXiv Detail & Related papers (2020-07-07T21:04:55Z)
- Data Mining with Big Data in Intrusion Detection Systems: A Systematic Literature Review [68.15472610671748]
Cloud computing has become a powerful and indispensable technology for complex, high performance and scalable computation.
The rapid rate and volume of data creation have begun to pose significant challenges for data management and security.
The design and deployment of intrusion detection systems (IDS) in the big data setting has, therefore, become a topic of importance.
arXiv Detail & Related papers (2020-05-23T20:57:12Z)
- A new paradigm for accelerating clinical data science at Stanford Medicine [1.3814679165245243]
Stanford Medicine is building a new data platform for our academic research community to do better clinical data science.
Hospitals have a large amount of patient data and researchers have demonstrated the ability to reuse that data and AI approaches.
We are establishing a new secure Big Data platform that aims to reduce time to access and analyze data.
arXiv Detail & Related papers (2020-03-17T16:21:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.