The OxMat dataset: a multimodal resource for the development of AI-driven technologies in maternal and newborn child health
- URL: http://arxiv.org/abs/2404.08024v1
- Date: Thu, 11 Apr 2024 09:52:39 GMT
- Title: The OxMat dataset: a multimodal resource for the development of AI-driven technologies in maternal and newborn child health
- Authors: M. Jaleed Khan, Ioana Duta, Beth Albert, William Cooke, Manu Vatish, Gabriel Davis Jones,
- Abstract summary: This paper introduces the Oxford Maternity (OxMat) dataset, the world's largest curated dataset of cardiotocography (CTGs)
OxMat dataset addresses the critical gap in women's health data by providing over 177,211 unique CTG recordings from 51,036 pregnancies, carefully curated and reviewed since 1991.
The dataset also comprises over 200 antepartum, intrapartum and postpartum clinical variables, ensuring near-complete data for crucial outcomes such as stillbirth and acidaemia.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The rapid advancement of Artificial Intelligence (AI) in healthcare presents a unique opportunity for advancements in obstetric care, particularly through the analysis of cardiotocography (CTG) for fetal monitoring. However, the effectiveness of such technologies depends upon the availability of large, high-quality datasets that are suitable for machine learning. This paper introduces the Oxford Maternity (OxMat) dataset, the world's largest curated dataset of CTGs, featuring raw time series CTG data and extensive clinical data for both mothers and babies, which is ideally placed for machine learning. The OxMat dataset addresses the critical gap in women's health data by providing over 177,211 unique CTG recordings from 51,036 pregnancies, carefully curated and reviewed since 1991. The dataset also comprises over 200 antepartum, intrapartum and postpartum clinical variables, ensuring near-complete data for crucial outcomes such as stillbirth and acidaemia. While this dataset also covers the intrapartum stage, around 94% of the constituent CTGS are antepartum. This allows for a unique focus on the underserved antepartum period, in which early detection of at-risk fetuses can significantly improve health outcomes. Our comprehensive review of existing datasets reveals the limitations of current datasets: primarily, their lack of sufficient volume, detailed clinical data and antepartum data. The OxMat dataset lays a foundation for future AI-driven prenatal care, offering a robust resource for developing and testing algorithms aimed at improving maternal and fetal health outcomes.
Related papers
- Mammo-Clustering:A Weakly Supervised Multi-view Global-Local Context Clustering Network for Detection and Classification in Mammography [13.581151516877238]
We propose a weakly supervised multi-view mammography early screening model for breast cancer based on context clustering.
Our model shows potential in reducing the burden on doctors and increasing the feasibility of breast cancer screening for women in underdeveloped regions.
arXiv Detail & Related papers (2024-09-23T10:17:13Z) - ISLES 2024: The first longitudinal multimodal multi-center real-world dataset in (sub-)acute stroke [2.7919032539697444]
Stroke remains a leading cause of global morbidity and mortality, placing a heavy socioeconomic burden.
To develop machine learning algorithms that can extract meaningful and reproducible models of brain function from stroke images.
Our dataset is the first to offer comprehensive longitudinal stroke data, including acute CT imaging with angiography and perfusion, follow-up MRI at 2-9 days, and acute and longitudinal clinical data up to a three-month outcome.
arXiv Detail & Related papers (2024-08-20T18:59:52Z) - TrialBench: Multi-Modal Artificial Intelligence-Ready Clinical Trial Datasets [57.067409211231244]
This paper presents meticulously curated AIready datasets covering multi-modal data (e.g., drug molecule, disease code, text, categorical/numerical features) and 8 crucial prediction challenges in clinical trial design.
We provide basic validation methods for each task to ensure the datasets' usability and reliability.
We anticipate that the availability of such open-access datasets will catalyze the development of advanced AI approaches for clinical trial design.
arXiv Detail & Related papers (2024-06-30T09:13:10Z) - A multi-institutional pediatric dataset of clinical radiology MRIs by
the Children's Brain Tumor Network [6.562299138758103]
We provide a large-scale pediatric dataset of 23,101 multi-parametric MRI exams acquired through routine care for 1,526 brain tumor patients.
This includes longitudinal MRIs across various cancer diagnoses, with associated patient-level clinical information, digital pathology slides, as well as tissue genotype and omics data.
arXiv Detail & Related papers (2023-10-02T17:59:56Z) - Unveiling the Unborn: Advancing Fetal Health Classification through Machine Learning [0.0]
This research paper presents a novel machine-learning approach for fetal health classification.
The proposed model achieves an impressive accuracy of 98.31% on a test set.
By incorporating multiple data points, our model offers a more holistic and reliable evaluation.
arXiv Detail & Related papers (2023-09-30T22:02:51Z) - Revisiting Computer-Aided Tuberculosis Diagnosis [56.80999479735375]
Tuberculosis (TB) is a major global health threat, causing millions of deaths annually.
Computer-aided tuberculosis diagnosis (CTD) using deep learning has shown promise, but progress is hindered by limited training data.
We establish a large-scale dataset, namely the Tuberculosis X-ray (TBX11K) dataset, which contains 11,200 chest X-ray (CXR) images with corresponding bounding box annotations for TB areas.
This dataset enables the training of sophisticated detectors for high-quality CTD.
arXiv Detail & Related papers (2023-07-06T08:27:48Z) - Bootstrapping Your Own Positive Sample: Contrastive Learning With
Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z) - COVIDx-US -- An open-access benchmark dataset of ultrasound imaging data
for AI-driven COVID-19 analytics [116.6248556979572]
COVIDx-US is an open-access benchmark dataset of COVID-19 related ultrasound imaging data.
It consists of 93 lung ultrasound videos and 10,774 processed images of patients infected with SARS-CoV-2 pneumonia, non-SARS-CoV-2 pneumonia, as well as healthy control cases.
arXiv Detail & Related papers (2021-03-18T03:31:33Z) - Hybrid Attention for Automatic Segmentation of Whole Fetal Head in
Prenatal Ultrasound Volumes [52.53375964591765]
We propose the first fully-automated solution to segment the whole fetal head in US volumes.
The segmentation task is firstly formulated as an end-to-end volumetric mapping under an encoder-decoder deep architecture.
We then combine the segmentor with a proposed hybrid attention scheme (HAS) to select discriminative features and suppress the non-informative volumetric features.
arXiv Detail & Related papers (2020-04-28T14:43:05Z) - VerSe: A Vertebrae Labelling and Segmentation Benchmark for
Multi-detector CT Images [121.31355003451152]
Large Scale Vertebrae Challenge (VerSe) was organised in conjunction with the International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) in 2019 and 2020.
We present the the results of this evaluation and further investigate the performance-variation at vertebra-level, scan-level, and at different fields-of-view.
arXiv Detail & Related papers (2020-01-24T21:09:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.