MVMTnet: A Multi-variate Multi-modal Transformer for Multi-class
Classification of Cardiac Irregularities Using ECG Waveforms and Clinical
Notes
- URL: http://arxiv.org/abs/2302.11021v1
- Date: Tue, 21 Feb 2023 21:38:41 GMT
- Title: MVMTnet: A Multi-variate Multi-modal Transformer for Multi-class
Classification of Cardiac Irregularities Using ECG Waveforms and Clinical
Notes
- Authors: Ankur Samanta, Mark Karlov, Meghna Ravikumar, Christian McIntosh
Clarke, Jayakumar Rajadas, Kaveh Hassani
- Abstract summary: Deep learning can be used to optimize diagnosis and patient monitoring for clinical-based applications.
For cardiovascular disease, a condition for which the rising number of patients increasingly outstrips the availability of medical resources in many parts of the world, a core challenge is the automated classification of various cardiac abnormalities.
The proposed multi-modal Transformer architecture is designed to perform this task accurately while demonstrating the cross-domain effectiveness of Transformers.
- Score: 4.648677931378919
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Deep learning provides an excellent avenue for optimizing diagnosis and
patient monitoring for clinical-based applications, which can critically
enhance the response time to the onset of various conditions. For
cardiovascular disease, a condition for which the rising number of patients
increasingly outstrips the availability of medical resources in many parts of
the world, a core challenge is the automated classification of various
cardiac abnormalities. Existing deep learning approaches have largely been
limited to detecting the existence of an irregularity, as in binary
classification, which has been achieved using networks such as CNNs and
RNN/LSTMs. The next step is to accurately perform multi-class classification
and determine the specific condition(s) from the inherently noisy multi-variate
waveform, which is a difficult task that could benefit from (1) a more powerful
sequential network, and (2) the integration of clinical notes, which provide
valuable semantic and clinical context from human doctors. Recently,
Transformers have emerged as the state-of-the-art architecture for forecasting
and prediction using time-series data, thanks to their multi-headed attention
mechanism and their ability to process whole sequences and learn both long-
and short-range dependencies. The proposed multi-modal Transformer
architecture is designed to perform this task accurately while demonstrating
the cross-domain effectiveness of Transformers, establishing a method for
incorporating multiple data modalities within a Transformer for
classification tasks, and laying the groundwork for automating real-time
patient condition monitoring in clinical and ER settings.
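As a rough illustration of the kind of model the abstract describes, the following is a minimal sketch of a multi-modal Transformer classifier: a 12-lead ECG is tokenized per time step and encoded with multi-head self-attention, then fused by concatenation with a precomputed clinical-note embedding before a multi-class head. The module names, dimensions, pooling, and fusion strategy are illustrative assumptions, not the authors' MVMTnet implementation.

```python
# Minimal sketch (not the authors' released code): a multi-modal Transformer that
# encodes a 12-lead ECG with multi-head self-attention and fuses it with a
# clinical-note embedding for multi-class classification. Names, dimensions, and
# the pooling/fusion choices below are illustrative assumptions.
import torch
import torch.nn as nn


class ECGTransformerEncoder(nn.Module):
    """Projects (batch, time, leads) waveforms to tokens and applies self-attention."""

    def __init__(self, n_leads=12, d_model=128, n_heads=4, n_layers=4):
        super().__init__()
        self.input_proj = nn.Linear(n_leads, d_model)  # one token per time step
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)

    def forward(self, ecg):                      # ecg: (batch, time, n_leads)
        tokens = self.encoder(self.input_proj(ecg))
        return tokens.mean(dim=1)                # mean-pool over time -> (batch, d_model)


class MultiModalClassifier(nn.Module):
    """Concatenates the ECG embedding with a note embedding and predicts the class."""

    def __init__(self, n_classes, note_dim=768, d_model=128):
        super().__init__()
        self.ecg_encoder = ECGTransformerEncoder(d_model=d_model)
        self.note_proj = nn.Linear(note_dim, d_model)  # note_dim assumes a BERT-style note encoder
        self.head = nn.Linear(2 * d_model, n_classes)

    def forward(self, ecg, note_embedding):
        fused = torch.cat([self.ecg_encoder(ecg), self.note_proj(note_embedding)], dim=-1)
        return self.head(fused)                  # raw logits over cardiac-condition classes


# Example: 8 ten-second ECGs sampled at 100 Hz with 12 leads, paired with
# precomputed 768-dim note embeddings.
model = MultiModalClassifier(n_classes=5)
logits = model(torch.randn(8, 1000, 12), torch.randn(8, 768))
print(logits.shape)  # torch.Size([8, 5])
```

In practice the note embedding could come from any pretrained clinical text encoder, and the concatenation step could be replaced by cross-attention between the two token streams.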
Related papers
- PMT: Progressive Mean Teacher via Exploring Temporal Consistency for Semi-Supervised Medical Image Segmentation [51.509573838103854]
We propose a semi-supervised learning framework, termed Progressive Mean Teachers (PMT), for medical image segmentation.
Our PMT generates high-fidelity pseudo labels by learning robust and diverse features in the training process.
Experimental results on two datasets with different modalities, i.e., CT and MRI, demonstrate that our method outperforms the state-of-the-art medical image segmentation approaches.
arXiv Detail & Related papers (2024-09-08T15:02:25Z)
- MedTsLLM: Leveraging LLMs for Multimodal Medical Time Series Analysis [6.30440420617113]
We introduce MedTsLLM, a general multimodal large language model (LLM) framework that integrates time series data and rich contextual information in the form of text to analyze physiological signals.
We perform three tasks with clinical relevance: semantic segmentation, boundary detection, and anomaly detection in time series.
Our model outperforms state-of-the-art baselines, including deep learning models, other LLMs, and clinical methods across multiple medical domains.
arXiv Detail & Related papers (2024-08-14T18:57:05Z)
- Temporal Cross-Attention for Dynamic Embedding and Tokenization of Multimodal Electronic Health Records [1.6609516435725236]
We introduce a dynamic embedding and tokenization framework for precise representation of multimodal clinical time series; a generic sketch of this style of cross-attention fusion appears after the list below.
Our framework outperformed baseline approaches on the task of predicting the occurrence of nine postoperative complications.
arXiv Detail & Related papers (2024-03-06T19:46:44Z)
- Specialty detection in the context of telemedicine in a highly imbalanced multi-class distribution [3.992328888937568]
The study focuses on handling multiclass and highly imbalanced datasets for Arabic medical questions.
The proposed module is deployed in both synchronous and asynchronous medical consultations.
arXiv Detail & Related papers (2024-02-21T06:39:04Z)
- PULASki: Learning inter-rater variability using statistical distances to improve probabilistic segmentation [36.136619420474766]
We propose PULASki, a method for biomedical image segmentation that accurately captures variability in expert annotations.
Our approach makes use of an improved loss function based on statistical distances in a conditional variational autoencoder structure.
Our method can also be applied to a wide range of multi-label segmentation tasks and is useful for downstream tasks such as hemodynamic modelling.
arXiv Detail & Related papers (2023-12-25T10:31:22Z)
- Affine-Consistent Transformer for Multi-Class Cell Nuclei Detection [76.11864242047074]
We propose a novel Affine-Consistent Transformer (AC-Former), which directly yields a sequence of nucleus positions.
We introduce an Adaptive Affine Transformer (AAT) module, which can automatically learn the key spatial transformations to warp original images for local network training.
Experimental results demonstrate that the proposed method significantly outperforms existing state-of-the-art algorithms on various benchmarks.
arXiv Detail & Related papers (2023-10-22T02:27:02Z)
- A Transformer-based representation-learning model with unified processing of multimodal input for clinical diagnostics [63.106382317917344]
We report a Transformer-based representation-learning model as a clinical diagnostic aid that processes multimodal input in a unified manner.
The unified model outperformed an image-only model and non-unified multimodal diagnosis models in the identification of pulmonary diseases.
arXiv Detail & Related papers (2023-06-01T16:23:47Z)
- FormerTime: Hierarchical Multi-Scale Representations for Multivariate Time Series Classification [53.55504611255664]
FormerTime is a hierarchical representation model that improves classification capacity for the multivariate time series classification task.
It exhibits three merits: (1) learning hierarchical multi-scale representations from time series data, (2) inheriting the strengths of both transformers and convolutional networks, and (3) tackling the efficiency challenges incurred by the self-attention mechanism.
arXiv Detail & Related papers (2023-02-20T07:46:14Z)
- DA-VSR: Domain Adaptable Volumetric Super-Resolution For Medical Images [69.63915773870758]
We present a novel algorithm called domain adaptable super-resolution (DA-VSR) to better bridge the domain inconsistency gap.
DA-VSR uses a unified feature extraction backbone and a series of network heads to improve image quality over different planes.
We demonstrate that DA-VSR significantly improves super-resolution quality across numerous datasets of different domains.
arXiv Detail & Related papers (2022-10-11T03:16:35Z)
- TeCNO: Surgical Phase Recognition with Multi-Stage Temporal Convolutional Networks [43.95869213955351]
We propose a Multi-Stage Temporal Convolutional Network (MS-TCN) that performs hierarchical prediction refinement for surgical phase recognition.
Our method is thoroughly evaluated on two datasets of laparoscopic cholecystectomy videos with and without the use of additional surgical tool information.
arXiv Detail & Related papers (2020-03-24T10:12:30Z)
- MS-Net: Multi-Site Network for Improving Prostate Segmentation with Heterogeneous MRI Data [75.73881040581767]
We propose a novel multi-site network (MS-Net) for improving prostate segmentation by learning robust representations.
Our MS-Net improves the performance across all datasets consistently, and outperforms state-of-the-art methods for multi-site learning.
arXiv Detail & Related papers (2020-02-09T14:11:50Z)
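Several of the related papers above fuse physiological time series with clinical text (for example, MedTsLLM and the temporal cross-attention framework). The sketch below shows a generic cross-attention block in which time-series tokens attend over note-token embeddings; it uses standard PyTorch modules and illustrative dimensions and is not taken from any of the listed papers.

```python
# Generic sketch (not any specific paper's code): cross-attention where tokenized
# time-series embeddings attend to clinical-note token embeddings. Dimensions and
# the single-block design are illustrative assumptions.
import torch
import torch.nn as nn


class TimeSeriesTextCrossAttention(nn.Module):
    """Time-series tokens (queries) attend over note tokens (keys/values)."""

    def __init__(self, d_model=128, n_heads=4):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, ts_tokens, text_tokens):
        # ts_tokens:   (batch, ts_len, d_model)   e.g. per-time-step ECG/EHR features
        # text_tokens: (batch, text_len, d_model) e.g. projected note-token embeddings
        attended, _ = self.cross_attn(ts_tokens, text_tokens, text_tokens)
        return self.norm(ts_tokens + attended)   # residual connection, as in standard Transformers


# Example: 8 sequences of 200 time-series tokens attending over 64 note tokens.
fusion = TimeSeriesTextCrossAttention()
out = fusion(torch.randn(8, 200, 128), torch.randn(8, 64, 128))
print(out.shape)  # torch.Size([8, 200, 128])
```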