Efficient Multi-View Fusion and Flexible Adaptation to View Missing in Cardiovascular System Signals
- URL: http://arxiv.org/abs/2406.08930v1
- Date: Thu, 13 Jun 2024 08:58:59 GMT
- Title: Efficient Multi-View Fusion and Flexible Adaptation to View Missing in Cardiovascular System Signals
- Authors: Qihan Hu, Daomiao Wang, Hong Wu, Jian Liu, Cuiwei Yang,
- Abstract summary: Deep learning has facilitated automatic multi-view fusion (MVF) about the cardiovascular system (CVS) signals.
MVF model architecture often amalgamates CVS signals from the same temporal step but different views into a unified representation.
We introduce prompt techniques to aid pretrained MVF models in flexibly adapting to various missing-view scenarios.
- Score: 4.519437028632205
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The progression of deep learning and the widespread adoption of sensors have facilitated automatic multi-view fusion (MVF) about the cardiovascular system (CVS) signals. However, prevalent MVF model architecture often amalgamates CVS signals from the same temporal step but different views into a unified representation, disregarding the asynchronous nature of cardiovascular events and the inherent heterogeneity across views, leading to catastrophic view confusion. Efficient training strategies specifically tailored for MVF models to attain comprehensive representations need simultaneous consideration. Crucially, real-world data frequently arrives with incomplete views, an aspect rarely noticed by researchers. Thus, the View-Centric Transformer (VCT) and Multitask Masked Autoencoder (M2AE) are specifically designed to emphasize the centrality of each view and harness unlabeled data to achieve superior fused representations. Additionally, we systematically define the missing-view problem for the first time and introduce prompt techniques to aid pretrained MVF models in flexibly adapting to various missing-view scenarios. Rigorous experiments involving atrial fibrillation detection, blood pressure estimation, and sleep staging-typical health monitoring tasks-demonstrate the remarkable advantage of our method in MVF compared to prevailing methodologies. Notably, the prompt technique requires finetuning less than 3% of the entire model's data, substantially fortifying the model's resilience to view missing while circumventing the need for complete retraining. The results demonstrate the effectiveness of our approaches, highlighting their potential for practical applications in cardiovascular health monitoring. Codes and models are released at URL.
Related papers
- Generalized Robust Fundus Photography-based Vision Loss Estimation for High Myopia [6.193135671460362]
We propose a novel, parameter-efficient framework to enhance the generalized robustness of VF estimation.
Our method significantly outperforms existing approaches in RMSE, MAE and coefficient correlation for both internal and external validation.
arXiv Detail & Related papers (2024-07-04T07:39:19Z) - MMA-DFER: MultiModal Adaptation of unimodal models for Dynamic Facial Expression Recognition in-the-wild [81.32127423981426]
Multimodal emotion recognition based on audio and video data is important for real-world applications.
Recent methods have focused on exploiting advances of self-supervised learning (SSL) for pre-training of strong multimodal encoders.
We propose a different perspective on the problem and investigate the advancement of multimodal DFER performance by adapting SSL-pre-trained disjoint unimodal encoders.
arXiv Detail & Related papers (2024-04-13T13:39:26Z) - Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images [68.42215385041114]
This paper introduces a novel lightweight multi-level adaptation and comparison framework to repurpose the CLIP model for medical anomaly detection.
Our approach integrates multiple residual adapters into the pre-trained visual encoder, enabling a stepwise enhancement of visual features across different levels.
Our experiments on medical anomaly detection benchmarks demonstrate that our method significantly surpasses current state-of-the-art models.
arXiv Detail & Related papers (2024-03-19T09:28:19Z) - MV-Swin-T: Mammogram Classification with Multi-view Swin Transformer [0.257133335028485]
We propose an innovative multi-view network based on transformers to address challenges in mammographic image classification.
Our approach introduces a novel shifted window-based dynamic attention block, facilitating the effective integration of multi-view information.
arXiv Detail & Related papers (2024-02-26T04:41:04Z) - GEMTrans: A General, Echocardiography-based, Multi-Level Transformer
Framework for Cardiovascular Diagnosis [14.737295160286939]
Vision-based machine learning (ML) methods have gained popularity to act as secondary layers of verification.
We propose a General, Echo-based, Multi-Level Transformer (GEMTrans) framework that provides explainability.
We show the flexibility of our framework by considering two critical tasks including ejection fraction (EF) and aortic stenosis (AS) severity detection.
arXiv Detail & Related papers (2023-08-25T07:30:18Z) - Towards General Visual-Linguistic Face Forgery Detection [95.73987327101143]
Deepfakes are realistic face manipulations that can pose serious threats to security, privacy, and trust.
Existing methods mostly treat this task as binary classification, which uses digital labels or mask signals to train the detection model.
We propose a novel paradigm named Visual-Linguistic Face Forgery Detection(VLFFD), which uses fine-grained sentence-level prompts as the annotation.
arXiv Detail & Related papers (2023-07-31T10:22:33Z) - Progressive Multi-view Human Mesh Recovery with Self-Supervision [68.60019434498703]
Existing solutions typically suffer from poor generalization performance to new settings.
We propose a novel simulation-based training pipeline for multi-view human mesh recovery.
arXiv Detail & Related papers (2022-12-10T06:28:29Z) - Fast Unsupervised Brain Anomaly Detection and Segmentation with
Diffusion Models [1.6352599467675781]
We propose a method based on diffusion models to detect and segment anomalies in brain imaging.
Our diffusion models achieve competitive performance compared with autoregressive approaches across a series of experiments with 2D CT and MRI data.
arXiv Detail & Related papers (2022-06-07T17:30:43Z) - About Explicit Variance Minimization: Training Neural Networks for
Medical Imaging With Limited Data Annotations [2.3204178451683264]
Variance Aware Training (VAT) method exploits this property by introducing the variance error into the model loss function.
We validate VAT on three medical imaging datasets from diverse domains and various learning objectives.
arXiv Detail & Related papers (2021-05-28T21:34:04Z) - Manifolds for Unsupervised Visual Anomaly Detection [79.22051549519989]
Unsupervised learning methods that don't necessarily encounter anomalies in training would be immensely useful.
We develop a novel hyperspherical Variational Auto-Encoder (VAE) via stereographic projections with a gyroplane layer.
We present state-of-the-art results on visual anomaly benchmarks in precision manufacturing and inspection, demonstrating real-world utility in industrial AI scenarios.
arXiv Detail & Related papers (2020-06-19T20:41:58Z) - Generative Partial Multi-View Clustering [133.36721417531734]
We propose a generative partial multi-view clustering model, named as GP-MVC, to address the incomplete multi-view problem.
First, multi-view encoder networks are trained to learn common low-dimensional representations, followed by a clustering layer to capture the consistent cluster structure across multiple views.
Second, view-specific generative adversarial networks are developed to generate the missing data of one view conditioning on the shared representation given by other views.
arXiv Detail & Related papers (2020-03-29T17:48:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.