TGV: Tabular Data-Guided Learning of Visual Cardiac Representations
- URL: http://arxiv.org/abs/2503.14998v1
- Date: Wed, 19 Mar 2025 08:49:55 GMT
- Title: TGV: Tabular Data-Guided Learning of Visual Cardiac Representations
- Authors: Marta Hasny, Maxime Di Folco, Keno Bressem, Julia Schnabel
- Abstract summary: In medical imaging, we often seek to compare entire patients with different phenotypes rather than just multiple augmentations of one scan. We propose harnessing clinically relevant tabular data to identify distinct patient phenotypes and form more meaningful pairs. We demonstrate its strength using short-axis cardiac MR images and clinical attributes from the UK Biobank.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Contrastive learning methods in computer vision typically rely on different views of the same image to form pairs. However, in medical imaging, we often seek to compare entire patients with different phenotypes rather than just multiple augmentations of one scan. We propose harnessing clinically relevant tabular data to identify distinct patient phenotypes and form more meaningful pairs in a contrastive learning framework. Our method uses tabular attributes to guide the training of visual representations, without requiring a joint embedding space. We demonstrate its strength using short-axis cardiac MR images and clinical attributes from the UK Biobank, where tabular data helps to more effectively distinguish between patient subgroups. Evaluation on downstream tasks, including fine-tuning and zero-shot prediction of cardiovascular artery diseases and cardiac phenotypes, shows that incorporating tabular data yields stronger visual representations than conventional methods that rely solely on image augmentations or combined image-tabular embeddings. Furthermore, we demonstrate that image encoders trained with tabular guidance are capable of embedding demographic information in their representations, allowing them to use insights from tabular data for unimodal predictions, making them well-suited to real-world medical settings where extensive clinical annotations may not be routinely available at inference time. The code will be available on GitHub.
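The abstract does not specify how tabular attributes are turned into pairs, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each patient's positives are its nearest neighbours in standardized tabular space and that the image embeddings are trained with an InfoNCE-style loss over those positives. The function names `tabular_positive_mask` and `guided_contrastive_loss`, the `top_k` and `temperature` parameters, and the toy attribute values are all hypothetical.

```python
import numpy as np


def tabular_positive_mask(tabular, top_k=1):
    """Mark, for each patient, the top_k closest other patients in tabular space.

    Assumption: closeness is Euclidean distance on standardized attributes.
    """
    # Standardize each attribute so that no single column dominates the distance.
    z = (tabular - tabular.mean(axis=0)) / (tabular.std(axis=0) + 1e-8)
    # Pairwise squared Euclidean distances between patients.
    d = ((z[:, None, :] - z[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d, np.inf)  # a patient is never its own positive
    nearest = np.argsort(d, axis=1)[:, :top_k]
    mask = np.zeros(d.shape, dtype=bool)
    np.put_along_axis(mask, nearest, True, axis=1)
    return mask


def guided_contrastive_loss(embeddings, positive_mask, temperature=0.1):
    """InfoNCE-style loss on image embeddings with tabular-selected positives."""
    # Cosine similarity between L2-normalized image embeddings.
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    logits = e @ e.T / temperature
    np.fill_diagonal(logits, -np.inf)  # exclude self-similarity from the softmax
    # Numerically stable log-softmax over all other patients in the batch.
    m = logits.max(axis=1, keepdims=True)
    log_prob = logits - m - np.log(np.exp(logits - m).sum(axis=1, keepdims=True))
    # Average the log-probability assigned to each patient's positives.
    pos = np.where(positive_mask, log_prob, 0.0).sum(axis=1)
    return float(-(pos / positive_mask.sum(axis=1)).mean())


# Toy batch: two hypothetical attributes per patient (e.g. age and BMI).
tabular = np.array([[50.0, 25.0], [51.0, 25.5], [70.0, 32.0], [71.0, 31.5]])
mask = tabular_positive_mask(tabular, top_k=1)

# Image embeddings that agree with the tabular grouping should score a lower loss.
aligned = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
misaligned = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]])
loss_aligned = guided_contrastive_loss(aligned, mask)
loss_misaligned = guided_contrastive_loss(misaligned, mask)
```

In this sketch the tabular data only selects which patients count as positives; the loss itself is computed on image embeddings alone, which is one way to read "without requiring a joint embedding space".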
Related papers
- Barttender: An approachable & interpretable way to compare medical imaging and non-imaging data [0.13406576408866772]
Barttender is an interpretable framework that uses deep learning for the comparison of the utility of imaging versus non-imaging data for tasks like disease prediction.
Our framework allows researchers to evaluate differences in utility through performance measures, as well as local (sample-level) and global (population-level) explanations.
arXiv Detail & Related papers (2024-11-19T18:22:25Z) - Predicting Stroke through Retinal Graphs and Multimodal Self-supervised Learning [0.46835339362676565]
Early identification of stroke is crucial for intervention, requiring reliable models.
We proposed an efficient retinal image representation together with clinical information to capture a comprehensive overview of cardiovascular health.
arXiv Detail & Related papers (2024-11-08T14:40:56Z) - Autoregressive Sequence Modeling for 3D Medical Image Representation [48.706230961589924]
We introduce a pioneering method for learning 3D medical image representations through an autoregressive sequence pre-training framework.
Our approach organizes various 3D medical images based on spatial, contrast, and semantic correlations, treating them as interconnected visual tokens within a token sequence.
arXiv Detail & Related papers (2024-09-13T10:19:10Z) - Contrastive Learning with Counterfactual Explanations for Radiology Report Generation [83.30609465252441]
We propose a CounterFactual Explanations-based framework (CoFE) for radiology report generation.
Counterfactual explanations serve as a potent tool for understanding how decisions made by algorithms can be changed by asking "what if" scenarios.
Experiments on two benchmarks demonstrate that leveraging the counterfactual explanations enables CoFE to generate semantically coherent and factually complete reports.
arXiv Detail & Related papers (2024-07-19T17:24:25Z) - MLIP: Enhancing Medical Visual Representation with Divergence Encoder and Knowledge-guided Contrastive Learning [48.97640824497327]
We propose a novel framework leveraging domain-specific medical knowledge as guiding signals to integrate language information into the visual domain through image-text contrastive learning.
Our model includes global contrastive learning with our designed divergence encoder, local token-knowledge-patch alignment contrastive learning, and knowledge-guided category-level contrastive learning with expert knowledge.
Notably, MLIP surpasses state-of-the-art methods even with limited annotated data, highlighting the potential of multimodal pre-training in advancing medical representation learning.
arXiv Detail & Related papers (2024-02-03T05:48:50Z) - Multimodal brain age estimation using interpretable adaptive population-graph learning [58.99653132076496]
We propose a framework that learns a population graph structure optimized for the downstream task.
An attention mechanism assigns weights to a set of imaging and non-imaging features.
By visualizing the attention weights that were the most important for the graph construction, we increase the interpretability of the graph.
arXiv Detail & Related papers (2023-07-10T15:35:31Z) - Best of Both Worlds: Multimodal Contrastive Learning with Tabular and Imaging Data [7.49320945341034]
We propose the first self-supervised contrastive learning framework to train unimodal encoders.
Our solution combines SimCLR and SCARF, two leading contrastive learning strategies.
We show the generalizability of our approach to natural images using the DVM car advertisement dataset.
arXiv Detail & Related papers (2023-03-24T15:44:42Z) - Lesion-based Contrastive Learning for Diabetic Retinopathy Grading from Fundus Images [2.498907460918493]
We propose a self-supervised framework, namely lesion-based contrastive learning for automated diabetic retinopathy grading.
Our proposed framework performs outstandingly on DR grading in terms of both linear evaluation and transfer capacity evaluation.
arXiv Detail & Related papers (2021-07-17T16:30:30Z) - Deep Co-Attention Network for Multi-View Subspace Learning [73.3450258002607]
We propose a deep co-attention network for multi-view subspace learning.
It aims to extract both the common information and the complementary information in an adversarial setting.
In particular, it uses a novel cross reconstruction loss and leverages the label information to guide the construction of the latent representation.
arXiv Detail & Related papers (2021-02-15T18:46:44Z) - Context Matters: Graph-based Self-supervised Representation Learning for Medical Images [21.23065972218941]
We introduce a novel approach with two levels of self-supervised representation learning objectives.
We use graph neural networks to incorporate the relationship between different anatomical regions.
Our model can identify clinically relevant regions in the images.
arXiv Detail & Related papers (2020-12-11T16:26:07Z) - Dynamic Graph Correlation Learning for Disease Diagnosis with Incomplete Labels [66.57101219176275]
Disease diagnosis on chest X-ray images is a challenging multi-label classification task.
We propose a Disease Diagnosis Graph Convolutional Network (DD-GCN) that presents a novel view of investigating the inter-dependency among different diseases.
Our method is the first to build a graph over the feature maps with a dynamic adjacency matrix for correlation learning.
arXiv Detail & Related papers (2020-02-26T17:10:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.