Related papers: Advancing Radiograph Representation Learning with Masked Record Modeling

URL: http://arxiv.org/abs/2301.13155v1
Date: Mon, 30 Jan 2023 18:33:32 GMT
Title: Advancing Radiograph Representation Learning with Masked Record Modeling
Authors: Hong-Yu Zhou, Chenyu Lian, Liansheng Wang, Yizhou Yu
Abstract summary: We formulate the self- and report-completion as two complementary objectives and present a unified framework based on masked record modeling (MRM). MRM reconstructs masked image patches and masked report tokens following a multi-task scheme to learn knowledge-enhanced semantic representations. Specifically, we find that MRM offers superior performance in label-efficient fine-tuning.
Score: 52.04899592688968
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Modern studies in radiograph representation learning rely on either self-supervision to encode invariant semantics or associated radiology reports to incorporate medical expertise, while the complementarity between them is barely noticed. To explore this, we formulate the self- and report-completion as two complementary objectives and present a unified framework based on masked record modeling (MRM). In practice, MRM reconstructs masked image patches and masked report tokens following a multi-task scheme to learn knowledge-enhanced semantic representations. With MRM pre-training, we obtain pre-trained models that can be well transferred to various radiography tasks. Specifically, we find that MRM offers superior performance in label-efficient fine-tuning. For instance, MRM achieves 88.5% mean AUC on CheXpert using 1% labeled data, outperforming previous R$^2$L methods with 100% labels. On NIH ChestX-ray, MRM outperforms the best performing counterpart by about 3% under small labeling ratios. Besides, MRM surpasses self- and report-supervised pre-training in identifying the pneumonia type and the pneumothorax area, sometimes by large margins.
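The multi-task objective described in the abstract (reconstructing masked image patches and masked report tokens together) can be illustrated with a minimal sketch. The abstract does not state masking ratios, loss weights, or the network architecture, so everything below is an assumption made for illustration: the helper names (random_mask, masked_mse, masked_cross_entropy, multi_task_loss), the example ratios, and the unit loss weights are hypothetical and are not the authors' implementation.

```python
import math
import random


def random_mask(n_items, mask_ratio, rng):
    """Pick which items (image patches or report tokens) to hide.

    Hypothetical helper: returns the sorted indices of the masked items.
    """
    n_masked = int(round(n_items * mask_ratio))
    return sorted(rng.sample(range(n_items), n_masked))


def masked_mse(pred, target, masked_idx):
    """Mean squared error computed over the masked patches only."""
    if not masked_idx:
        return 0.0
    total, count = 0.0, 0
    for i in masked_idx:
        for p, t in zip(pred[i], target[i]):
            total += (p - t) ** 2
            count += 1
    return total / count


def masked_cross_entropy(probs, target_ids, masked_idx):
    """Mean negative log-likelihood over the masked token positions only.

    probs[i] is a predicted distribution over the vocabulary at position i.
    """
    if not masked_idx:
        return 0.0
    nll = -sum(math.log(probs[i][target_ids[i]]) for i in masked_idx)
    return nll / len(masked_idx)


def multi_task_loss(image_loss, report_loss, image_weight=1.0, report_weight=1.0):
    """Weighted sum of the two reconstruction losses (weights are assumed)."""
    return image_weight * image_loss + report_weight * report_loss


if __name__ == "__main__":
    rng = random.Random(0)
    # Assumed toy sizes and ratios, chosen only for the demonstration.
    patch_idx = random_mask(n_items=16, mask_ratio=0.75, rng=rng)
    token_idx = random_mask(n_items=8, mask_ratio=0.5, rng=rng)

    target_patches = [[0.0, 0.0] for _ in range(16)]
    pred_patches = [[0.1, -0.1] for _ in range(16)]
    target_tokens = [0] * 8
    pred_probs = [[0.5, 0.5] for _ in range(8)]

    loss = multi_task_loss(
        masked_mse(pred_patches, target_patches, patch_idx),
        masked_cross_entropy(pred_probs, target_tokens, token_idx),
    )
    print(f"combined loss: {loss:.4f}")
```

Both losses are evaluated only at masked positions, which is the usual convention in masked modeling; whether MRM follows it exactly is not stated in the abstract.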

Related papers

ChestX-Reasoner: Advancing Radiology Foundation Models with Reasoning through Step-by-Step Verification [57.22053411719822]
ChestX-Reasoner is a radiology diagnosis MLLM designed to leverage process supervision mined directly from clinical reports. Our two-stage training framework combines supervised fine-tuning and reinforcement learning guided by process rewards to better align model reasoning with clinical standards.
arXiv Detail & Related papers (2025-04-29T16:48:23Z)
Comparison of Metadata Representation Models for Knowledge Graph Embeddings [1.8749305679160366]
Hyper-relational Knowledge Graphs (HRKGs) extend traditional KGs beyond binary relations. This study evaluates the effects of different Metadata Representation Models (MRMs) on KG Embedding (KGE) and Link Prediction (LP) models. We propose a framework that effectively reflects the knowledge representations of the three MRMs in latent space.
arXiv Detail & Related papers (2025-03-25T04:46:23Z)
Enhanced Contrastive Learning with Multi-view Longitudinal Data for Chest X-ray Report Generation [15.257119888131609]
We propose enhanced contrastive learning with Multi-view Longitudinal data to facilitate chest X-ray Report Generation, named MLRG. Specifically, we introduce a multi-view longitudinal contrast learning method that integrates spatial information from current multi-view images and temporal information from longitudinal data. We present a tokenized absence encoding technique to handle missing patient-specific prior knowledge, allowing the model to produce more accurate radiology reports based on available prior knowledge.
arXiv Detail & Related papers (2025-02-27T12:59:04Z)
ContextMRI: Enhancing Compressed Sensing MRI through Metadata Conditioning [51.26601171361753]
We propose ContextMRI, a text-conditioned diffusion model for MRI that integrates granular metadata into the reconstruction process. We show that increasing the fidelity of metadata, ranging from slice location and contrast to patient age, sex, and pathology, systematically boosts reconstruction performance.
arXiv Detail & Related papers (2025-01-08T05:15:43Z)
MRGen: Segmentation Data Engine For Underrepresented MRI Modalities [59.61465292965639]
Training medical image segmentation models for rare yet clinically significant imaging modalities is challenging due to the scarcity of annotated data. This paper investigates leveraging generative models to synthesize training data, to train segmentation models for underrepresented modalities.
arXiv Detail & Related papers (2024-12-04T16:34:22Z)
MCL: Multi-view Enhanced Contrastive Learning for Chest X-ray Report Generation [15.615477864185522]
We propose a Multi-view enhanced Contrastive Learning method for chest X-ray report generation. Specifically, we first introduce multi-view enhanced contrastive learning for visual representation by maximizing agreement between multi-view radiographs and the corresponding report. We construct Multi-view CXR and Two-view CXR datasets from public sources to support research on multi-view report generation.
arXiv Detail & Related papers (2024-11-15T14:38:13Z)
Multi-Tiered Self-Contrastive Learning for Medical Microwave Radiometry (MWR) Breast Cancer Detection [0.25569800973362833]
This study introduces a novel multi-tiered self-contrastive model tailored for the application of microwave radiometry (MWR) breast cancer detection. Our approach encompasses three distinct models: Local-MWR (L-MWR), Regional-MWR (R-MWR), and Global-MWR (G-MWR). These models are cohesively integrated through the Joint-MWR (J-MWR) network, which leverages the self-contrastive data generated at each analytical level to enhance detection capabilities.
arXiv Detail & Related papers (2024-10-06T21:51:02Z)
Brain Tumor Classification on MRI in Light of Molecular Markers [61.77272414423481]
Co-deletion of the 1p/19q gene is associated with clinical outcomes in low-grade gliomas. This study aims to utilize a specially designed MRI-based convolutional neural network for brain cancer detection.
arXiv Detail & Related papers (2024-09-29T07:04:26Z)
RRM: Robust Reward Model Training Mitigates Reward Hacking [51.12341734942797]
Reward models (RMs) play a pivotal role in aligning large language models with human preferences. We introduce a causal framework that learns preferences independent of these artifacts. Experiments show that our approach successfully filters out undesirable artifacts, yielding a more robust reward model.
arXiv Detail & Related papers (2024-09-20T01:46:07Z)
MedPromptX: Grounded Multimodal Prompting for Chest X-ray Diagnosis [1.2903829793534272]
Chest X-ray images are commonly used for predicting acute and chronic cardiopulmonary conditions. Efforts to integrate them with structured clinical data face challenges due to incomplete electronic health records. This paper introduces MedPromptX, the first model to integrate multimodal large language models (MLLMs), few-shot prompting (FP) and visual grounding (VG). Results demonstrate the SOTA performance of MedPromptX, achieving an 11% improvement in F1-score compared to the baselines.
arXiv Detail & Related papers (2024-03-22T19:19:51Z)
Self-Supervised Neuron Segmentation with Multi-Agent Reinforcement Learning [53.00683059396803]
Masked image modeling (MIM) has been widely used due to its simplicity and effectiveness in recovering original information from masked images. We propose a decision-based MIM that utilizes reinforcement learning (RL) to automatically search for the optimal image masking ratio and masking strategy. Our approach has a significant advantage over alternative self-supervised methods on the task of neuron segmentation.
arXiv Detail & Related papers (2023-10-06T10:40:46Z)
Learning to Generalize towards Unseen Domains via a Content-Aware Style Invariant Model for Disease Detection from Chest X-rays [2.2835858158799405]
Performance degradation due to distribution discrepancy is a longstanding challenge in intelligent imaging. Recent studies have demonstrated that CNNs are biased toward styles rather than content. We employ the novel on-the-fly style randomization modules at both image (SRM-IL) and feature (SRM-FL) levels to create rich style perturbed features.
arXiv Detail & Related papers (2023-02-27T17:30:00Z)
From Cloze to Comprehension: Retrofitting Pre-trained Masked Language Model to Pre-trained Machine Reader [130.45769668885487]
Pre-trained Machine Reader (PMR) is a novel method for retrofitting masked language models (MLMs) to pre-trained machine reading comprehension (MRC) models without acquiring labeled data. To build the proposed PMR, we constructed a large volume of general-purpose and high-quality MRC-style training data. PMR has the potential to serve as a unified model for tackling various extraction and classification tasks in the MRC formulation.
arXiv Detail & Related papers (2022-12-09T10:21:56Z)
RGMIM: Region-Guided Masked Image Modeling for Learning Meaningful Representations from X-Ray Images [49.24576562557866]
We propose a novel method called region-guided masked image modeling (RGMIM) for learning meaningful representations from X-ray images. RGMIM significantly improved performance in small data volumes, such as 5% and 10% of the training set compared to other methods.
arXiv Detail & Related papers (2022-11-01T07:41:03Z)

This list is automatically generated from the titles and abstracts of the papers on this site.