A Multimodal Cross-View Model for Predicting Postoperative Neck Pain in Cervical Spondylosis Patients
- URL: http://arxiv.org/abs/2509.02256v1
- Date: Tue, 02 Sep 2025 12:33:43 GMT
- Title: A Multimodal Cross-View Model for Predicting Postoperative Neck Pain in Cervical Spondylosis Patients
- Authors: Jingyang Shan, Qishuai Yu, Jiacen Liu, Shaolin Zhang, Wen Shen, Yanxiao Zhao, Tianyi Wang, Xiaolin Qin, Yiheng Yin
- Abstract summary: Neck pain is the primary symptom of cervical spondylosis, yet its underlying mechanisms remain unclear. This paper proposes an Adaptive Bidirectional Pyramid Difference Convolution module that facilitates multimodal integration. Experiments on the MMCSD dataset demonstrate that the proposed model achieves superior prediction accuracy of postoperative neck pain recovery.
- Score: 9.915439327075141
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neck pain is the primary symptom of cervical spondylosis, yet its underlying mechanisms remain unclear, leading to uncertain treatment outcomes. To address the challenges of multimodal feature fusion caused by imaging differences and spatial mismatches, this paper proposes an Adaptive Bidirectional Pyramid Difference Convolution (ABPDC) module that facilitates multimodal integration by exploiting the advantages of difference convolution in texture extraction and grayscale invariance, and a Feature Pyramid Registration Auxiliary Network (FPRAN) to mitigate structural misalignment. Experiments on the MMCSD dataset demonstrate that the proposed model achieves superior prediction accuracy of postoperative neck pain recovery compared with existing methods, and ablation studies further confirm its effectiveness.
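The abstract credits difference convolution with grayscale invariance. The sketch below is not the ABPDC module, whose internals the abstract does not describe; it is a minimal plain-Python central difference convolution, assuming that "grayscale invariance" can be read as insensitivity to a uniform intensity offset. The function name `central_difference_conv2d`, the 6x6 random image, and the 3x3 random weights are all illustrative assumptions.

```python
import random


def central_difference_conv2d(image, weights):
    """Valid-mode central difference convolution on a 2D list of floats.

    Each output is sum_n w[n] * (x[p + n] - x[p]), where x[p] is the pixel
    under the kernel centre. Illustrative only; not the paper's ABPDC.
    """
    kh, kw = len(weights), len(weights[0])
    ch, cw = kh // 2, kw // 2          # offset of the kernel centre
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            center = image[i + ch][j + cw]
            acc = 0.0
            for u in range(kh):
                for v in range(kw):
                    # Aggregate differences to the centre, not raw intensities.
                    acc += weights[u][v] * (image[i + u][j + v] - center)
            row.append(acc)
        out.append(row)
    return out


random.seed(0)
image = [[random.random() for _ in range(6)] for _ in range(6)]
weights = [[random.gauss(0.0, 1.0) for _ in range(3)] for _ in range(3)]

base = central_difference_conv2d(image, weights)
# Add a uniform brightness offset to every pixel and convolve again.
brighter = [[pixel + 0.5 for pixel in row] for row in image]
shifted = central_difference_conv2d(brighter, weights)

# Largest absolute gap between the two responses (floating-point noise only).
max_gap = max(
    abs(a - b) for row_a, row_b in zip(base, shifted) for a, b in zip(row_a, row_b)
)
```

Because every term is a difference against the centre pixel, the constant offset cancels, so `max_gap` stays at floating-point noise level. A standard convolution with the same weights would instead shift by the offset times the sum of the weights.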
Related papers
- Dual-Encoder Transformer-Based Multimodal Learning for Ischemic Stroke Lesion Segmentation Using Diffusion MRI [5.332404648315838]
We study ischemic stroke lesion segmentation using multimodal diffusion MRI from the ISLES 2022 dataset.
Several state-of-the-art convolutional and transformer-based architectures, including U-Net variants, Swin-UNet, and TransUNet, are benchmarked.
Results show that transformer-based models outperform convolutional baselines, and the proposed dual-encoder TransUNet achieves the best performance, reaching a Dice score of 85.4% on the test set.
arXiv Detail & Related papers (2025-12-23T15:24:31Z) - Collaborative Attention and Consistent-Guided Fusion of MRI and PET for Alzheimer's Disease Diagnosis [12.33741976057116]
Alzheimer's disease (AD) is the most prevalent form of dementia, and its early diagnosis is essential for slowing disease progression.
Recent studies on multimodal neuroimaging fusion using MRI and PET have achieved promising results.
We propose a Collaborative Attention and Consistent-Guided Fusion framework for MRI and PET based AD diagnosis.
arXiv Detail & Related papers (2025-11-04T03:42:07Z) - impuTMAE: Multi-modal Transformer with Masked Pre-training for Missing Modalities Imputation in Cancer Survival Prediction [75.43342771863837]
We introduce impuTMAE, a novel transformer-based end-to-end approach with an efficient multimodal pre-training strategy.
It learns inter- and intra-modal interactions while simultaneously imputing missing modalities by reconstructing masked patches.
Our model is pre-trained on heterogeneous, incomplete data and fine-tuned for glioma survival prediction using TCGA-GBM/LGG and BraTS datasets.
arXiv Detail & Related papers (2025-08-08T10:01:16Z) - UniSegDiff: Boosting Unified Lesion Segmentation via a Staged Diffusion Model [53.34835793648352]
We propose UniSegDiff, a novel diffusion model framework for lesion segmentation.
UniSegDiff addresses lesion segmentation in a unified manner across multiple modalities and organs.
Comprehensive experimental results demonstrate that UniSegDiff significantly outperforms previous state-of-the-art (SOTA) approaches.
arXiv Detail & Related papers (2025-07-24T12:33:10Z) - Towards Scalable and Robust White Matter Lesion Localization via Multimodal Deep Learning [2.0749231618270803]
White matter hyperintensities (WMH) are radiological markers of small vessel disease and neurodegeneration, whose accurate segmentation and localization are crucial for diagnosis and monitoring.
We propose a deep learning framework for WM lesion segmentation and localization that operates directly in native space using single- and multi-modal MRI inputs.
Our findings highlight the utility of multimodal fusion for accurate and robust WMH analysis, and the potential of joint modeling for integrated predictions.
arXiv Detail & Related papers (2025-06-27T09:39:26Z) - GEMeX-RMCoT: An Enhanced Med-VQA Dataset for Region-Aware Multimodal Chain-of-Thought Reasoning [60.03671205298294]
Medical visual question answering aims to support clinical decision-making by enabling models to answer natural language questions based on medical images.
Current methods still suffer from limited answer reliability and poor interpretability.
This work first proposes a Region-Aware Multimodal Chain-of-Thought dataset, in which the process of producing an answer is preceded by a sequence of intermediate reasoning steps.
arXiv Detail & Related papers (2025-06-22T08:09:58Z) - Continually Evolved Multimodal Foundation Models for Cancer Prognosis [50.43145292874533]
Cancer prognosis is a critical task that involves predicting patient outcomes and survival rates.
Previous studies have integrated diverse data modalities, such as clinical notes, medical images, and genomic data, leveraging their complementary information.
Existing approaches face two major limitations. First, they struggle to incorporate newly arrived data with varying distributions into training, such as patient records from different hospitals.
Second, most multimodal integration methods rely on simplistic concatenation or task-specific pipelines, which fail to capture the complex interdependencies across modalities.
arXiv Detail & Related papers (2025-01-30T06:49:57Z) - ETSCL: An Evidence Theory-Based Supervised Contrastive Learning Framework for Multi-modal Glaucoma Grading [7.188153974946432]
Glaucoma is one of the leading causes of vision impairment.
It remains challenging to extract reliable features due to the high similarity of medical images and the unbalanced multi-modal data distribution.
We propose a novel framework, namely ETSCL, which consists of a contrastive feature extraction stage and a decision-level fusion stage.
arXiv Detail & Related papers (2024-07-19T11:57:56Z) - Guided Reconstruction with Conditioned Diffusion Models for Unsupervised Anomaly Detection in Brain MRIs [35.46541584018842]
Unsupervised Anomaly Detection (UAD) aims to identify any anomaly as an outlier from a healthy training distribution.
Generative models are used to learn the reconstruction of healthy brain anatomy for a given input image.
We propose conditioning the denoising process of diffusion models with additional information derived from a latent representation of the input image.
arXiv Detail & Related papers (2023-12-07T11:03:42Z) - ArSDM: Colonoscopy Images Synthesis with Adaptive Refinement Semantic Diffusion Models [69.9178140563928]
Colonoscopy analysis is essential for assisting clinical diagnosis and treatment.
The scarcity of annotated data limits the effectiveness and generalization of existing methods.
We propose an Adaptive Refinement Semantic Diffusion Model (ArSDM) to generate colonoscopy images that benefit the downstream tasks.
arXiv Detail & Related papers (2023-09-03T07:55:46Z) - Diffusion Models for Counterfactual Generation and Anomaly Detection in Brain Images [39.94162291765236]
We present a weakly supervised method to generate a healthy version of a diseased image and then use it to obtain a pixel-wise anomaly map.
We employ a diffusion model trained on healthy samples and combine the Denoising Diffusion Probabilistic Model (DDPM) and the Denoising Diffusion Implicit Model (DDIM) at each step of the sampling process.
arXiv Detail & Related papers (2023-08-03T21:56:50Z) - Automatic diagnosis of knee osteoarthritis severity using Swin transformer [55.01037422579516]
Knee osteoarthritis (KOA) is a widespread condition that can cause chronic pain and stiffness in the knee joint.
We propose an automated approach that employs the Swin Transformer to predict the severity of KOA.
arXiv Detail & Related papers (2023-07-10T09:49:30Z) - TranSOP: Transformer-based Multimodal Classification for Stroke Treatment Outcome Prediction [2.358784542343728]
We propose a transformer-based multimodal network (TranSOP) for a classification approach that employs clinical metadata and imaging information.
This includes a fusion module to efficiently combine 3D non-contrast computed tomography (NCCT) features and clinical information.
In comparative experiments using unimodal and multimodal data, we achieve a state-of-the-art AUC score of 0.85.
arXiv Detail & Related papers (2023-01-25T21:05:10Z) - Interpretable CNN-Multilevel Attention Transformer for Rapid Recognition of Pneumonia from Chest X-Ray Images [2.1408385210297656]
This paper develops a pneumonia recognition framework with interpretability to provide high-speed analytics support for medical practice.
To reduce computational complexity and accelerate the recognition process, a novel multi-level self-attention mechanism within the Transformer is proposed.
The effectiveness of the proposed method has been demonstrated on the classic COVID-19 recognition task using a widely used pneumonia CXR image dataset.
arXiv Detail & Related papers (2022-10-29T12:12:03Z)