MeniMV: A Multi-view Benchmark for Meniscus Injury Severity Grading
- URL: http://arxiv.org/abs/2512.18437v1
- Date: Sat, 20 Dec 2025 17:22:55 GMT
- Title: MeniMV: A Multi-view Benchmark for Meniscus Injury Severity Grading
- Authors: Shurui Xu, Siqi Yang, Jiapin Ren, Zhong Cao, Hongwei Yang, Mengzhen Fan, Yuyu Sun, Shuyan Li,
- Abstract summary: MeniMV comprises 3,000 annotated knee MRI exams from 750 patients across three medical centers. Each exam is meticulously annotated with four-tier (grade 0-3) severity labels for both anterior and posterior meniscal horns.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Precise grading of meniscal horn tears is critical in knee injury diagnosis but remains underexplored in automated MRI analysis. Existing methods often rely on coarse study-level labels or binary classification, lacking localization and severity information. In this paper, we introduce MeniMV, a multi-view benchmark dataset specifically designed for horn-specific meniscus injury grading. MeniMV comprises 3,000 annotated knee MRI exams from 750 patients across three medical centers, providing 6,000 co-registered sagittal and coronal images. Each exam is meticulously annotated with four-tier (grade 0-3) severity labels for both anterior and posterior meniscal horns, verified by chief orthopedic physicians. Notably, MeniMV offers more than double the pathology-labeled data volume of prior datasets while uniquely capturing the dual-view diagnostic context essential in clinical practice. To demonstrate the utility of MeniMV, we benchmark multiple state-of-the-art CNN and Transformer-based models. Our extensive experiments establish strong baselines and highlight challenges in severity grading, providing a valuable foundation for future research in automated musculoskeletal imaging.
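The annotation scheme described in the abstract (a four-tier grade for each meniscal horn, with co-registered sagittal and coronal views per exam) can be sketched as a minimal data model. This is an illustration only: the abstract specifies grades 0-3 but not their clinical definitions, so the grade names below follow the conventional Stoller MRI meniscal grading as an assumption, and all class and field names are hypothetical rather than part of the MeniMV release.

```python
from dataclasses import dataclass
from enum import IntEnum

class TearGrade(IntEnum):
    """Four-tier severity; names assume the standard Stoller MRI grading."""
    NORMAL = 0          # no abnormal intrameniscal signal
    INTRASUBSTANCE = 1  # globular signal not reaching the surface
    LINEAR = 2          # linear signal not reaching the articular surface
    SURFACING = 3       # signal extending to the articular surface (true tear)

@dataclass
class MeniscusExam:
    """One exam: paired sagittal/coronal views plus a grade per horn."""
    patient_id: str
    sagittal_path: str
    coronal_path: str
    anterior_horn: TearGrade
    posterior_horn: TearGrade

    def is_injured(self) -> bool:
        # Collapses the 4-tier labels to the binary task that prior
        # meniscus datasets typically provide.
        return max(self.anterior_horn, self.posterior_horn) > TearGrade.NORMAL

exam = MeniscusExam("p001", "sag_001.png", "cor_001.png",
                    TearGrade.NORMAL, TearGrade.LINEAR)
print(exam.is_injured())  # True: the posterior horn is graded above 0
```

A structure like this makes the paper's framing concrete: horn-specific grading is a pair of 4-way classifications per exam, strictly richer than the study-level binary labels it compares against.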
Related papers
- Multi-View Stenosis Classification Leveraging Transformer-Based Multiple-Instance Learning Using Real-World Clinical Data
Coronary artery stenosis is a leading cause of cardiovascular disease, diagnosed by analyzing the coronary arteries from multiple angiography views. We propose SegmentMIL, a transformer-based multi-view multiple-instance learning framework for patient-level stenosis classification.
arXiv Detail & Related papers (2026-02-02T13:07:52Z)
- MedForget: Hierarchy-Aware Multimodal Unlearning Testbed for Medical AI
MedForget is a hierarchy-aware multimodal unlearning testbed for building compliant medical AI systems. We show that existing methods struggle to achieve complete, hierarchy-aware forgetting without reducing diagnostic performance. We introduce a reconstruction attack that progressively adds hierarchical level context to prompts.
arXiv Detail & Related papers (2025-12-10T17:55:06Z)
- TemMed-Bench: Evaluating Temporal Medical Image Reasoning in Vision-Language Models
Existing medical reasoning benchmarks primarily focus on analyzing a patient's condition based on an image from a single visit. We introduce TemMed-Bench, the first benchmark designed for analyzing changes in patients' conditions between different clinical visits.
arXiv Detail & Related papers (2025-09-29T17:51:26Z)
- MRN: Harnessing 2D Vision Foundation Models for Diagnosing Parkinson's Disease with Limited 3D MR Data
Current clinical practice often relies on diagnostic biomarkers in QSM and NM-MRI images. We address these challenges by leveraging 2D vision foundation models (VFMs). Our approach achieved first place in the MICCAI 2025 PDCADxFoundation challenge, with an accuracy of 86.4% trained on a dataset of only 300 labeled QSM and NM-MRI scans.
arXiv Detail & Related papers (2025-09-22T10:59:27Z)
- Med-RewardBench: Benchmarking Reward Models and Judges for Medical Multimodal Large Language Models
We introduce Med-RewardBench, the first benchmark specifically designed to evaluate medical reward models and judges. Med-RewardBench features a multimodal dataset spanning 13 organ systems and 8 clinical departments, with 1,026 expert-annotated cases. A rigorous three-step process ensures high-quality evaluation data across six clinically critical dimensions.
arXiv Detail & Related papers (2025-08-29T08:58:39Z)
- Benchmarking and Explaining Deep Learning Cortical Lesion MRI Segmentation in Multiple Sclerosis
Cortical lesions (CLs) have emerged as valuable biomarkers in multiple sclerosis (MS). We propose a comprehensive benchmark of CL detection and segmentation in MRI. We rely on the self-configuring nnU-Net framework, designed for medical imaging segmentation, and propose adaptations tailored to improved CL detection.
arXiv Detail & Related papers (2025-07-16T09:56:11Z)
- MedErr-CT: A Visual Question Answering Benchmark for Identifying and Correcting Errors in CT Reports
We introduce MedErr-CT, a novel benchmark for evaluating medical MLLMs' ability to identify and correct errors in CT reports. The benchmark includes six error categories: four vision-centric errors (Omission, Insertion, Direction, Size) and two lexical error types (Unit, Typo).
arXiv Detail & Related papers (2025-06-24T00:51:03Z)
- EyecareGPT: Boosting Comprehensive Ophthalmology Understanding with Tailored Dataset, Benchmark and Model
Medical Large Vision-Language Models (Med-LVLMs) demonstrate significant potential in healthcare. Currently, intelligent ophthalmic diagnosis faces three major challenges: (i) Data; (ii) Benchmark; and (iii) Model. We propose the Eyecare Kit, which tackles these three key challenges with a tailored dataset, benchmark and model.
arXiv Detail & Related papers (2025-04-18T12:09:15Z)
- Clinical Utility of Foundation Segmentation Models in Musculoskeletal MRI: Biomarker Fidelity and Predictive Outcomes
We evaluate three widely used segmentation models (SAM, SAM2, MedSAM) across eleven musculoskeletal (MSK) MRI datasets. Our framework assesses both zero-shot and finetuned performance, with attention to segmentation accuracy, generalizability across imaging protocols, and reliability of derived quantitative biomarkers.
arXiv Detail & Related papers (2025-01-23T04:41:20Z)
- Arges: Spatio-Temporal Transformer for Ulcerative Colitis Severity Assessment in Endoscopy Videos
Expert MES/UCEIS annotation is time-consuming and susceptible to inter-rater variability. CNN-based weakly-supervised models with end-to-end (e2e) training lack generalization to new disease scores. "Arges" is a deep learning framework that incorporates positional encoding to estimate disease severity scores in endoscopy.
arXiv Detail & Related papers (2024-10-01T09:23:14Z)
- G-MIND: An End-to-End Multimodal Imaging-Genetics Framework for Biomarker Identification and Disease Classification
We propose a novel deep neural network architecture to integrate imaging and genetics data, as guided by diagnosis, that provides interpretable biomarkers.
We have evaluated our model on a population study of schizophrenia that includes two functional MRI (fMRI) paradigms and Single Nucleotide Polymorphism (SNP) data.
arXiv Detail & Related papers (2021-01-27T19:28:04Z)
- A Benchmark for Studying Diabetic Retinopathy: Segmentation, Grading, and Transferability
People with diabetes are at risk of developing diabetic retinopathy (DR).
Computer-aided DR diagnosis is a promising tool for early detection of DR and severity grading.
This dataset has 1,842 images with pixel-level DR-related lesion annotations, and 1,000 images with image-level labels graded by six board-certified ophthalmologists.
arXiv Detail & Related papers (2020-08-22T07:48:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.