UAE: Universal Anatomical Embedding on Multi-modality Medical Images
- URL: http://arxiv.org/abs/2311.15111v3
- Date: Thu, 18 Jan 2024 09:02:36 GMT
- Title: UAE: Universal Anatomical Embedding on Multi-modality Medical Images
- Authors: Xiaoyu Bai, Fan Bai, Xiaofei Huo, Jia Ge, Jingjing Lu, Xianghua Ye, Ke
Yan, and Yong Xia
- Abstract summary: We propose universal anatomical embedding (UAE) to learn appearance, semantic, and cross-modality anatomical embeddings.
UAE incorporates three key innovations: (1) semantic embedding learning with prototypical contrastive loss; (2) a fixed-point-based matching strategy; and (3) an iterative approach for cross-modality embedding learning.
Our results suggest that UAE outperforms state-of-the-art methods, offering a robust and versatile approach for landmark-based medical image analysis tasks.
- Score: 7.589247017940839
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Identifying specific anatomical structures (\textit{e.g.}, lesions or
landmarks) in medical images plays a fundamental role in medical image
analysis. Exemplar-based landmark detection methods are receiving increasing
attention since they can detect arbitrary anatomical points at inference time
without requiring landmark annotations during training. They use
self-supervised learning
to acquire a discriminative embedding for each voxel within the image. These
approaches can identify corresponding landmarks through nearest neighbor
matching and have demonstrated promising results across various tasks. However,
current methods still face challenges in: (1) differentiating voxels with
similar appearance but different semantic meanings (\textit{e.g.}, two adjacent
structures without clear borders); (2) matching voxels with similar semantics
but markedly different appearance (\textit{e.g.}, the same vessel before and
after contrast injection); and (3) cross-modality matching (\textit{e.g.},
CT-MRI landmark-based registration). To overcome these challenges, we propose
universal anatomical embedding (UAE), which is a unified framework designed to
learn appearance, semantic, and cross-modality anatomical embeddings.
Specifically, UAE incorporates three key innovations: (1) semantic embedding
learning with prototypical contrastive loss; (2) a fixed-point-based matching
strategy; and (3) an iterative approach for cross-modality embedding learning.
We thoroughly evaluated UAE across intra- and inter-modality tasks, including
one-shot landmark detection, lesion tracking on longitudinal CT scans, and
CT-MRI affine/rigid registration with varying field of view. Our results
suggest that UAE outperforms state-of-the-art methods, offering a robust and
versatile approach for landmark-based medical image analysis tasks. Code and
trained models are available at: https://shorturl.at/bgsB3
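To make the exemplar-based paradigm described above concrete, the sketch below illustrates nearest-neighbor matching between per-voxel embeddings with a simple forward-backward acceptance check, in the spirit of the fixed-point-based matching the abstract mentions. This is a minimal, hypothetical Python/NumPy illustration: the function names, array layout, similarity measure, and tolerance are assumptions for exposition, not the authors' implementation or training procedure.
```python
import numpy as np

def match_landmark(query_xyz, emb_query, emb_target, cycle_tol=2.0):
    """Sketch: match one landmark from a query image to a target image.

    emb_query, emb_target: (D, H, W, C) per-voxel embedding volumes.
    query_xyz: (z, y, x) landmark location in the query image.
    Returns the matched (z, y, x) in the target image, or None if the
    backward match does not land close to the starting point.
    """
    def nn_match(point, src_emb, dst_emb):
        # Cosine similarity between the source voxel's embedding and
        # every voxel embedding in the destination volume.
        v = src_emb[point]
        v = v / (np.linalg.norm(v) + 1e-8)
        flat = dst_emb.reshape(-1, dst_emb.shape[-1])
        flat = flat / (np.linalg.norm(flat, axis=1, keepdims=True) + 1e-8)
        idx = int(np.argmax(flat @ v))
        return np.unravel_index(idx, dst_emb.shape[:3])

    fwd = nn_match(tuple(query_xyz), emb_query, emb_target)  # query -> target
    bwd = nn_match(fwd, emb_target, emb_query)                # target -> query
    # Accept only matches that (approximately) map back onto themselves.
    if np.linalg.norm(np.asarray(bwd) - np.asarray(query_xyz)) <= cycle_tol:
        return fwd
    return None
```
The brute-force search is only tractable on modest or downsampled embedding volumes; the paper's actual matching strategy and its prototypical contrastive training objective are more involved than this toy check.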
Related papers
- Autoregressive Sequence Modeling for 3D Medical Image Representation [48.706230961589924]
We introduce a pioneering method for learning 3D medical image representations through an autoregressive sequence pre-training framework.
Our approach sequences various 3D medical images based on spatial, contrast, and semantic correlations, treating them as interconnected visual tokens within a token sequence.
arXiv Detail & Related papers (2024-09-13T10:19:10Z)
- Anatomy-guided Pathology Segmentation [56.883822515800205]
We develop a generalist segmentation model that combines anatomical and pathological information, aiming to enhance the segmentation accuracy of pathological features.
Our Anatomy-Pathology Exchange (APEx) training utilizes a query-based segmentation transformer which decodes a joint feature space into query-representations for human anatomy.
In doing so, we are able to report the best results across the board on FDG-PET-CT and Chest X-Ray pathology segmentation tasks with a margin of up to 3.3% as compared to strong baseline methods.
arXiv Detail & Related papers (2024-07-08T11:44:15Z)
- DenseSeg: Joint Learning for Semantic Segmentation and Landmark Detection Using Dense Image-to-Shape Representation [1.342749532731493]
We propose a dense image-to-shape representation that enables the joint learning of landmarks and semantic segmentation.
Our method intuitively allows the extraction of arbitrary landmarks due to its representation of anatomical correspondences.
arXiv Detail & Related papers (2024-05-30T06:49:59Z)
- Anatomical Structure-Guided Medical Vision-Language Pre-training [21.68719061251635]
We propose an Anatomical Structure-Guided (ASG) framework for learning medical visual representations.
For anatomical region, we design an automatic anatomical region-sentence alignment paradigm in collaboration with radiologists.
For finding and existence, we regard them as image tags, applying an image-tag recognition decoder to associate image features with their respective tags within each sample.
arXiv Detail & Related papers (2024-03-14T11:29:47Z)
- MAP: Domain Generalization via Meta-Learning on Anatomy-Consistent Pseudo-Modalities [12.194439938007672]
We propose Meta-learning on Anatomy-consistent Pseudo-modalities (MAP).
MAP improves model generalizability by learning structural features.
We evaluate our model on seven public datasets of various retinal imaging modalities.
arXiv Detail & Related papers (2023-09-03T22:56:22Z)
- Region-based Contrastive Pretraining for Medical Image Retrieval with Anatomic Query [56.54255735943497]
We introduce a novel Region-based contrastive pretraining framework for Medical Image Retrieval (RegionMIR).
arXiv Detail & Related papers (2023-05-09T16:46:33Z)
- Unify, Align and Refine: Multi-Level Semantic Alignment for Radiology Report Generation [48.723504098917324]
We propose an Unify, Align and then Refine (UAR) approach to learn multi-level cross-modal alignments.
We introduce three novel modules: Latent Space Unifier, Cross-modal Representation Aligner and Text-to-Image Refiner.
Experiments and analyses on IU-Xray and MIMIC-CXR benchmark datasets demonstrate the superiority of our UAR against varied state-of-the-art methods.
arXiv Detail & Related papers (2023-03-28T12:42:12Z)
- Multi-Granularity Cross-modal Alignment for Generalized Medical Visual Representation Learning [24.215619918283462]
We present a novel framework for learning medical visual representations directly from paired radiology reports.
Our framework harnesses the naturally exhibited semantic correspondences between medical image and radiology reports at three different levels.
arXiv Detail & Related papers (2022-10-12T09:31:39Z)
- Mine yOur owN Anatomy: Revisiting Medical Image Segmentation with Extremely Limited Labels [54.58539616385138]
We introduce a novel semi-supervised 2D medical image segmentation framework termed Mine yOur owN Anatomy (MONA).
First, prior work argues that every pixel matters equally to model training; we observe empirically that this alone is unlikely to define meaningful anatomical features.
Second, we construct a set of objectives that encourage the model to be capable of decomposing medical images into a collection of anatomical features.
arXiv Detail & Related papers (2022-09-27T15:50:31Z)
- Structured Landmark Detection via Topology-Adapting Deep Graph Learning [75.20602712947016]
We present a new topology-adapting deep graph learning approach for accurate anatomical facial and medical landmark detection.
The proposed method constructs graph signals leveraging both local image features and global shape features.
Experiments are conducted on three public facial image datasets (WFLW, 300W, and COFW-68) as well as three real-world X-ray medical datasets (Cephalometric (public), Hand, and Pelvis).
arXiv Detail & Related papers (2020-04-17T11:55:03Z)