Compositionally Equivariant Representation Learning
- URL: http://arxiv.org/abs/2306.07783v2
- Date: Sat, 17 Jun 2023 13:10:40 GMT
- Title: Compositionally Equivariant Representation Learning
- Authors: Xiao Liu, Pedro Sanchez, Spyridon Thermos, Alison Q. O'Neil and
Sotirios A. Tsaftaris
- Abstract summary: Humans can swiftly learn to identify important anatomy in medical images like MRI and CT scans.
This recognition capability easily generalises to new images from different medical facilities and to new tasks in different settings.
We study the utilisation of compositionality in learning more interpretable and generalisable representations for medical image segmentation.
- Score: 22.741376970643973
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning models often need sufficient supervision (i.e. labelled data)
in order to be trained effectively. By contrast, humans can swiftly learn to
identify important anatomy in medical images like MRI and CT scans, with
minimal guidance. This recognition capability easily generalises to new images
from different medical facilities and to new tasks in different settings. This
rapid and generalisable learning ability is largely due to the compositional
structure of image patterns in the human brain, which are not well represented
in current medical models. In this paper, we study the utilisation of
compositionality in learning more interpretable and generalisable
representations for medical image segmentation. Overall, we propose that the
underlying generative factors that are used to generate the medical images
satisfy a compositional equivariance property, where each factor is compositional
(e.g. corresponds to the structures in human anatomy) and also equivariant to
the task. Hence, a good representation that approximates well the ground truth
factor has to be compositionally equivariant. By modelling the compositional
representations with learnable von-Mises-Fisher (vMF) kernels, we explore how
different design and learning biases can be used to make the representations
more compositionally equivariant under un-, weakly-, and semi-supervised
settings. Extensive results show that our methods achieve the best performance
over several strong baselines on the task of semi-supervised domain-generalised
medical image segmentation. Code will be made publicly available upon
acceptance at https://github.com/vios-s.
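To make the vMF-kernel idea concrete, below is a minimal PyTorch sketch of pixel-wise vMF likelihoods computed against a bank of learnable kernel means, in the spirit of the abstract. The class name VMFLikelihood, the fixed concentration kappa, and all tensor shapes are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch (not the authors' code): pixel-wise vMF likelihoods
# over a bank of learnable kernels. Up to normalisation,
# p(z | mu_j) ~ exp(kappa * mu_j^T z) with ||mu_j|| = ||z|| = 1, so a
# softmax over kernels gives a per-pixel compositional assignment map.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VMFLikelihood(nn.Module):
    def __init__(self, feat_dim: int, num_kernels: int, kappa: float = 20.0):
        super().__init__()
        # Illustrative: one mean direction per anatomical factor.
        self.mu = nn.Parameter(torch.randn(num_kernels, feat_dim))
        self.kappa = kappa  # fixed concentration (an assumption here)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, D, H, W) features from any segmentation backbone.
        z = F.normalize(feats, dim=1)                  # unit-norm features
        mu = F.normalize(self.mu, dim=1)               # unit-norm kernels
        # Cosine similarity mu_j^T z at every spatial position:
        logits = torch.einsum("jd,bdhw->bjhw", mu, z)  # (B, J, H, W)
        # Softmax over kernels: how likely each factor is at each pixel.
        return torch.softmax(self.kappa * logits, dim=1)

# Usage: the likelihood maps can feed a segmentation head, and (with a
# decoder) a reconstruction loss lets unlabelled images train mu too.
vmf = VMFLikelihood(feat_dim=64, num_kernels=12)
maps = vmf(torch.randn(2, 64, 32, 32))  # (2, 12, 32, 32)
```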
Related papers
- Autoregressive Sequence Modeling for 3D Medical Image Representation [48.706230961589924]
We introduce a pioneering method for learning 3D medical image representations through an autoregressive sequence pre-training framework.
Our approach sequences various 3D medical images based on spatial, contrast, and semantic correlations, treating them as interconnected visual tokens within a token sequence.
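The summary suggests pre-training by next-token prediction over visual tokens; a generic autoregressive sketch under that reading follows. The codebook size, the tokenizer producing the token ids, and the small transformer are all assumptions, not the paper's actual pipeline.

```python
# Generic sketch of autoregressive pre-training over visual tokens.
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab, dim = 8192, 256                   # assumed visual-token codebook
embed = nn.Embedding(vocab, dim)
encoder = nn.TransformerEncoder(         # small causal stand-in model
    nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True),
    num_layers=4)
head = nn.Linear(dim, vocab)

def ar_loss(tokens: torch.Tensor) -> torch.Tensor:
    # tokens: (B, T) integer ids from some 3D-patch tokenizer (assumed).
    inp, tgt = tokens[:, :-1], tokens[:, 1:]
    T = inp.size(1)
    # Causal mask so each position only attends to earlier tokens.
    causal = torch.triu(torch.full((T, T), float("-inf")), diagonal=1)
    h = encoder(embed(inp), mask=causal)
    # Cross-entropy on predicting each next token in the sequence.
    return F.cross_entropy(head(h).reshape(-1, vocab), tgt.reshape(-1))
```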
arXiv Detail & Related papers (2024-09-13T10:19:10Z)
- MOSMOS: Multi-organ segmentation facilitated by medical report supervision [10.396987980136602]
We propose MOSMOS, a novel pre-training and fine-tuning framework for multi-organ segmentation facilitated by medical report supervision.
Specifically, we first introduce global contrastive learning to align medical image-report pairs in the pre-training stage.
To remedy the granularity discrepancy between image-level report supervision and pixel-level segmentation, we further leverage multi-label recognition to implicitly learn the semantic correspondence between image pixels and organ tags.
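The global contrastive step is described only at a high level; a standard CLIP-style InfoNCE loss of the kind it suggests might look as follows. The paired encoders producing the embeddings and the temperature value are assumptions, not MOSMOS's actual code.

```python
# Generic image-report contrastive loss (CLIP-style InfoNCE); a sketch
# of the "global contrastive learning" step, not MOSMOS's implementation.
import torch
import torch.nn.functional as F

def global_contrastive_loss(img_emb: torch.Tensor,
                            rep_emb: torch.Tensor,
                            temperature: float = 0.07) -> torch.Tensor:
    # img_emb, rep_emb: (B, D) embeddings of paired images and reports;
    # matched pairs sit on the diagonal of the similarity matrix.
    img = F.normalize(img_emb, dim=1)
    rep = F.normalize(rep_emb, dim=1)
    logits = img @ rep.t() / temperature        # (B, B) similarities
    targets = torch.arange(img.size(0), device=img.device)
    # Symmetric cross-entropy: image->report and report->image.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```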
arXiv Detail & Related papers (2024-09-04T03:46:17Z)
- Enhancing Cross-Modal Medical Image Segmentation through Compositionality [0.4194295877935868]
We introduce compositionality as an inductive bias in a cross-modal segmentation network to improve segmentation performance and interpretability.
The proposed network enforces compositionality on the learned representations using learnable von Mises-Fisher kernels.
The experimental results demonstrate enhanced segmentation performance and reduced computational costs on multiple medical datasets.
arXiv Detail & Related papers (2024-08-21T15:57:24Z)
- Unlocking the Power of Spatial and Temporal Information in Medical Multimodal Pre-training [99.2891802841936]
We introduce the Med-ST framework for fine-grained spatial and temporal modeling.
For spatial modeling, Med-ST employs the Mixture of View Expert (MoVE) architecture to integrate different visual features from both frontal and lateral views.
For temporal modeling, we propose a novel cross-modal bidirectional cycle consistency objective by forward mapping classification (FMC) and reverse mapping regression (RMR).
arXiv Detail & Related papers (2024-05-30T03:15:09Z)
- SM2C: Boost the Semi-supervised Segmentation for Medical Image by using Meta Pseudo Labels and Mixed Images [13.971120210536995]
We introduce Scaling-up Mix with Multi-Class (SM2C) to improve the ability to learn semantic features within medical images.
By diversifying the shapes of the segmentation objects and enriching the semantic information within each sample, SM2C improves the model's ability to learn semantic features.
The proposed framework shows significant improvements over state-of-the-art counterparts.
arXiv Detail & Related papers (2024-03-24T04:39:40Z)
- Implicit Anatomical Rendering for Medical Image Segmentation with Stochastic Experts [11.007092387379078]
We propose MORSE, a generic implicit neural rendering framework designed at an anatomical level to assist learning in medical image segmentation.
Our approach is to formulate medical image segmentation as a rendering problem in an end-to-end manner.
Our experiments demonstrate that MORSE can work well with different medical segmentation backbones.
arXiv Detail & Related papers (2023-04-06T16:44:03Z)
- Mine yOur owN Anatomy: Revisiting Medical Image Segmentation with Extremely Limited Labels [54.58539616385138]
We introduce a novel semi-supervised 2D medical image segmentation framework termed Mine yOur owN Anatomy (MONA).
First, prior work argues that every pixel matters equally to model training; we observe empirically that this alone is unlikely to define meaningful anatomical features.
Second, we construct a set of objectives that encourage the model to be capable of decomposing medical images into a collection of anatomical features.
arXiv Detail & Related papers (2022-09-27T15:50:31Z)
- vMFNet: Compositionality Meets Domain-generalised Segmentation [22.741376970643973]
von-Mises-Fisher (vMF) kernels are robust to images collected from different domains.
The vMF likelihoods tell how likely each anatomical part is at each position of the image.
With a reconstruction module, unlabeled data can also be used to learn the vMF kernels and likelihoods, as sketched below.
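As a rough illustration of how a reconstruction module can turn unlabelled images into a training signal for the kernels, here is a minimal sketch; the toy decoder and the L1 loss are assumptions, not vMFNet's exact design.

```python
# Sketch: decode vMF likelihood maps back to the input image so that
# unlabelled scans still provide gradients for the kernels.
import torch
import torch.nn as nn
import torch.nn.functional as F

J = 12                                   # assumed number of vMF kernels
decoder = nn.Sequential(                 # toy decoder, not vMFNet's design
    nn.Conv2d(J, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1))

def reconstruction_loss(image: torch.Tensor,
                        likelihoods: torch.Tensor) -> torch.Tensor:
    # likelihoods: (B, J, H, W) per-pixel vMF likelihood maps, e.g. from
    # a module like the VMFLikelihood sketch earlier in this page.
    recon = decoder(likelihoods)         # (B, 1, H, W) reconstructed scan
    return F.l1_loss(recon, image)       # label-free training signal
```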
arXiv Detail & Related papers (2022-06-29T11:31:23Z)
- Generalized Organ Segmentation by Imitating One-shot Reasoning using Anatomical Correlation [55.1248480381153]
We propose OrganNet, which learns a generalized organ concept from a set of annotated organ classes and then transfers this concept to unseen classes.
We show that OrganNet can effectively resist the wide variations in organ morphology and produce state-of-the-art results in the one-shot segmentation task.
arXiv Detail & Related papers (2021-03-30T13:41:12Z)
- Evaluation of Complexity Measures for Deep Learning Generalization in Medical Image Analysis [77.34726150561087]
PAC-Bayes flatness-based and path norm-based measures produce the most consistent explanation for the combination of models and data.
We also investigate the use of a multi-task classification and segmentation approach for breast images.
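Of the measures mentioned, the path norm is simple enough to sketch: for a bias-free ReLU network it can be computed by squaring every weight and propagating an all-ones input (squaring biases too, as below, is a common approximation). This is a generic illustration, not the paper's evaluation code.

```python
# Sketch of a path-norm computation for a ReLU network: for bias-free
# ReLU nets, running an all-ones input through a copy of the network
# with all weights squared sums the squared weight products over every
# input-output path; the path norm is the square root of that sum.
import copy
import torch
import torch.nn as nn

def path_norm(model: nn.Module, input_shape) -> float:
    squared = copy.deepcopy(model)
    with torch.no_grad():
        for p in squared.parameters():
            p.pow_(2)                    # square every parameter in place
        out = squared(torch.ones(1, *input_shape))
    return out.sum().sqrt().item()

# Usage on a toy network:
print(path_norm(nn.Sequential(nn.Linear(4, 8), nn.ReLU(),
                              nn.Linear(8, 2)), (4,)))
```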
arXiv Detail & Related papers (2021-03-04T20:58:22Z)
- Few-shot Medical Image Segmentation using a Global Correlation Network with Discriminative Embedding [60.89561661441736]
We propose a novel method for few-shot medical image segmentation.
We construct our few-shot image segmentor using a deep convolutional network trained episodically.
We enhance discriminability of deep embedding to encourage clustering of the feature domains of the same class.
arXiv Detail & Related papers (2020-12-10T04:01:07Z)