Self-Supervised Representation Learning for Detection of ACL Tear Injury
in Knee MR Videos
- URL: http://arxiv.org/abs/2007.07761v3
- Date: Mon, 14 Dec 2020 12:27:43 GMT
- Title: Self-Supervised Representation Learning for Detection of ACL Tear Injury
in Knee MR Videos
- Authors: Siladittya Manna, Saumik Bhattacharya, Umapada Pal
- Abstract summary: We propose a self-supervised learning approach to learn transferable features from MR video clips by forcing the model to learn anatomical features.
To the best of our knowledge, none of the supervised learning models that perform injury classification from MR video provide any explanation for their decisions.
- Score: 18.54362818156725
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The success of deep learning based models for computer vision applications
requires large-scale human-annotated data, which are often expensive to
generate. Self-supervised learning, a subset of unsupervised learning, handles
this problem by learning meaningful features from unlabeled image or video
data. In this paper, we propose a self-supervised learning approach to learn
transferable features from MR video clips by forcing the model to learn
anatomical features. The pretext task models are designed to predict the
correct ordering of the jumbled image patches that the MR video frames are
divided into. To the best of our knowledge, none of the supervised learning
models that perform injury classification from MR video provide any
explanation for their decisions, which makes our work the first of its kind
on MR video data. Experiments on the pretext task show that the proposed
approach enables the model to learn spatial-context-invariant features, which
support reliable and explainable performance in downstream tasks such as
classification of Anterior Cruciate Ligament (ACL) tear injury from knee MRI.
The efficacy of the novel Convolutional Neural Network proposed in this paper
is reflected in the experimental results obtained in the downstream task.
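A minimal sketch of the kind of patch-ordering pretext task described in the abstract is given below. The grid size, permutation set, network depth, and frame size are illustrative assumptions, not the authors' exact configuration.

```python
# Sketch of a jigsaw-style patch-ordering pretext task (illustrative only).
import itertools
import random
import torch
import torch.nn as nn

GRID = 2                                   # divide each frame into a 2x2 grid
PERMS = list(itertools.permutations(range(GRID * GRID)))  # 24 orderings

class PatchOrderNet(nn.Module):
    """Encodes each patch independently, then classifies the permutation."""
    def __init__(self, n_perms=len(PERMS)):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.head = nn.Linear(32 * GRID * GRID, n_perms)

    def forward(self, patches):            # patches: (B, GRID*GRID, 1, H, W)
        feats = [self.encoder(patches[:, i]) for i in range(patches.shape[1])]
        return self.head(torch.cat(feats, dim=1))

def jumble(frame):
    """Split a (1, H, W) frame into grid patches and shuffle them."""
    _, h, w = frame.shape
    ph, pw = h // GRID, w // GRID
    patches = [frame[:, i*ph:(i+1)*ph, j*pw:(j+1)*pw]
               for i in range(GRID) for j in range(GRID)]
    label = random.randrange(len(PERMS))   # permutation index = pretext label
    shuffled = torch.stack([patches[k] for k in PERMS[label]])
    return shuffled, label

# One illustrative training step on a dummy MR frame.
model, loss_fn = PatchOrderNet(), nn.CrossEntropyLoss()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = jumble(torch.randn(1, 224, 224))
logits = model(x.unsqueeze(0))             # add batch dimension
loss = loss_fn(logits, torch.tensor([y]))
loss.backward()
opt.step()
```

The encoder trained this way is what would later be reused for the downstream ACL-tear classification task.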
Related papers
- Learning Object-Centric Representation via Reverse Hierarchy Guidance [73.05170419085796]
Object-Centric Learning (OCL) seeks to enable Neural Networks to identify individual objects in visual scenes.
RHGNet introduces a top-down pathway that operates differently during training and inference.
Our model achieves SOTA performance on several commonly used datasets.
arXiv Detail & Related papers (2024-05-17T07:48:27Z)
- CONVIQT: Contrastive Video Quality Estimator [63.749184706461826]
Perceptual video quality assessment (VQA) is an integral component of many streaming and video sharing platforms.
Here we consider the problem of learning perceptually relevant video quality representations in a self-supervised manner.
Our results indicate that compelling representations with perceptual bearing can be obtained using self-supervised learning.
arXiv Detail & Related papers (2022-06-29T15:22:01Z)
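The CONVIQT entry above relies on contrastive self-supervision. Below is a generic InfoNCE-style contrastive loss between two augmented views of the same clip, shown only to illustrate the idea; it is not CONVIQT's actual objective, and the encoder outputs and temperature are placeholder assumptions.

```python
# Generic InfoNCE contrastive loss between two views (illustrative only).
# Embeddings z1[i] and z2[i] come from two augmented versions of clip i.
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.1):
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature      # (B, B) similarity matrix
    targets = torch.arange(z1.size(0))      # positives lie on the diagonal
    return F.cross_entropy(logits, targets)

# Example: 8 clips, 128-dimensional embeddings from some video encoder.
loss = info_nce(torch.randn(8, 128), torch.randn(8, 128))
```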
- Multi-task UNet: Jointly Boosting Saliency Prediction and Disease Classification on Chest X-ray Images [3.8637285238278434]
This paper describes a novel deep learning model for visual saliency prediction on chest X-ray (CXR) images.
To cope with data deficiency, we exploit multi-task learning and tackle disease classification on CXR simultaneously.
Experiments show that our proposed deep learning model with the new learning scheme outperforms existing methods dedicated to either saliency prediction or image classification.
arXiv Detail & Related papers (2022-02-15T01:12:42Z)
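The Multi-task UNet entry above combines saliency prediction and disease classification with a shared backbone. The sketch below shows a generic two-head design of that kind; the layer sizes and loss weighting are assumptions for illustration, not the paper's architecture.

```python
# Hypothetical two-head multi-task network: a shared encoder feeding a
# saliency-map decoder and an image-level classification head.
import torch
import torch.nn as nn

class TwoHeadNet(nn.Module):
    def __init__(self, n_classes=3):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
        self.saliency_head = nn.Sequential(   # dense, per-pixel output
            nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 2, stride=2), nn.Sigmoid())
        self.cls_head = nn.Sequential(        # global, image-level output
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, n_classes))

    def forward(self, x):
        feats = self.encoder(x)
        return self.saliency_head(feats), self.cls_head(feats)

# Joint training step: weighted sum of the two task losses.
model = TwoHeadNet()
x = torch.randn(2, 1, 64, 64)                 # dummy CXR batch
sal_target, cls_target = torch.rand(2, 1, 64, 64), torch.tensor([0, 2])
sal_pred, cls_pred = model(x)
loss = nn.functional.binary_cross_entropy(sal_pred, sal_target) \
     + 0.5 * nn.functional.cross_entropy(cls_pred, cls_target)
loss.backward()
```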
- Machine Learning Method for Functional Assessment of Retinal Models [5.396946042201311]
We introduce the functional assessment (FA) of retinal models, i.e., a way of evaluating their performance.
We present a machine learning method for FA: we feed traditional machine learning classifiers with RGC responses generated by retinal models.
We show that differences in the structure of datasets result in largely divergent performance of the retinal model.
arXiv Detail & Related papers (2022-02-05T00:35:38Z)
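The functional-assessment entry above evaluates a retinal model by training conventional classifiers on the RGC responses it generates. The sketch below illustrates that recipe with synthetic responses and a scikit-learn classifier; the data, classifier choice, and accuracy-as-score convention are assumptions, not the paper's protocol.

```python
# Illustrative functional assessment: fit a standard classifier on
# (simulated) retinal ganglion cell responses and use its test accuracy as
# a proxy score for the retinal model.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_stimuli, n_cells = 600, 50
labels = rng.integers(0, 2, size=n_stimuli)          # stimulus classes
# Simulated RGC firing rates: class-dependent mean plus noise.
responses = rng.normal(loc=labels[:, None] * 0.5, scale=1.0,
                       size=(n_stimuli, n_cells))

X_tr, X_te, y_tr, y_te = train_test_split(responses, labels, test_size=0.3,
                                          random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("functional assessment score (accuracy):", clf.score(X_te, y_te))
```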
- Learning from Mistakes based on Class Weighting with Application to Neural Architecture Search [12.317568257671427]
We propose a simple and effective multi-level optimization framework called learning from mistakes (LFM).
The primary objective is to train a model to perform effectively on target tasks by using a re-weighting technique to prevent similar mistakes in the future.
In this formulation, we learn the class weights by minimizing the validation loss of the model, and we re-train the model on real data together with synthetic data from an image generator, weighted by class-wise performance.
arXiv Detail & Related papers (2021-12-01T04:56:49Z)
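The LFM entry above re-weights classes according to how the model fails on a validation set. The sketch below shows only that re-weighting step in a heavily simplified, single-level form; the synthetic-data generator and the full bi-level optimization of LFM are omitted, and all names are placeholders.

```python
# Simplified class re-weighting "from mistakes": classes with higher
# validation error get larger loss weights on the next training pass.
import torch
import torch.nn as nn

def class_weights_from_mistakes(model, val_x, val_y, n_classes):
    """Weight each class by its validation error rate (plus a floor of 1)."""
    with torch.no_grad():
        preds = model(val_x).argmax(dim=1)
    weights = torch.ones(n_classes)
    for c in range(n_classes):
        mask = val_y == c
        if mask.any():
            weights[c] += (preds[mask] != c).float().mean()
    return weights

# Toy setup: a linear classifier on 10-d features, 3 classes.
model = nn.Linear(10, 3)
val_x, val_y = torch.randn(100, 10), torch.randint(0, 3, (100,))
w = class_weights_from_mistakes(model, val_x, val_y, n_classes=3)

# Re-train with the mistake-driven class weights.
criterion = nn.CrossEntropyLoss(weight=w)
train_x, train_y = torch.randn(256, 10), torch.randint(0, 3, (256,))
loss = criterion(model(train_x), train_y)
loss.backward()
```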
- SSLM: Self-Supervised Learning for Medical Diagnosis from MR Video [19.5917119072985]
In this paper, we propose a self-supervised learning approach to learn the spatial anatomical representations from magnetic resonance (MR) video clips.
The proposed pretext model learns meaningful spatial context-invariant representations.
Different experiments show that the features learnt by the pretext model provide explainable performance in the downstream task.
arXiv Detail & Related papers (2021-04-21T12:01:49Z)
- Neuro-Symbolic Representations for Video Captioning: A Case for Leveraging Inductive Biases for Vision and Language [148.0843278195794]
We propose a new model architecture for learning multi-modal neuro-symbolic representations for video captioning.
Our approach uses a dictionary learning-based method of learning relations between videos and their paired text descriptions.
arXiv Detail & Related papers (2020-11-18T20:21:19Z)
- Memory-augmented Dense Predictive Coding for Video Representation Learning [103.69904379356413]
We propose a new architecture and learning framework Memory-augmented Predictive Coding (MemDPC) for the task.
We investigate visual-only self-supervised video representation learning from RGB frames, or from unsupervised optical flow, or both.
In all cases, we demonstrate state-of-the-art or comparable performance relative to other approaches, with orders of magnitude less training data.
arXiv Detail & Related papers (2020-08-03T17:57:01Z)
- Self-supervised Representation Learning for Ultrasound Video [18.515314344284445]
We propose a self-supervised learning approach to learn meaningful and transferable representations from medical imaging video.
We force the model to address anatomy-aware tasks with free supervision from the data itself.
Experiments on fetal ultrasound video show that the proposed approach can effectively learn meaningful and strong representations.
arXiv Detail & Related papers (2020-02-28T23:00:26Z)
- Object Relational Graph with Teacher-Recommended Learning for Video Captioning [92.48299156867664]
We propose a complete video captioning system including both a novel model and an effective training strategy.
Specifically, we propose an object relational graph (ORG) based encoder, which captures more detailed interaction features to enrich visual representation.
Meanwhile, we design a teacher-recommended learning (TRL) method to make full use of the successful external language model (ELM) to integrate the abundant linguistic knowledge into the caption model.
arXiv Detail & Related papers (2020-02-26T15:34:52Z)
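The teacher-recommended learning described in the entry above injects soft targets from an external language model into the caption model's training. Below is a generic soft-target distillation loss of that flavor; it is a simplified stand-in, not the paper's TRL formulation, and the vocabulary size, temperature, and mixing weight are arbitrary.

```python
# Generic soft-target distillation: the caption model learns both from the
# ground-truth next word and from the word distribution "recommended" by an
# external language model (teacher).
import torch
import torch.nn.functional as F

def trl_style_loss(student_logits, teacher_logits, gt_tokens,
                   temperature=2.0, alpha=0.5):
    # Hard loss against ground-truth caption tokens.
    ce = F.cross_entropy(student_logits, gt_tokens)
    # Soft loss against the teacher's word distribution.
    kd = F.kl_div(F.log_softmax(student_logits / temperature, dim=1),
                  F.softmax(teacher_logits / temperature, dim=1),
                  reduction="batchmean") * temperature ** 2
    return alpha * ce + (1 - alpha) * kd

# Example: batch of 4 decoding steps over a 1000-word vocabulary.
vocab = 1000
loss = trl_style_loss(torch.randn(4, vocab), torch.randn(4, vocab),
                      torch.randint(0, vocab, (4,)))
```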
- Modality Compensation Network: Cross-Modal Adaptation for Action Recognition [77.24983234113957]
We propose a Modality Compensation Network (MCN) to explore the relationships of different modalities.
Our model bridges data from source and auxiliary modalities by a modality adaptation block to achieve adaptive representation learning.
Experimental results reveal that MCN outperforms state-of-the-art approaches on four widely-used action recognition benchmarks.
arXiv Detail & Related papers (2020-01-31T04:51:55Z)