Comparing to Learn: Surpassing ImageNet Pretraining on Radiographs By Comparing Image Representations
- URL: http://arxiv.org/abs/2007.07423v3
- Date: Wed, 22 Jul 2020 03:00:56 GMT
- Title: Comparing to Learn: Surpassing ImageNet Pretraining on Radiographs By Comparing Image Representations
- Authors: Hong-Yu Zhou, Shuang Yu, Cheng Bian, Yifan Hu, Kai Ma, and Yefeng Zheng
- Abstract summary: We propose a new pretraining method which learns from 700k radiographs given no manual annotations.
We call our method Comparing to Learn (C2L) because it learns robust features by comparing different image representations.
Experimental results on radiographs show that C2L significantly outperforms ImageNet pretraining and previous state-of-the-art approaches.
- Score: 39.08296644280442
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the deep learning era, pretrained models play an important role in medical image analysis, and ImageNet pretraining has been widely adopted as the standard approach. However, there is an obvious domain gap between natural images and medical images. To bridge this gap, we propose a new pretraining method which learns from 700k radiographs given no manual annotations. We call our method Comparing to Learn (C2L) because it learns robust features by comparing different image representations. To verify the effectiveness of C2L, we conduct comprehensive ablation studies and evaluate it on different tasks and datasets. Experimental results on radiographs show that C2L significantly outperforms ImageNet pretraining and previous state-of-the-art approaches. Code and models are available.
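The core mechanism is contrastive: two views of the same radiograph should map to similar representations, while views of different radiographs should not. The sketch below illustrates that comparing-representations idea with a generic InfoNCE loss, a momentum (teacher) encoder, and a queue of negatives; the function names, the queue, and all hyperparameter values are illustrative assumptions, not the paper's exact C2L objective.

```python
# A minimal sketch of the "compare image representations" idea behind C2L,
# using a generic InfoNCE loss with a momentum (teacher) encoder. Names and
# values are illustrative assumptions, not the paper's exact formulation.
import torch
import torch.nn.functional as F

def info_nce_loss(q, k, queue, temperature=0.2):
    """q: student features (B, D); k: teacher features for the same images
    under a different augmentation (B, D); queue: stored, already
    L2-normalized negative features (K, D)."""
    q = F.normalize(q, dim=1)
    k = F.normalize(k, dim=1)
    l_pos = (q * k).sum(dim=1, keepdim=True)       # (B, 1) positive logits
    l_neg = q @ queue.t()                          # (B, K) negative logits
    logits = torch.cat([l_pos, l_neg], dim=1) / temperature
    labels = torch.zeros(q.size(0), dtype=torch.long, device=q.device)
    return F.cross_entropy(logits, labels)         # positive pair is class 0

@torch.no_grad()
def momentum_update(student, teacher, m=0.999):
    """Exponential moving average update of the teacher encoder."""
    for p_s, p_t in zip(student.parameters(), teacher.parameters()):
        p_t.data.mul_(m).add_(p_s.data, alpha=1.0 - m)
```

In a setup like this, the queue is refreshed with each batch's teacher features, so the objective needs no manual labels, consistent with pretraining on 700k unlabeled radiographs.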
Related papers
- CROCODILE: Causality aids RObustness via COntrastive DIsentangled LEarning [8.975676404678374]
We introduce our CROCODILE framework, showing how tools from causality can foster a model's robustness to domain shift.
We apply our method to multi-label lung disease classification from CXRs, utilizing over 750,000 images.
arXiv Detail & Related papers (2024-08-09T09:08:06Z)
- Source Matters: Source Dataset Impact on Model Robustness in Medical Imaging [14.250975981451914]
We show that ImageNet and RadImageNet achieve comparable classification performance.
However, ImageNet is much more prone to overfitting to confounders.
We recommend that researchers using ImageNet-pretrained models reexamine their models.
arXiv Detail & Related papers (2024-03-07T13:36:15Z)
- CricaVPR: Cross-image Correlation-aware Representation Learning for Visual Place Recognition [73.51329037954866]
We propose a robust global representation method with cross-image correlation awareness for visual place recognition.
Our method uses the attention mechanism to correlate multiple images within a batch.
Our method outperforms state-of-the-art methods by a large margin with significantly less training time.
arXiv Detail & Related papers (2024-02-29T15:05:11Z)
- Annotation Cost Efficient Active Learning for Content Based Image Retrieval [1.6624384368855525]
We present an annotation-cost-efficient active learning (AL) method, denoted ANNEAL.
The proposed method aims to iteratively enrich the training set by annotating the most informative image pairs as similar or dissimilar.
The code of ANNEAL is publicly available at https://git.tu-berlin.de/rsim/ANNEAL.
arXiv Detail & Related papers (2023-06-20T15:33:24Z)
- Performance of GAN-based augmentation for deep learning COVID-19 image classification [57.1795052451257]
The biggest challenge in the application of deep learning to the medical domain is the availability of training data.
Data augmentation is a typical methodology used in machine learning when confronted with a limited data set.
In this work, a StyleGAN2-ADA generative adversarial network is trained on a limited COVID-19 chest X-ray image set.
arXiv Detail & Related papers (2023-04-18T15:39:58Z)
- Revisiting Hidden Representations in Transfer Learning for Medical Imaging [2.4545492329339815]
We compare ImageNet and RadImageNet on seven medical classification tasks.
Our results indicate that, contrary to intuition, ImageNet and RadImageNet may converge to distinct intermediate representations.
Our findings show that the similarity between networks before and after fine-tuning does not correlate with performance gains.
arXiv Detail & Related papers (2023-02-16T13:04:59Z)
- CLIP-ViP: Adapting Pre-trained Image-Text Model to Video-Language Representation Alignment [146.3128011522151]
We propose an Omni Crossmodal Learning method equipped with a Video Proxy mechanism built on CLIP, namely CLIP-ViP.
Our approach improves the performance of CLIP on video-text retrieval by a large margin.
Our model also achieves SOTA results on a variety of datasets, including MSR-VTT, DiDeMo, LSMDC, and ActivityNet.
arXiv Detail & Related papers (2022-09-14T05:47:02Z)
- VL-LTR: Learning Class-wise Visual-Linguistic Representation for Long-Tailed Visual Recognition [61.75391989107558]
We present a visual-linguistic long-tailed recognition framework, termed VL-LTR.
Our method can learn visual representation from images and corresponding linguistic representation from noisy class-level text descriptions.
Notably, our method achieves 77.2% overall accuracy on ImageNet-LT, which significantly outperforms the previous best method by over 17 points.
arXiv Detail & Related papers (2021-11-26T16:24:03Z)
- A Multi-Stage Attentive Transfer Learning Framework for Improving COVID-19 Diagnosis [49.3704402041314]
We propose a multi-stage attentive transfer learning framework for improving COVID-19 diagnosis.
Our proposed framework consists of three stages that train accurate diagnosis models by learning from multiple source tasks and data from different domains.
Importantly, we propose a novel self-supervised learning method to learn multi-scale representations for lung CT images.
arXiv Detail & Related papers (2021-01-14T01:39:19Z)
- Contrastive Learning of Medical Visual Representations from Paired Images and Text [38.91117443316013]
We propose ConVIRT, an unsupervised strategy to learn medical visual representations by exploiting naturally occurring descriptive paired text.
Our method pretrains medical image encoders with the paired text data via a bidirectional contrastive objective between the two modalities; it is domain-agnostic and requires no additional expert input.
arXiv Detail & Related papers (2020-10-02T02:10:18Z)
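For intuition, a bidirectional image-text contrastive objective of this kind can be sketched as a symmetric InfoNCE over paired embeddings in a shared space; the function name, shapes, and temperature below are illustrative assumptions, not ConVIRT's exact configuration.

```python
# A minimal sketch of a symmetric (bidirectional) image-text contrastive
# objective, in the spirit of ConVIRT. Names, shapes, and the temperature
# are illustrative assumptions.
import torch
import torch.nn.functional as F

def bidirectional_contrastive_loss(img_emb, txt_emb, temperature=0.1):
    """img_emb, txt_emb: paired embeddings of shape (B, D); row i of each
    tensor comes from the same image-report pair."""
    img = F.normalize(img_emb, dim=1)
    txt = F.normalize(txt_emb, dim=1)
    logits = img @ txt.t() / temperature              # (B, B) similarities
    targets = torch.arange(img.size(0), device=img.device)
    loss_i2t = F.cross_entropy(logits, targets)       # image -> its report
    loss_t2i = F.cross_entropy(logits.t(), targets)   # report -> its image
    return 0.5 * (loss_i2t + loss_t2i)
```

Each direction treats the other modality's matched embedding as the positive and the rest of the batch as negatives, which is why no expert annotation is needed beyond the naturally occurring image-report pairing.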