Learning from Multiple Datasets with Heterogeneous and Partial Labels
for Universal Lesion Detection in CT
- URL: http://arxiv.org/abs/2009.02577v3
- Date: Sun, 3 Jan 2021 18:55:59 GMT
- Title: Learning from Multiple Datasets with Heterogeneous and Partial Labels
for Universal Lesion Detection in CT
- Authors: Ke Yan, Jinzheng Cai, Youjing Zheng, Adam P. Harrison, Dakai Jin,
Youbao Tang, Yuxing Tang, Lingyun Huang, Jing Xiao, Le Lu
- Abstract summary: We build a simple yet effective lesion detection framework named Lesion ENSemble (LENS).
LENS can efficiently learn from multiple heterogeneous lesion datasets in a multi-task fashion.
We train our framework on four public lesion datasets and evaluate it on 800 manually-labeled sub-volumes in DeepLesion.
- Score: 25.351709433029896
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large-scale datasets with high-quality labels are desired for training
accurate deep learning models. However, due to the annotation cost, datasets in
medical imaging are often either partially-labeled or small. For example,
DeepLesion is such a large-scale CT image dataset with lesions of various
types, but it also has many unlabeled lesions (missing annotations). When
training a lesion detector on a partially-labeled dataset, the missing
annotations will generate incorrect negative signals and degrade the
performance. Besides DeepLesion, there are several small single-type datasets,
such as LUNA for lung nodules and LiTS for liver tumors. These datasets have
heterogeneous label scopes, i.e., different lesion types are labeled in
different datasets with other types ignored. In this work, we aim to develop a
universal lesion detection algorithm to detect a variety of lesions. The
problem of heterogeneous and partial labels is tackled. First, we build a
simple yet effective lesion detection framework named Lesion ENSemble (LENS).
LENS can efficiently learn from multiple heterogeneous lesion datasets in a
multi-task fashion and leverage their synergy by proposal fusion. Next, we
propose strategies to mine missing annotations from partially-labeled datasets
by exploiting clinical prior knowledge and cross-dataset knowledge transfer.
Finally, we train our framework on four public lesion datasets and evaluate it
on 800 manually-labeled sub-volumes in DeepLesion. Our method brings a relative
improvement of 49% compared to the current state-of-the-art approach in the
metric of average sensitivity. We have publicly released our manual 3D
annotations of DeepLesion in
https://github.com/viggin/DeepLesion_manual_test_set.
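The abstract notes that unlabeled lesions in a partially-labeled dataset act as incorrect negative signals during training. One common way to limit that effect, shown here as a minimal sketch and not necessarily the strategy LENS uses, is to mark proposals that overlap a mined "suspicious" region as ignored so they are excluded from the loss. The function names, the box format, and the 0.5 IoU thresholds below are assumptions made for illustration only.

```python
from typing import List, Tuple

# Hypothetical box format for this sketch: (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def assign_labels(
    proposals: List[Box],
    annotated: List[Box],
    mined: List[Box],
    pos_thresh: float = 0.5,     # assumed threshold, not from the paper
    ignore_thresh: float = 0.5,  # assumed threshold, not from the paper
) -> List[int]:
    """Label each proposal: 1 = positive, 0 = negative, -1 = ignored.

    Proposals matching an annotated lesion are positives. Proposals that
    only match a mined (suspicious, unannotated) region are ignored, so
    they contribute no negative signal. Everything else is a negative.
    """
    labels = []
    for p in proposals:
        if any(iou(p, g) >= pos_thresh for g in annotated):
            labels.append(1)
        elif any(iou(p, m) >= ignore_thresh for m in mined):
            labels.append(-1)
        else:
            labels.append(0)
    return labels


if __name__ == "__main__":
    proposals = [(10, 10, 50, 50), (100, 100, 140, 140), (200, 200, 220, 220)]
    annotated = [(12, 12, 48, 48)]      # lesion labeled in the dataset
    mined = [(98, 98, 142, 142)]        # suspicious region found by mining
    print(assign_labels(proposals, annotated, mined))
```

In a training loop, the proposals labeled -1 would simply be masked out before the classification loss is computed, so a likely-but-unlabeled lesion is neither rewarded nor penalized.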
Related papers
- Cross-Dataset Adaptation for Instrument Classification in Cataract
Surgery Videos [54.1843419649895]
State-of-the-art models, which perform this task well on a particular dataset, perform poorly when tested on another dataset.
We propose a novel end-to-end Unsupervised Domain Adaptation (UDA) method called the Barlow Adaptor.
In addition, we introduce a novel loss called the Barlow Feature Alignment Loss (BFAL) which aligns features across different domains.
arXiv Detail & Related papers (2023-07-31T18:14:18Z)
- Weakly-supervised positional contrastive learning: application to
cirrhosis classification [45.63061034568991]
Large medical imaging datasets can be cheaply annotated with low-confidence, weak labels.
Access to high-confidence labels, such as histology-based diagnoses, is rare and costly.
We propose an efficient weakly-supervised positional (WSP) contrastive learning strategy.
arXiv Detail & Related papers (2023-07-10T15:02:13Z)
- An End-to-End Framework For Universal Lesion Detection With Missing
Annotations [24.902835211573628]
We present a novel end-to-end framework for mining unlabeled lesions while simultaneously training the detector.
Our framework follows the teacher-student paradigm. High-confidence predictions are combined with partially-labeled ground truth for training the student model.
arXiv Detail & Related papers (2023-03-27T09:16:10Z)
- Transfer learning with weak labels from radiology reports: application
to glioma change detection [0.2010294990327175]
We propose a combined use of weak labels (imprecise, but fast-to-create annotations) and Transfer Learning (TL).
Specifically, we explore inductive TL, where source and target domains are identical, but tasks are different due to a label shift.
We investigate the relationship between model size and TL, comparing a low-capacity VGG with a higher-capacity SEResNeXt.
arXiv Detail & Related papers (2022-10-18T09:15:27Z)
- Pseudo-label refinement using superpixels for semi-supervised brain
tumour segmentation [0.6767885381740952]
Training neural networks using limited annotations is an important problem in the medical domain.
Semi-supervised learning aims to overcome this problem by learning segmentations with very little annotated data.
We propose a framework based on superpixels to improve the accuracy of the pseudo labels.
arXiv Detail & Related papers (2021-10-16T15:17:11Z)
- Label Cleaning Multiple Instance Learning: Refining Coarse Annotations
on Single Whole-Slide Images [83.7047542725469]
Annotating cancerous regions in whole-slide images (WSIs) of pathology samples plays a critical role in clinical diagnosis, biomedical research, and machine learning algorithms development.
We present a method, named Label Cleaning Multiple Instance Learning (LC-MIL), to refine coarse annotations on a single WSI without the need of external training data.
Our experiments on a heterogeneous WSI set with breast cancer lymph node metastasis, liver cancer, and colorectal cancer samples show that LC-MIL significantly refines the coarse annotations, outperforming the state-of-the-art alternatives, even while learning from a single slide.
arXiv Detail & Related papers (2021-09-22T15:06:06Z)
- Weakly-Supervised Cross-Domain Adaptation for Endoscopic Lesions
Segmentation [79.58311369297635]
We propose a new weakly-supervised lesions transfer framework, which can explore transferable domain-invariant knowledge across different datasets.
A Wasserstein quantified transferability framework is developed to highlight wide-range transferable contextual dependencies.
A novel self-supervised pseudo label generator is designed to equally provide confident pseudo pixel labels for both hard-to-transfer and easy-to-transfer target samples.
arXiv Detail & Related papers (2020-12-08T02:26:03Z)
- Towards Robust Partially Supervised Multi-Structure Medical Image
Segmentation on Small-Scale Data [123.03252888189546]
We propose Vicinal Labels Under Uncertainty (VLUU) to bridge the methodological gaps in partially supervised learning (PSL) under data scarcity.
Motivated by multi-task learning and vicinal risk minimization, VLUU transforms the partially supervised problem into a fully supervised problem by generating vicinal labels.
Our research suggests a new research direction in label-efficient deep learning with partial supervision.
arXiv Detail & Related papers (2020-11-28T16:31:00Z)
- ATSO: Asynchronous Teacher-Student Optimization for Semi-Supervised
Medical Image Segmentation [99.90263375737362]
We propose ATSO, an asynchronous version of teacher-student optimization.
ATSO partitions the unlabeled data into two subsets and alternately uses one subset to fine-tune the model and updates the label on the other subset.
We evaluate ATSO on two popular medical image segmentation datasets and show its superior performance in various semi-supervised settings.
arXiv Detail & Related papers (2020-06-24T04:05:12Z)
- Deep Mining External Imperfect Data for Chest X-ray Disease Screening [57.40329813850719]
We argue that incorporating an external CXR dataset leads to imperfect training data, which raises challenges.
We formulate the multi-label disease classification problem as weighted independent binary tasks according to the categories.
Our framework simultaneously models and tackles the domain and label discrepancies, enabling superior knowledge mining ability.
arXiv Detail & Related papers (2020-06-06T06:48:40Z)
- Universal Lesion Detection by Learning from Multiple Heterogeneously
Labeled Datasets [23.471903581482668]
We learn a multi-head multi-task lesion detector using all datasets and generate lesion proposals on DeepLesion.
We discover suspicious but unannotated lesions using knowledge transfer from single-type lesion detectors.
Our method outperforms the current state-of-the-art approach by 29% in the metric of average sensitivity.
arXiv Detail & Related papers (2020-05-28T02:56:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.