Outlier Guided Optimization of Abdominal Segmentation
- URL: http://arxiv.org/abs/2002.04098v1
- Date: Mon, 10 Feb 2020 21:41:52 GMT
- Title: Outlier Guided Optimization of Abdominal Segmentation
- Authors: Yuchen Xu, Olivia Tang, Yucheng Tang, Ho Hin Lee, Yunqiang Chen,
Dashan Gao, Shizhong Han, Riqiang Gao, Michael R. Savona, Richard G.
Abramson, Yuankai Huo, Bennett A. Landman
- Abstract summary: We build on a pre-trained 3D U-Net model for abdominal multi-organ segmentation.
We augment the dataset either with outlier data (e.g., exemplars for which the baseline algorithm failed) or inliers (e.g., exemplars for which the baseline algorithm worked).
We find that the marginal value of adding outliers is higher than that of adding inliers.
- Score: 7.036733782879497
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Abdominal multi-organ segmentation of computed tomography (CT) images has
been the subject of extensive research interest. It presents a substantial
challenge in medical image processing, as the shape and distribution of
abdominal organs can vary greatly among the population and within an individual
over time. While continuous integration of novel datasets into the training set
provides potential for better segmentation performance, collection of data at
scale is not only costly, but also impractical in some contexts. Moreover, it
remains unclear what marginal value additional data have to offer. Herein, we
propose a single-pass active learning method through human quality assurance
(QA). We built on a pre-trained 3D U-Net model for abdominal multi-organ
segmentation and augmented the dataset either with outlier data (e.g.,
exemplars for which the baseline algorithm failed) or inliers (e.g., exemplars
for which the baseline algorithm worked). The new models were trained using the
augmented datasets with 5-fold cross-validation (for outlier data) and withheld
outlier samples (for inlier data). Manual labeling of outliers increased Dice
scores on outlier samples by 0.130, compared to an increase of 0.067 with
inliers (p<0.001, two-tailed paired t-test). By adding 5 to 37 inliers or outliers to
training, we find that the marginal value of adding outliers is higher than
that of adding inliers. In summary, improvement on single-organ performance was
obtained without diminishing multi-organ performance or significantly
increasing training time. Hence, identification and correction of baseline
failure cases present an effective and efficient method of selecting training
data to improve algorithm performance.
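The selection-and-comparison loop described above is easy to sketch. The following is a minimal, hypothetical illustration, not the authors' code: it assumes per-scan mean Dice scores from human QA are available as a dictionary, ranks scans to pick outliers (worst cases) and inliers (best cases) for relabeling, and applies the same kind of two-tailed paired t-test used in the abstract; all names and values are invented.
```python
# Hypothetical sketch of the QA-driven split (not the authors' code):
# rank scans by baseline Dice from human QA, take the worst as outliers
# and the best as inliers, then compare per-scan Dice gains.
import numpy as np
from scipy.stats import ttest_rel

def split_by_qa(dice_per_scan, n):
    """Return (outlier_ids, inlier_ids): the n worst- and n best-scoring scans."""
    ranked = sorted(dice_per_scan, key=dice_per_scan.get)
    return ranked[:n], ranked[-n:]

# Invented per-scan mean Dice of the baseline 3D U-Net from QA.
qa_dice = {"scan_01": 0.31, "scan_02": 0.88, "scan_03": 0.12,
           "scan_04": 0.79, "scan_05": 0.55, "scan_06": 0.92}
outliers, inliers = split_by_qa(qa_dice, n=2)

# Two-tailed paired t-test on per-scan Dice improvements (values invented).
gain_with_outliers = np.array([0.15, 0.12, 0.14, 0.11])
gain_with_inliers = np.array([0.07, 0.06, 0.08, 0.05])
t_stat, p_value = ttest_rel(gain_with_outliers, gain_with_inliers)
```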
Related papers
- Fine-tuning can Help Detect Pretraining Data from Large Language Models [7.7209640786782385]
Current methods differentiate members and non-members by designing scoring functions, like Perplexity and Min-k%.
We introduce a novel and effective method termed Fine-tuned Score Deviation (FSD), which improves the performance of current scoring functions for pretraining data detection.
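The deviation idea can be sketched in a few lines. This is a rough illustration only: the fine-tuned model path is a placeholder, and the paper applies FSD on top of several scoring functions, not just plain perplexity.
```python
# Rough sketch of the score-deviation idea (illustrative; model names are
# placeholders and the paper combines FSD with several scoring functions).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

@torch.no_grad()
def perplexity(model, tok, text):
    enc = tok(text, return_tensors="pt")
    return torch.exp(model(input_ids=enc.input_ids,
                           labels=enc.input_ids).loss).item()

tok = AutoTokenizer.from_pretrained("gpt2")
base = AutoModelForCausalLM.from_pretrained("gpt2")
# Placeholder path: a copy of the base model fine-tuned on a small set of
# known non-member texts, per the FSD recipe.
tuned = AutoModelForCausalLM.from_pretrained("./gpt2-finetuned-nonmembers")

text = "candidate pretraining passage"
deviation = perplexity(base, tok, text) - perplexity(tuned, tok, text)
# Intuition: non-members shift more after fine-tuning, so a large drop in
# perplexity suggests the text was likely NOT in the pretraining data.
```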
arXiv Detail & Related papers (2024-10-09T15:36:42Z)
- Noisy Self-Training with Synthetic Queries for Dense Retrieval [49.49928764695172]
We introduce a novel noisy self-training framework combined with synthetic queries.
Experimental results show that our method consistently improves over existing methods.
Our method is data-efficient and outperforms competitive baselines.
arXiv Detail & Related papers (2023-11-27T06:19:50Z)
- Enhancing Sentiment Analysis Results through Outlier Detection Optimization [0.5439020425819]
This study investigates the potential of identifying and addressing outliers in text data with subjective labels.
We utilize the Deep SVDD algorithm, a one-class classification method, to detect outliers in nine text-based emotion and sentiment analysis datasets.
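As an illustration of the one-class objective (a sketch, not the study's implementation), Deep SVDD trains an encoder so that normal points map close to a fixed center and uses the distance to that center as the outlier score:
```python
# Minimal Deep SVDD-style objective (illustrative): pull embeddings toward
# a fixed center c; distance from c serves as the outlier score.
import torch
import torch.nn as nn

# Bias terms are omitted, as in the original Deep SVDD, to avoid a
# trivial collapsed solution.
encoder = nn.Sequential(nn.Linear(768, 128, bias=False), nn.ReLU(),
                        nn.Linear(128, 32, bias=False))
opt = torch.optim.Adam(encoder.parameters(), lr=1e-4)

x = torch.randn(256, 768)            # stand-in for text embeddings
with torch.no_grad():
    c = encoder(x).mean(dim=0)       # center = mean of initial embeddings

for _ in range(100):
    loss = ((encoder(x) - c) ** 2).sum(dim=1).mean()
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():
    scores = ((encoder(x) - c) ** 2).sum(dim=1)   # large score => likely outlier
```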
arXiv Detail & Related papers (2023-11-25T18:20:43Z)
- Efficient Grammatical Error Correction Via Multi-Task Training and Optimized Training Schedule [55.08778142798106]
We propose auxiliary tasks that exploit the alignment between the original and corrected sentences.
We formulate each task as a sequence-to-sequence problem and perform multi-task training.
We find that the order of datasets used for training and even individual instances within a dataset may have important effects on the final performance.
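Multi-task seq2seq training of this kind is commonly implemented with task prefixes on the source side; a generic sketch follows (the paper's actual auxiliary tasks and training schedule differ in detail, and the "tag:" task here is hypothetical):
```python
# Generic prefix-based multi-task seq2seq training step (illustrative;
# the auxiliary "tag:" task and its labels are hypothetical).
from transformers import T5ForConditionalGeneration, T5Tokenizer

tok = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

batch = [
    ("gec: She go to school yesterday .", "She went to school yesterday ."),
    ("tag: She go to school yesterday .", "KEEP REPLACE KEEP KEEP KEEP KEEP"),
]
for src, tgt in batch:                 # tasks share one model and one loss
    enc = tok(src, return_tensors="pt")
    labels = tok(tgt, return_tensors="pt").input_ids
    loss = model(**enc, labels=labels).loss
    loss.backward()
```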
arXiv Detail & Related papers (2023-11-20T14:50:12Z)
- KAKURENBO: Adaptively Hiding Samples in Deep Neural Network Training [2.8804804517897935]
We propose a method for hiding the least-important samples during the training of deep neural networks.
We adaptively find samples to exclude in a given epoch based on their contribution to the overall learning process.
Our method can reduce total training time by up to 22% while impacting accuracy by only 0.4% compared to the baseline.
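A loss-based proxy for this idea is easy to sketch; note that the paper's importance measure is richer than a single loss value, so the version below, which hides the lowest-loss samples each epoch, is only an illustration.
```python
# Simplified per-epoch sample hiding (illustrative): score each sample by
# its current loss and hide the lowest-loss fraction for this epoch.
import torch
from torch.utils.data import DataLoader, SubsetRandomSampler

def visible_indices(model, dataset, hide_fraction=0.2):
    loss_fn = torch.nn.CrossEntropyLoss(reduction="none")  # per-sample loss
    model.eval()
    losses = []
    with torch.no_grad():
        for x, y in DataLoader(dataset, batch_size=256):
            losses.append(loss_fn(model(x), y))
    losses = torch.cat(losses)
    n_keep = len(losses) - int(hide_fraction * len(losses))
    return torch.argsort(losses, descending=True)[:n_keep].tolist()

# Each epoch, train only on the visible subset:
# sampler = SubsetRandomSampler(visible_indices(model, dataset))
# loader = DataLoader(dataset, batch_size=64, sampler=sampler)
```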
arXiv Detail & Related papers (2023-10-16T06:19:29Z)
- Semantically Redundant Training Data Removal and Deep Model Classification Performance: A Study with Chest X-rays [5.454938535500864]
We propose an entropy-based sample scoring approach to identify and remove semantically redundant training data.
We demonstrate using the publicly available NIH chest X-ray dataset that the model trained on the resulting informative subset of training data significantly outperforms the model trained on the full training set.
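The exact scoring function is defined in the paper; purely as an assumed stand-in for illustration, a simple Shannon-entropy score over pixel intensities can flag low-information images as removal candidates:
```python
# Illustrative entropy score over pixel intensities (an assumed stand-in
# for the paper's scoring); low-entropy images are candidates for removal.
import numpy as np

def shannon_entropy(img, bins=256):
    hist, _ = np.histogram(img, bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

images = [np.random.rand(224, 224) for _ in range(100)]  # stand-in X-rays
scores = np.array([shannon_entropy(im) for im in images])
keep = np.argsort(scores)[::-1][: int(0.7 * len(images))]  # most informative
```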
arXiv Detail & Related papers (2023-09-18T13:56:34Z)
- Learning from Partially Overlapping Labels: Image Segmentation under Annotation Shift [68.6874404805223]
We propose several strategies for learning from partially overlapping labels in the context of abdominal organ segmentation.
We find that combining a semi-supervised approach with an adaptive cross entropy loss can successfully exploit heterogeneously annotated data.
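One plausible form of such an adaptive loss, sketched here under assumptions rather than as the paper's exact formulation, restricts the softmax to the classes that are actually annotated in a given sample's source dataset:
```python
# Sketch of a cross-entropy restricted to annotated classes (illustrative).
import torch
import torch.nn.functional as F

def partial_ce(logits, target, annotated):
    """logits: (N, C, H, W); target: (N, H, W) with labels drawn only from
    annotated classes; annotated: bool mask of shape (C,)."""
    masked = logits.masked_fill(~annotated.view(1, -1, 1, 1), float("-inf"))
    return F.cross_entropy(masked, target)   # softmax over annotated classes

logits = torch.randn(2, 5, 8, 8, requires_grad=True)   # 5 organ classes
target = torch.randint(0, 3, (2, 8, 8))                # only classes 0-2 labeled
annotated = torch.tensor([True, True, True, False, False])
partial_ce(logits, target, annotated).backward()
```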
arXiv Detail & Related papers (2021-07-13T09:22:24Z)
- Hierarchical Self-Supervised Learning for Medical Image Segmentation Based on Multi-Domain Data Aggregation [23.616336382437275]
We propose Hierarchical Self-Supervised Learning (HSSL) for medical image segmentation.
We first aggregate a dataset from several medical challenges, then pre-train the network in a self-supervised manner, and finally fine-tune on labeled data.
Compared to learning from scratch, our new method yields better performance on various tasks.
arXiv Detail & Related papers (2021-07-10T18:17:57Z)
- Uncertainty-aware Self-training for Text Classification with Few Labels [54.13279574908808]
We study self-training as one of the earliest semi-supervised learning approaches to reduce the annotation bottleneck.
We propose an approach to improve self-training by incorporating uncertainty estimates of the underlying neural network.
We show that our methods, leveraging only 20-30 labeled samples per class per task for training and validation, can perform within 3% of fully supervised pre-trained language models.
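A common way to realize uncertainty estimates for pseudo-label selection is Monte Carlo dropout; the sketch below illustrates that mechanism only, as the paper's acquisition and sample weighting are more involved.
```python
# Illustrative MC-dropout uncertainty for pseudo-label selection.
import torch

def mc_dropout_predict(model, x, passes=10):
    model.train()                       # keep dropout active at inference
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(x), dim=-1)
                             for _ in range(passes)])
    mean = probs.mean(dim=0)            # (N, C) averaged class probabilities
    entropy = -(mean * mean.clamp_min(1e-12).log()).sum(dim=-1)
    return mean.argmax(dim=-1), entropy # pseudo-labels and their uncertainty

# pseudo, unc = mc_dropout_predict(classifier, unlabeled_batch)
# keep = unc < unc.quantile(0.5)        # hypothetical selection threshold
```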
arXiv Detail & Related papers (2020-06-27T08:13:58Z)
- Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
To keep training on the enlarged dataset tractable, we propose to apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
- Automatic Data Augmentation via Deep Reinforcement Learning for Effective Kidney Tumor Segmentation [57.78765460295249]
We develop a novel automatic learning-based data augmentation method for medical image segmentation.
In our method, we innovatively combine the data augmentation module and the subsequent segmentation module in an end-to-end training manner with a consistent loss.
We extensively evaluated our method on CT kidney tumor segmentation, and the results validate its promising performance.
arXiv Detail & Related papers (2020-02-22T14:10:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.