DIAGNOSE: Avoiding Out-of-distribution Data using Submodular Information
Measures
- URL: http://arxiv.org/abs/2210.01526v1
- Date: Tue, 4 Oct 2022 11:07:48 GMT
- Title: DIAGNOSE: Avoiding Out-of-distribution Data using Submodular Information
Measures
- Authors: Suraj Kothawade, Akshit Srivastava, Venkat Iyer, Ganesh Ramakrishnan,
Rishabh Iyer
- Abstract summary: We propose Diagnose, a novel active learning framework that can jointly model similarity and dissimilarity.
Our experiments verify the superiority of Diagnose over the state-of-the-art AL methods across multiple domains of medical imaging.
- Score: 13.492292022589918
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Avoiding out-of-distribution (OOD) data is critical for training supervised
machine learning models in the medical imaging domain. Furthermore, obtaining
labeled medical data is difficult and expensive since it requires expert
annotators like doctors, radiologists, etc. Active learning (AL) is a
well-known method to mitigate labeling costs by selecting the most diverse or
uncertain samples. However, current AL methods do not work well in the medical
imaging domain with OOD data. We propose Diagnose (avoiDing out-of-dIstribution
dAta usinG submodular iNfOrmation meaSurEs), a novel active learning framework
that can jointly model similarity and dissimilarity, which is crucial in mining
in-distribution data and avoiding OOD data at the same time. Particularly, we
use a small number of data points as exemplars that represent a query set of
in-distribution data points and a private set of OOD data points. We illustrate
the generalizability of our framework by evaluating it on a wide variety of
real-world OOD scenarios. Our experiments verify the superiority of Diagnose
over the state-of-the-art AL methods across multiple domains of medical
imaging.
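A minimal sketch of the idea in the abstract: pick unlabeled points that resemble the in-distribution query exemplars and do not resemble the OOD private exemplars. This is not the authors' formulation. A simple facility-location-style coverage gain with a private-set penalty stands in for the submodular information measures named in the title. The function names, the `penalty` weight, and the toy 2-D features are all assumptions made for illustration.

```python
import numpy as np

def cosine_sim(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def greedy_select(unlabeled, query, private, budget, penalty=1.0):
    """Greedily choose `budget` rows of `unlabeled` (hypothetical helper).

    Each step scores a candidate by how much it improves coverage of the
    query exemplars, minus `penalty` times its highest similarity to any
    private (OOD) exemplar. This is an illustrative stand-in objective.
    """
    sim_q = cosine_sim(unlabeled, query)                 # shape (n, |Q|)
    sim_p = cosine_sim(unlabeled, private).max(axis=1)   # shape (n,)
    covered = np.zeros(query.shape[0])   # best similarity so far per query exemplar
    available = np.ones(unlabeled.shape[0], dtype=bool)
    chosen = []
    for _ in range(budget):
        # Marginal coverage gain of each candidate over the current selection.
        gain = np.maximum(sim_q - covered, 0.0).sum(axis=1)
        score = np.where(available, gain - penalty * sim_p, -np.inf)
        best = int(np.argmax(score))
        chosen.append(best)
        available[best] = False
        covered = np.maximum(covered, sim_q[best])
    return chosen

# Toy features: rows 0-1 lie near the query exemplar, rows 2-3 near the private one.
unlabeled = np.array([[1.0, 0.1], [0.9, 0.0], [0.1, 1.0], [0.0, 0.9]])
query = np.array([[1.0, 0.0]])     # in-distribution exemplar
private = np.array([[0.0, 1.0]])   # OOD exemplar
selected = greedy_select(unlabeled, query, private, budget=2)
```

On this toy input the two selected indices are the rows near the query exemplar. In an active learning loop the selected points would be sent for labeling and the model retrained before the next round.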
Related papers
- Key Patches Are All You Need: A Multiple Instance Learning Framework For Robust Medical Diagnosis [15.964609888720315]
We propose to limit the amount of information deep learning models use to reach the final classification, by using a multiple instance learning framework.
We evaluate our framework on two medical applications: skin cancer diagnosis using dermoscopy and breast cancer diagnosis using mammography.
(2024-05-02T18:21:25Z)
- EndoOOD: Uncertainty-aware Out-of-distribution Detection in Capsule Endoscopy Diagnosis [11.82953216903558]
Wireless capsule endoscopy (WCE) is a non-invasive diagnostic procedure that enables visualization of the gastrointestinal (GI) tract.
Deep learning-based methods have shown effectiveness in disease screening using WCE data.
Existing capsule endoscopy classification methods mostly rely on pre-defined categories.
(2024-02-18T06:54:51Z)
- EAT: Towards Long-Tailed Out-of-Distribution Detection [55.380390767978554]
This paper addresses the challenging task of long-tailed OOD detection.
The main difficulty lies in distinguishing OOD data from samples belonging to the tail classes.
We propose two simple ideas: (1) Expanding the in-distribution class space by introducing multiple abstention classes, and (2) Augmenting the context-limited tail classes by overlaying images onto the context-rich OOD data.
(2023-12-14T13:47:13Z)
- GOOD-D: On Unsupervised Graph Out-Of-Distribution Detection [67.90365841083951]
We develop a new graph contrastive learning framework GOOD-D for detecting OOD graphs without using any ground-truth labels.
GOOD-D is able to capture the latent ID patterns and accurately detect OOD graphs based on the semantic inconsistency in different granularities.
As a pioneering work in unsupervised graph-level OOD detection, we build a comprehensive benchmark to compare our proposed approach with different state-of-the-art methods.
(2022-11-08T12:41:58Z)
- MMLN: Leveraging Domain Knowledge for Multimodal Diagnosis [10.133715767542386]
We propose a knowledge-driven and data-driven framework for lung disease diagnosis.
We formulate diagnosis rules according to authoritative clinical medicine guidelines and learn the weights of rules from text data.
A multimodal fusion consisting of text and image data is designed to infer the marginal probability of lung disease.
(2022-02-09T04:12:30Z)
- Training OOD Detectors in their Natural Habitats [31.565635192716712]
Out-of-distribution (OOD) detection is important for machine learning models deployed in the wild.
Recent methods use auxiliary outlier data to regularize the model for improved OOD detection.
We propose a novel framework that leverages wild mixture data -- that naturally consists of both ID and OOD samples.
(2022-02-07T15:38:39Z)
- CVAD: A generic medical anomaly detector based on Cascade VAE [2.647674705784439]
We focus on the generalizability of OOD detection for medical images and propose a self-supervised Cascade Variational autoencoder-based Anomaly Detector (CVAD).
We use a cascade architecture of variational autoencoders that combines latent representations at multiple scales, which are then fed to a discriminator to distinguish the OOD data from the in-distribution (ID) data.
We compare the performance with the state-of-the-art deep learning models to demonstrate our model's efficacy on various open-access medical imaging datasets for both intra- and inter-class OOD.
(2021-10-29T14:20:43Z)
- Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for Thoracic Disease Identification [83.6017225363714]
Deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance.
For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming.
In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
(2021-02-26T02:29:30Z)
- Cross-Modal Information Maximization for Medical Imaging: CMIM [62.28852442561818]
In hospitals, data are siloed to specific information systems that make the same information available under different modalities.
This offers unique opportunities to obtain and use at train-time those multiple views of the same information that might not always be available at test-time.
We propose an innovative framework that makes the most of available data by learning good representations of a multi-modal input that are resilient to modality dropping at test-time.
(2020-10-20T20:05:35Z)
- Semi-supervised Medical Image Classification with Relation-driven Self-ensembling Model [71.80319052891817]
We present a relation-driven semi-supervised framework for medical image classification.
It exploits the unlabeled data by encouraging the prediction consistency of given input under perturbations.
Our method outperforms many state-of-the-art semi-supervised learning methods on both single-label and multi-label image classification scenarios.
(2020-05-15T06:57:54Z)
- Self-Training with Improved Regularization for Sample-Efficient Chest X-Ray Classification [80.00316465793702]
We present a deep learning framework that enables robust modeling in challenging scenarios.
Our results show that, using 85% less labeled data, we can build predictive models that match the performance of classifiers trained in a large-scale data setting.
(2020-05-03T02:36:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.