Exemplar Auditing for Multi-Label Biomedical Text Classification
- URL: http://arxiv.org/abs/2004.03093v1
- Date: Tue, 7 Apr 2020 02:54:20 GMT
- Title: Exemplar Auditing for Multi-Label Biomedical Text Classification
- Authors: Allen Schmaltz and Andrew Beam
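- Abstract summary: We generalize a recently proposed zero-shot sequence labeling method, "binary labeling via a convolutional decomposition", to the case where the available document-level human labels are themselves relatively high-dimensional.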
The approach yields classification with "introspection", relating the fine-grained features of an inference-time prediction to their nearest neighbors.
Our proposed approach yields both a competitively effective classification model and an interrogation mechanism to aid healthcare workers in understanding the salient features that drive the model's predictions.
- Score: 0.4873362301533824
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many practical applications of AI in medicine consist of semi-supervised
discovery: The investigator aims to identify features of interest at a
resolution more fine-grained than that of the available human labels. This is
often the scenario faced in healthcare applications as coarse, high-level
labels (e.g., billing codes) are often the only sources that are readily
available. These challenges are compounded for modalities such as text, where
the feature space is very high-dimensional, and often contains considerable
amounts of noise.
In this work, we generalize a recently proposed zero-shot sequence labeling
method, "binary labeling via a convolutional decomposition", to the case where
the available document-level human labels are themselves relatively
high-dimensional. The approach yields classification with "introspection",
relating the fine-grained features of an inference-time prediction to their
nearest neighbors from the training set, under the model. The approach is
effective, yet parsimonious, as demonstrated on a well-studied MIMIC-III
multi-label classification task of electronic health record data, and is useful
as a tool for organizing the analysis of neural model predictions and
high-dimensional datasets. Our proposed approach yields both a competitively
effective classification model and an interrogation mechanism to aid healthcare
workers in understanding the salient features that drive the model's
predictions.
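The nearest-neighbor "introspection" idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes that each fine-grained feature (for example, a token) has already been mapped to a fixed-length vector by some model, and it uses plain Euclidean distance. The names `nearest_exemplar` and `audit`, the toy vectors, and the label names are all hypothetical.

```python
import math
from typing import List, Sequence, Set, Tuple


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_exemplar(
    query: Sequence[float], train_vectors: List[Sequence[float]]
) -> Tuple[int, float]:
    """Return (index, distance) of the training vector closest to `query`."""
    best_index, best_dist = -1, math.inf
    for i, vector in enumerate(train_vectors):
        dist = l2_distance(query, vector)
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index, best_dist


def audit(
    query_vectors: List[Sequence[float]],
    train_vectors: List[Sequence[float]],
    train_labels: List[Set[str]],
) -> List[Tuple[int, Set[str], float]]:
    """For each inference-time feature vector, report its nearest training
    exemplar: (training index, that exemplar's label set, distance)."""
    report = []
    for query in query_vectors:
        index, dist = nearest_exemplar(query, train_vectors)
        report.append((index, train_labels[index], dist))
    return report


# Toy data (hypothetical): three training features with multi-label sets.
train_vectors = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
train_labels = [{"code_A"}, {"code_A", "code_B"}, {"code_C"}]
query_vectors = [[0.9, 1.2], [4.0, 4.5]]

report = audit(query_vectors, train_vectors, train_labels)
for index, labels, dist in report:
    print(f"nearest training exemplar: {index}, labels: {sorted(labels)}, distance: {dist:.3f}")
```

A reviewer could then inspect the matched training exemplars and their labels to judge whether a prediction rests on plausible evidence. In practice the distance function and the source of the vectors would depend on the model being audited.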
Related papers
- A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection [52.228708947607636]
This paper introduces a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework for new methods.
The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics.
We objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection.
(2024-06-05T13:40:07Z)
- Multi-stage Retrieve and Re-rank Model for Automatic Medical Coding Recommendation [22.323705343864336]
International Classification of Diseases (ICD) serves as a definitive medical classification system.
The primary objective of ICD indexing is to allocate a subset of ICD codes to a medical record.
Most existing approaches have struggled to select the proper label subsets from an extremely large ICD collection.
(2024-05-29T13:54:30Z)
- Multi-task Explainable Skin Lesion Classification [54.76511683427566]
We propose a few-shot-based approach for skin lesions that generalizes well with little labelled data.
The proposed approach comprises a fusion of a segmentation network that acts as an attention module and a classification network.
(2023-10-11T05:49:47Z)
- Automated Labeling of German Chest X-Ray Radiology Reports using Deep Learning [50.591267188664666]
We propose a deep learning-based CheXpert label prediction model, pre-trained on reports labeled by a rule-based German CheXpert model.
Our results demonstrate the effectiveness of our approach, which significantly outperformed the rule-based model on all three tasks.
(2023-06-09T16:08:35Z)
- HiPrompt: Few-Shot Biomedical Knowledge Fusion via Hierarchy-Oriented Prompting [33.1455954220194]
HiPrompt is a supervision-efficient knowledge fusion framework.
It elicits the few-shot reasoning ability of large language models through hierarchy-oriented prompts.
Empirical results on the collected KG-Hi-BKF benchmark datasets demonstrate the effectiveness of HiPrompt.
(2023-04-12T16:54:26Z)
- Parametric Classification for Generalized Category Discovery: A Baseline Study [70.73212959385387]
Generalized Category Discovery (GCD) aims to discover novel categories in unlabelled datasets using knowledge learned from labelled samples.
We investigate the failure of parametric classifiers, verify the effectiveness of previous design choices when high-quality supervision is available, and identify unreliable pseudo-labels as a key problem.
We propose a simple yet effective parametric classification method that benefits from entropy regularisation, achieves state-of-the-art performance on multiple GCD benchmarks and shows strong robustness to unknown class numbers.
(2022-11-21T18:47:11Z)
- Multi-class versus One-class classifier in spontaneous speech analysis oriented to Alzheimer Disease diagnosis [58.720142291102135]
The aim of our project is to contribute to earlier diagnosis of AD and better estimates of its severity by using automatic analysis performed through new biomarkers extracted from the speech signal.
The use of information about outliers and Fractal Dimension features improves the system performance.
(2022-03-21T09:57:20Z)
- Learning Image Labels On-the-fly for Training Robust Classification Models [13.669654965671604]
We show how noisy annotations (e.g., from different algorithm-based labelers) can be utilized together and mutually benefit the learning of classification tasks.
A meta-training based label-sampling module is designed to attend to the labels that benefit the model learning the most through additional back-propagation processes.
(2020-09-22T05:38:44Z)
- Towards Cross-Granularity Few-Shot Learning: Coarse-to-Fine Pseudo-Labeling with Visual-Semantic Meta-Embedding [13.063136901934865]
Few-shot learning aims at rapidly adapting to novel categories with only a handful of samples at test time.
In this paper, we advance the few-shot classification paradigm towards a more challenging scenario, i.e., cross-granularity few-shot classification.
We approximate the fine-grained data distribution by greedy clustering of each coarse-class into pseudo-fine-classes according to the similarity of image embeddings.
(2020-07-11T03:44:21Z)
- Semi-supervised Medical Image Classification with Relation-driven Self-ensembling Model [71.80319052891817]
We present a relation-driven semi-supervised framework for medical image classification.
It exploits the unlabeled data by encouraging the prediction consistency of given input under perturbations.
Our method outperforms many state-of-the-art semi-supervised learning methods on both single-label and multi-label image classification scenarios.
(2020-05-15T06:57:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.