DILA: Dictionary Label Attention for Mechanistic Interpretability in High-dimensional Multi-label Medical Coding Prediction
- URL: http://arxiv.org/abs/2409.10504v1
- Date: Mon, 16 Sep 2024 17:45:40 GMT
- Title: DILA: Dictionary Label Attention for Mechanistic Interpretability in High-dimensional Multi-label Medical Coding Prediction
- Authors: John Wu, David Wu, Jimeng Sun
- Abstract summary: Predicting high-dimensional or extreme multilabels, such as in medical coding, requires both accuracy and interpretability.
We propose a mechanistic interpretability module that disentangles uninterpretable dense embeddings into a sparse embedding space.
We show that our sparse embeddings are at least 50 percent more human-understandable than their dense counterparts.
- Score: 27.778160315671776
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Predicting high-dimensional or extreme multilabels, such as in medical coding, requires both accuracy and interpretability. Existing works often rely on local interpretability methods, failing to provide comprehensive explanations of the overall mechanism behind each label prediction within a multilabel set. We propose a mechanistic interpretability module called DIctionary Label Attention (DILA) that disentangles uninterpretable dense embeddings into a sparse embedding space, where each nonzero element (a dictionary feature) represents a globally learned medical concept. Through human evaluations, we show that our sparse embeddings are at least 50 percent more human-understandable than their dense counterparts. Our automated dictionary feature identification pipeline, leveraging large language models (LLMs), uncovers thousands of learned medical concepts by examining and summarizing the highest activating tokens for each dictionary feature. We represent the relationships between dictionary features and medical codes through a sparse interpretable matrix, enhancing the mechanistic and global understanding of the model's predictions while maintaining competitive performance and scalability without extensive human annotation.
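The core idea, decomposing dense label-attention embeddings into a sparse dictionary space and relating those dictionary features to medical codes through a sparse matrix, can be sketched as below. This is a minimal illustration, not the authors' implementation: the dimensions, names (`SparseDictionaryLayer`, `feature_to_code`), and the plain L1-penalized autoencoder objective are all assumptions.

```python
import torch
import torch.nn as nn

class SparseDictionaryLayer(nn.Module):
    """Sparse autoencoder that decomposes a dense embedding into an
    overcomplete dictionary, so only a few dictionary features are nonzero."""

    def __init__(self, dense_dim=768, dict_size=16384, l1_coeff=1e-3):
        super().__init__()
        self.encoder = nn.Linear(dense_dim, dict_size)               # dense -> dictionary activations
        self.decoder = nn.Linear(dict_size, dense_dim, bias=False)   # reconstruct the dense embedding
        self.l1_coeff = l1_coeff

    def forward(self, dense):
        sparse = torch.relu(self.encoder(dense))      # nonzero entries = active dictionary features
        recon = self.decoder(sparse)
        # Reconstruction error plus an L1 penalty that encourages sparsity.
        loss = ((recon - dense) ** 2).mean() + self.l1_coeff * sparse.abs().mean()
        return sparse, recon, loss

# Hypothetical label head: a matrix relating dictionary features to medical codes,
# which is what makes the per-code predictions globally inspectable.
num_codes, dense_dim, dict_size = 50, 768, 16384
sae = SparseDictionaryLayer(dense_dim, dict_size)
feature_to_code = nn.Linear(dict_size, num_codes, bias=False)

dense_embeddings = torch.randn(4, dense_dim)          # stand-in for pooled token embeddings
sparse_codes, _, sae_loss = sae(dense_embeddings)
code_logits = feature_to_code(sparse_codes)           # one score per medical code
```

Under this reading, the global explanation for a code comes from the row of `feature_to_code` that ties it to a small set of named dictionary features, each of which can be summarized (e.g., by an LLM) from its highest-activating tokens.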
Related papers
- Beyond Label Attention: Transparency in Language Models for Automated Medical Coding via Dictionary Learning [27.778160315671776]
We show that dictionary features can steer model behavior, elucidate the hidden meanings of upwards of 90% of medically irrelevant tokens, and are human interpretable.
arXiv Detail & Related papers (2024-10-31T19:39:40Z) - Dictionary Learning Improves Patch-Free Circuit Discovery in Mechanistic
Interpretability: A Case Study on Othello-GPT [59.245414547751636]
We propose a circuit discovery framework alternative to activation patching.
Our framework suffers less from out-of-distribution issues and is more efficient in terms of computational complexity.
We examine a small transformer trained on the synthetic task of Othello and find a number of human-understandable, fine-grained circuits inside it.
arXiv Detail & Related papers (2024-02-19T15:04:53Z) - Improving Large-Scale k-Nearest Neighbor Text Categorization with Label
Autoencoders [0.0]
We introduce a multi-label lazy learning approach to deal with automatic semantic indexing in large document collections.
The proposed method is an evolution of the traditional k-Nearest Neighbors algorithm.
We have evaluated our proposal in a large portion of the MEDLINE biomedical document collection.
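As a rough illustration of the underlying multi-label k-nearest-neighbour idea (not the paper's label-autoencoder variant), a query document inherits the labels that are frequent among its nearest training neighbours. The embeddings, label matrix, and threshold below are invented for the sketch.

```python
import numpy as np

# Minimal multi-label kNN sketch, assuming documents are already embedded
# as vectors and Y_train is a binary document-by-label matrix.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 64))                      # training document embeddings
Y_train = (rng.random((200, 30)) < 0.1).astype(float)     # 30 candidate labels
x_query = rng.normal(size=64)

k = 10
dists = np.linalg.norm(X_train - x_query, axis=1)
neighbours = np.argsort(dists)[:k]
# Score each label by its frequency among the k nearest neighbours,
# then keep labels above a fixed threshold.
label_scores = Y_train[neighbours].mean(axis=0)
predicted_labels = np.where(label_scores >= 0.3)[0]
```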
arXiv Detail & Related papers (2024-02-03T00:11:29Z) - Representing visual classification as a linear combination of words [0.0]
We present an explainability strategy that uses a vision-language model to identify language-based descriptors of a visual classification task.
By leveraging a pre-trained joint embedding space between images and text, our approach estimates a new classification task as a linear combination of words.
We find that the resulting descriptors largely align with clinical knowledge despite a lack of domain-specific language training.
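One plausible reading of "estimating a classification task as a linear combination of words": fit the classifier's weight vector with a least-squares combination of word embeddings from the shared image-text space, then report the highest-weighted words as descriptors. The arrays and vocabulary below are placeholders, not the paper's data or method.

```python
import numpy as np

# Hypothetical setup: `word_embeddings` holds unit-normalized text embeddings for
# candidate descriptor words from a CLIP-like joint space, and `class_weight` is
# the image-side weight vector of a trained linear classifier.
rng = np.random.default_rng(0)
word_embeddings = rng.normal(size=(1000, 512))
word_embeddings /= np.linalg.norm(word_embeddings, axis=1, keepdims=True)
class_weight = rng.normal(size=512)

# Express the classifier direction as a linear combination of word embeddings
# (least-squares fit); the largest coefficients name the most relevant descriptors.
coeffs, *_ = np.linalg.lstsq(word_embeddings.T, class_weight, rcond=None)
top_descriptors = np.argsort(-np.abs(coeffs))[:10]   # indices of the top-weighted words
```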
arXiv Detail & Related papers (2023-11-18T02:00:20Z) - Unsupervised Sentiment Analysis of Plastic Surgery Social Media Posts [91.3755431537592]
The massive collection of user posts across social media platforms is primarily untapped for artificial intelligence (AI) use cases.
Natural language processing (NLP) is a subfield of AI that leverages bodies of documents, known as corpora, to train computers in human-like language understanding.
This study demonstrates that the applied results of unsupervised analysis allow a computer to predict either negative, positive, or neutral user sentiment towards plastic surgery.
arXiv Detail & Related papers (2023-07-05T20:16:20Z) - Biomedical Named Entity Recognition via Dictionary-based Synonym
Generalization [51.89486520806639]
We propose a novel Synonym Generalization (SynGen) framework that recognizes the biomedical concepts contained in the input text using span-based predictions.
We extensively evaluate our approach on a wide range of benchmarks and the results verify that SynGen outperforms previous dictionary-based models by notable margins.
arXiv Detail & Related papers (2023-05-22T14:36:32Z) - Exploring Structured Semantic Prior for Multi Label Recognition with
Incomplete Labels [60.675714333081466]
Multi-label recognition (MLR) with incomplete labels is very challenging.
Recent works strive to explore the image-to-label correspondence in vision-language models, i.e., CLIP, to compensate for insufficient annotations.
We advocate remedying the deficiency of label supervision for MLR with incomplete labels by deriving a structured semantic prior.
arXiv Detail & Related papers (2023-03-23T12:39:20Z) - Token Classification for Disambiguating Medical Abbreviations [0.0]
Abbreviations are unavoidable yet critical parts of medical text.
The lack of a standardized mapping system makes disambiguating abbreviations a difficult and time-consuming task.
arXiv Detail & Related papers (2022-10-05T18:06:49Z) - Measuring the Interpretability of Unsupervised Representations via
Quantized Reverse Probing [97.70862116338554]
We investigate the problem of measuring interpretability of self-supervised representations.
We formulate the latter as estimating the mutual information between the representation and a space of manually labelled concepts.
We use our method to evaluate a large number of self-supervised representations, ranking them by interpretability.
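A hedged sketch of the quantized-probing idea as summarized above: discretize the self-supervised representation (here with k-means) and estimate the mutual information between the resulting codes and the manually labelled concepts. The cluster count, data, and use of scikit-learn utilities are assumptions for illustration, not the paper's exact estimator.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import mutual_info_score

# Hypothetical data: frozen self-supervised features and discrete concept labels.
rng = np.random.default_rng(0)
features = rng.normal(size=(5000, 128))        # representation to be evaluated
concepts = rng.integers(0, 20, size=5000)      # manual concept annotations

# Quantize the continuous representation into discrete codes, then estimate
# I(quantized representation; concept) from the empirical joint counts.
codes = KMeans(n_clusters=64, n_init=10, random_state=0).fit_predict(features)
interpretability_score = mutual_info_score(concepts, codes)   # in nats
print(f"Estimated mutual information: {interpretability_score:.3f} nats")
```

Higher mutual information means the quantized representation carries more of the labelled concept structure, which is how a ranking of representations by interpretability could be produced.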
arXiv Detail & Related papers (2022-09-07T16:18:50Z) - Interaction Matching for Long-Tail Multi-Label Classification [57.262792333593644]
We present an elegant and effective approach for addressing limitations in existing multi-label classification models.
By performing soft n-gram interaction matching, we match labels with natural language descriptions.
arXiv Detail & Related papers (2020-05-18T15:27:55Z) - Seeing The Whole Patient: Using Multi-Label Medical Text Classification
Techniques to Enhance Predictions of Medical Codes [2.158285012874102]
We present results on multi-label medical text classification problems with 18, 50, and 155 labels.
For imbalanced data, we show that labels which occur infrequently benefit the most from additional features incorporated into the embeddings.
High dimensional embeddings from this research are made available for public use.
arXiv Detail & Related papers (2020-03-29T02:19:30Z)