Dialogue-Contextualized Re-ranking for Medical History-Taking
- URL: http://arxiv.org/abs/2304.01974v1
- Date: Tue, 4 Apr 2023 17:31:32 GMT
- Title: Dialogue-Contextualized Re-ranking for Medical History-Taking
- Authors: Jian Zhu, Ilya Valmianski, Anitha Kannan
- Abstract summary: We present a two-stage re-ranking approach that helps close the training-inference gap by re-ranking the first-stage question candidates.
We find that relative to the expert system, the best performance is achieved by our proposed global re-ranker with a transformer backbone.
- Score: 5.039849340960835
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: AI-driven medical history-taking is an important component in symptom
checking, automated patient intake, triage, and other AI virtual care
applications. As history-taking is extremely varied, machine learning models
require a significant amount of data to train. To overcome this challenge,
existing systems are developed using indirect data or expert knowledge. This
leads to a training-inference gap as models are trained on different kinds of
data than what they observe at inference time. In this work, we present a
two-stage re-ranking approach that helps close the training-inference gap by
re-ranking the first-stage question candidates using a dialogue-contextualized
model. For this, we propose a new model, global re-ranker, which cross-encodes
the dialogue with all questions simultaneously, and compare it with several
existing neural baselines. We test both transformer and S4-based language model
backbones. We find that relative to the expert system, the best performance is
achieved by our proposed global re-ranker with a transformer backbone,
resulting in a 30% higher normalized discounted cumulative gain (nDCG) and a 77%
higher mean average precision (mAP).
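The reported gains are measured in nDCG and mAP. As a reminder of what these ranking metrics compute, here is a minimal pure-Python sketch (illustrative only, not the authors' evaluation code):

```python
import math

def ndcg(ranked_relevances):
    """Normalized discounted cumulative gain for one ranked list.

    ranked_relevances: relevance grades in the order the system ranked them.
    """
    dcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(ranked_relevances))
    ideal = sorted(ranked_relevances, reverse=True)
    idcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0

def average_precision(ranked_relevances):
    """Average precision for binary relevance labels (1 = relevant)."""
    hits, precisions = 0, []
    for i, rel in enumerate(ranked_relevances):
        if rel:
            hits += 1
            precisions.append(hits / (i + 1))
    return sum(precisions) / len(precisions) if precisions else 0.0

# A re-ranker that moves relevant candidates up improves both metrics:
first_stage = [0, 1, 0, 1]   # relevant questions buried at ranks 2 and 4
re_ranked   = [1, 1, 0, 0]   # after dialogue-contextualized re-ranking
assert ndcg(re_ranked) > ndcg(first_stage)
assert average_precision(re_ranked) > average_precision(first_stage)
```

Both metrics reward placing relevant candidates earlier, which is exactly what a second-stage re-ranker is trained to do.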
Related papers
- Utilizing Machine Learning and 3D Neuroimaging to Predict Hearing Loss: A Comparative Analysis of Dimensionality Reduction and Regression Techniques [0.0]
We have explored machine learning approaches for predicting hearing loss thresholds from 3D images of the brain's gray matter.
In the first phase, we used a 3D CNN model to reduce high-dimensional input into latent space.
In the second phase, we utilized this model to reduce input into rich features.
arXiv Detail & Related papers (2024-04-30T18:39:41Z) - The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of a challenging problem in healthcare.
Within this framework, we train 15 predictive models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z) - MADS: Modulated Auto-Decoding SIREN for time series imputation [9.673093148930874]
We propose MADS, a novel auto-decoding framework for time series imputation, built upon implicit neural representations.
We evaluate our model on two real-world datasets, and show that it outperforms state-of-the-art methods for time series imputation.
arXiv Detail & Related papers (2023-07-03T09:08:47Z) - Learning towards Selective Data Augmentation for Dialogue Generation [52.540330534137794]
We argue that not all cases benefit the augmentation task, and that cases suitable for augmentation should satisfy two attributes.
We propose a Selective Data Augmentation framework (SDA) for the response generation task.
arXiv Detail & Related papers (2023-03-17T01:26:39Z) - A Model-Agnostic Data Manipulation Method for Persona-based Dialogue Generation [107.82729587882397]
It is expensive to scale up current persona-based dialogue datasets.
Each data sample in this task is more complex to learn with than conventional dialogue data.
We propose a data manipulation method, which is model-agnostic to be packed with any persona-based dialogue generation model.
arXiv Detail & Related papers (2022-04-21T03:49:54Z) - Data-Efficient Methods for Dialogue Systems [4.061135251278187]
Conversational User Interface (CUI) has become ubiquitous in everyday life, in consumer-focused products like Siri and Alexa.
Deep learning underlies many recent breakthroughs in dialogue systems but requires very large amounts of training data, often annotated by experts.
In this thesis, we introduce a series of methods for training robust dialogue systems from minimal data.
arXiv Detail & Related papers (2020-12-05T02:51:09Z) - Automatic Recall Machines: Internal Replay, Continual Learning and the Brain [104.38824285741248]
Replay in neural networks involves training on sequential data with memorized samples, which counteracts forgetting of previous behavior caused by non-stationarity.
We present a method where these auxiliary samples are generated on the fly, given only the model that is being trained for the assessed objective.
Instead the implicit memory of learned samples within the assessed model itself is exploited.
arXiv Detail & Related papers (2020-06-22T15:07:06Z) - Data augmentation using generative networks to identify dementia [20.137419355252362]
We show that generative models can be used as an effective approach for data augmentation.
In this paper, we investigate the application of a similar approach to different types of speech and audio-based features extracted from our automatic dementia detection system.
arXiv Detail & Related papers (2020-04-13T15:05:24Z) - The World is Not Binary: Learning to Rank with Grayscale Data for Dialogue Response Selection [55.390442067381755]
We show that grayscale data can be automatically constructed without human effort.
Our method employs off-the-shelf response retrieval models and response generation models as automatic grayscale data generators.
Experiments on three benchmark datasets and four state-of-the-art matching models show that the proposed approach brings significant and consistent performance improvements.
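The grayscale idea replaces binary relevant/irrelevant labels with graded ones built from automatic generators. A minimal sketch of one plausible grading scheme (the function name and grade values are illustrative assumptions, not the paper's exact construction):

```python
def build_grayscale_data(context, gold, retrieved, random_negatives):
    """Construct graded (non-binary) training examples for response
    selection: the gold response outranks retrieved candidates, which
    in turn outrank random negatives.

    Returns (context, response, grade) triples with grades 2 > 1 > 0.
    """
    examples = [(context, gold, 2)]
    examples += [(context, r, 1) for r in retrieved]
    examples += [(context, r, 0) for r in random_negatives]
    return examples

# Toy usage: one dialogue context with one retrieved and two random candidates.
data = build_grayscale_data("How are you?", "I'm fine, thanks.",
                            ["Doing well."], ["Buy now!", "Blue."])
assert [grade for _, _, grade in data] == [2, 1, 0, 0]
```

A matching model trained on such triples learns a finer-grained ordering than a binary classifier can, without any extra human annotation.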
arXiv Detail & Related papers (2020-04-06T06:34:54Z) - Hybrid Generative-Retrieval Transformers for Dialogue Domain Adaptation [77.62366712130196]
We present the winning entry at the fast domain adaptation task of DSTC8, a hybrid generative-retrieval model based on GPT-2 fine-tuned to the multi-domain MetaLWOz dataset.
Our model uses retrieval logic as a fallback, being SoTA on MetaLWOz in human evaluation (>4% improvement over the 2nd place system) and attaining competitive generalization performance in adaptation to the unseen MultiWOZ dataset.
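The generate-first, retrieve-as-fallback logic can be sketched in a few lines (the function names, confidence signal, and threshold are assumptions for illustration, not the system's actual API):

```python
def respond(context, generate, retrieve, confidence, threshold=0.5):
    """Hybrid generative-retrieval response selection: return the generated
    reply when the model is confident in it, otherwise fall back to a
    retrieved reply."""
    candidate = generate(context)
    if confidence(candidate) >= threshold:
        return candidate
    return retrieve(context)

# Toy components: a generator that fails on unfamiliar contexts and a
# retriever that always produces some stored response.
generate = lambda ctx: "generated reply" if "known" in ctx else "???"
retrieve = lambda ctx: "retrieved reply"
confidence = lambda reply: 0.9 if reply != "???" else 0.1

assert respond("known topic", generate, retrieve, confidence) == "generated reply"
assert respond("unseen topic", generate, retrieve, confidence) == "retrieved reply"
```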
arXiv Detail & Related papers (2020-02-28T08:46:12Z) - An Efficient Method of Training Small Models for Regression Problems with Knowledge Distillation [1.433758865948252]
We propose a new formalism of knowledge distillation for regression problems.
First, we propose a new loss function, teacher outlier loss rejection, which rejects outliers in training samples using teacher model predictions.
By using a multi-task network, training of the student model's feature extractor becomes more effective.
arXiv Detail & Related papers (2020-02-28T08:46:12Z)
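The core of teacher outlier rejection is filtering training samples whose labels disagree strongly with the teacher's predictions. A minimal sketch of that idea (the function name, absolute-error criterion, and threshold are illustrative assumptions; the paper's actual loss formulation may differ):

```python
def reject_outliers(samples, teacher_predict, threshold=2.0):
    """Keep only samples whose label lies close to the teacher's prediction.

    samples: iterable of (x, y) pairs.
    teacher_predict: callable mapping x to the teacher's predicted y.
    threshold: hypothetical absolute-error cutoff for calling y an outlier.
    """
    return [(x, y) for x, y in samples
            if abs(teacher_predict(x) - y) <= threshold]

# Toy usage: a "teacher" that doubles its input. The sample labeled 100.0
# disagrees strongly with the teacher and is dropped as an outlier.
teacher = lambda x: 2 * x
data = [(1, 2.1), (2, 3.9), (3, 100.0)]
clean = reject_outliers(data, teacher, threshold=1.0)
assert clean == [(1, 2.1), (2, 3.9)]
```

The student model is then trained only on the retained samples, so noisy labels no longer dominate its regression loss.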
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.