Related papers: MedAI Dialog Corpus (MEDIC): Zero-Shot Classification of Doctor and AI Responses in Health Consultations

MedAI Dialog Corpus (MEDIC): Zero-Shot Classification of Doctor and AI Responses in Health Consultations

URL: http://arxiv.org/abs/2310.12489v3
Date: Fri, 12 Jan 2024 06:24:21 GMT
Title: MedAI Dialog Corpus (MEDIC): Zero-Shot Classification of Doctor and AI Responses in Health Consultations
Authors: Olumide E. Ojo, Olaronke O. Adebanji, Alexander Gelbukh, Hiram Calvo, Anna Feldman
Abstract summary: Zero-shot classification enables text to be classified into classes not seen during training. The models evaluated include BART, BERT, XLM, XLM-R and DistilBERT. The zero-shot language models show a good understanding of language generally, but has limitations when trying to classify doctor and AI responses to healthcare consultations.
Score: 44.669251100016986
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Zero-shot classification enables text to be classified into classes not seen during training. In this study, we examine the efficacy of zero-shot learning models in classifying healthcare consultation responses from Doctors and AI systems. The models evaluated include BART, BERT, XLM, XLM-R and DistilBERT. The models were tested on three different datasets based on a binary and multi-label analysis to identify the origins of text in health consultations without any prior corpus training. According to our findings, the zero-shot language models show a good understanding of language generally, but has limitations when trying to classify doctor and AI responses to healthcare consultations. This research provides a foundation for future research in the field of medical text classification by informing the development of more accurate methods of classifying text written by Doctors and AI systems in health consultations.

Related papers

LLMEval-Med: A Real-world Clinical Benchmark for Medical LLMs with Physician Validation [58.25892575437433]
evaluating large language models (LLMs) in medicine is crucial because medical applications require high accuracy with little room for error.<n>We present LLMEval-Med, a new benchmark covering five core medical areas, including 2,996 questions created from real-world electronic health records and expert-designed clinical scenarios.
arXiv Detail & Related papers (2025-06-04T15:43:14Z)
Large Language Models for Healthcare Text Classification: A Systematic Review [4.8342038441006805]
Large Language Models (LLMs) have fundamentally transformed approaches to Natural Language Processing (NLP) In healthcare, accurate and cost-efficient text classification is crucial, whether for clinical notes analysis, diagnosis coding, or any other task. Numerous studies have been conducted to leverage LLMs for automated healthcare text classification.
arXiv Detail & Related papers (2025-03-03T04:16:13Z)
Conversation AI Dialog for Medicare powered by Finetuning and Retrieval Augmented Generation [0.0]
Large language models (LLMs) have shown impressive capabilities in natural language processing tasks, including dialogue generation. This research aims to conduct a novel comparative analysis of two prominent techniques, fine-tuning with LoRA and the Retrieval-Augmented Generation framework.
arXiv Detail & Related papers (2025-02-04T11:50:40Z)
AI in radiological imaging of soft-tissue and bone tumours: a systematic review evaluating against CLAIM and FUTURE-AI guidelines [1.5332408886895255]
Soft-tissue and bone tumours (STBT) are rare, diagnostically challenging lesions with variable clinical behaviours and treatment approaches. This systematic review provides an overview of Artificial Intelligence (AI) methods using radiological imaging for diagnosis and prognosis of these tumours.
arXiv Detail & Related papers (2024-08-22T15:31:48Z)
Can Generative AI Support Patients' & Caregivers' Informational Needs? Towards Task-Centric Evaluation Of AI Systems [0.7124736158080937]
We develop an evaluation paradigm that centers human understanding and decision-making. We study the utility of generative AI systems in supporting people in a concrete task. We evaluate two state-of-the-art generative AI systems against the radiologist's responses.
arXiv Detail & Related papers (2024-01-31T23:24:37Z)
Explanatory Argument Extraction of Correct Answers in Resident Medical Exams [5.399800035598185]
We present a new dataset which includes not only explanatory arguments for the correct answer, but also arguments to reason why the incorrect answers are not correct. This new benchmark allows us to setup a novel extractive task which consists of identifying the explanation of the correct answer written by medical doctors.
arXiv Detail & Related papers (2023-12-01T13:22:35Z)
Designing Interpretable ML System to Enhance Trust in Healthcare: A Systematic Review to Proposed Responsible Clinician-AI-Collaboration Framework [13.215318138576713]
The paper reviews interpretable AI processes, methods, applications, and the challenges of implementation in healthcare. It aims to foster a comprehensive understanding of the crucial role of a robust interpretability approach in healthcare.
arXiv Detail & Related papers (2023-11-18T12:29:18Z)
Informing clinical assessment by contextualizing post-hoc explanations of risk prediction models in type-2 diabetes [50.8044927215346]
We consider a comorbidity risk prediction scenario and focus on contexts regarding the patients clinical state. We employ several state-of-the-art LLMs to present contexts around risk prediction model inferences and evaluate their acceptability. Our paper is one of the first end-to-end analyses identifying the feasibility and benefits of contextual explanations in a real-world clinical use case.
arXiv Detail & Related papers (2023-02-11T18:07:11Z)
Medical Image Captioning via Generative Pretrained Transformers [57.308920993032274]
We combine two language models, the Show-Attend-Tell and the GPT-3, to generate comprehensive and descriptive radiology records. The proposed model is tested on two medical datasets, the Open-I, MIMIC-CXR, and the general-purpose MS-COCO.
arXiv Detail & Related papers (2022-09-28T10:27:10Z)
Towards more patient friendly clinical notes through language models and ontologies [57.51898902864543]
We present a novel approach to automated medical text based on word simplification and language modelling. We use a new dataset pairs of publicly available medical sentences and a version of them simplified by clinicians. Our method based on a language model trained on medical forum data generates simpler sentences while preserving both grammar and the original meaning.
arXiv Detail & Related papers (2021-12-23T16:11:19Z)
CBLUE: A Chinese Biomedical Language Understanding Evaluation Benchmark [51.38557174322772]
We present the first Chinese Biomedical Language Understanding Evaluation benchmark. It is a collection of natural language understanding tasks including named entity recognition, information extraction, clinical diagnosis normalization, single-sentence/sentence-pair classification. We report empirical results with the current 11 pre-trained Chinese models, and experimental results show that state-of-the-art neural models perform by far worse than the human ceiling.
arXiv Detail & Related papers (2021-06-15T12:25:30Z)
Automated Lay Language Summarization of Biomedical Scientific Reviews [16.01452242066412]
Health literacy has emerged as a crucial factor in making appropriate health decisions and ensuring treatment outcomes. Medical jargon and the complex structure of professional language in this domain make health information especially hard to interpret. This paper introduces the novel task of automated generation of lay language summaries of biomedical scientific reviews.
arXiv Detail & Related papers (2020-12-23T10:01:18Z)
MedDG: An Entity-Centric Medical Consultation Dataset for Entity-Aware Medical Dialogue Generation [86.38736781043109]
We build and release a large-scale high-quality Medical Dialogue dataset related to 12 types of common Gastrointestinal diseases named MedDG. We propose two kinds of medical dialogue tasks based on MedDG dataset. One is the next entity prediction and the other is the doctor response generation. Experimental results show that the pre-train language models and other baselines struggle on both tasks with poor performance in our dataset.
arXiv Detail & Related papers (2020-10-15T03:34:33Z)

This list is automatically generated from the titles and abstracts of the papers in this site.