MedDialog: Two Large-scale Medical Dialogue Datasets
- URL: http://arxiv.org/abs/2004.03329v2
- Date: Tue, 7 Jul 2020 22:15:10 GMT
- Title: MedDialog: Two Large-scale Medical Dialogue Datasets
- Authors: Xuehai He, Shu Chen, Zeqian Ju, Xiangyu Dong, Hongchao Fang, Sicheng
Wang, Yue Yang, Jiaqi Zeng, Ruisi Zhang, Ruoyu Zhang, Meng Zhou, Penghui Zhu,
Pengtao Xie
- Abstract summary: We build two large-scale medical dialogue datasets: MedDialog-EN and MedDialog-CN.
MedDialog-EN is an English dataset containing 0.3 million conversations between patients and doctors and 0.5 million utterances.
MedDialog-CN is a Chinese dataset containing 1.1 million conversations and 4 million utterances.
- Score: 27.619211324563498
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Medical dialogue systems are promising in assisting in telemedicine to
increase access to healthcare services, improve the quality of patient care,
and reduce medical costs. To facilitate the research and development of medical
dialogue systems, we build two large-scale medical dialogue datasets:
MedDialog-EN and MedDialog-CN. MedDialog-EN is an English dataset containing
0.3 million conversations between patients and doctors and 0.5 million
utterances. MedDialog-CN is a Chinese dataset containing 1.1 million
conversations and 4 million utterances. To the best of our knowledge,
MedDialog-(EN,CN) are the largest medical dialogue datasets to date. The
dataset is available at https://github.com/UCSD-AI4H/Medical-Dialogue-System
Related papers
- MediTOD: An English Dialogue Dataset for Medical History Taking with Comprehensive Annotations [23.437292621092823]
We introduce MediTOD, a dataset of doctor-patient dialogues in English for the medical history-taking task.
We devise a questionnaire-based labeling scheme tailored to the medical domain.
Then, medical professionals create the dataset with high-quality comprehensive annotations.
arXiv Detail & Related papers (2024-10-18T06:38:22Z)
- MedKP: Medical Dialogue with Knowledge Enhancement and Clinical Pathway Encoding [48.348511646407026]
We introduce the Medical dialogue with Knowledge enhancement and clinical Pathway encoding framework.
The framework integrates an external knowledge enhancement module through a medical knowledge graph and an internal clinical pathway encoding via medical entities and physician actions.
arXiv Detail & Related papers (2024-03-11T10:57:45Z)
- RaDialog: A Large Vision-Language Model for Radiology Report Generation and Conversational Assistance [53.20640629352422]
Conversational AI tools can generate and discuss clinically correct radiology reports for a given medical image.
RaDialog is the first thoroughly evaluated and publicly available large vision-language model for radiology report generation and interactive dialog.
Our method achieves state-of-the-art clinical correctness in report generation and shows impressive abilities in interactive tasks such as correcting reports and answering questions.
arXiv Detail & Related papers (2023-11-30T16:28:40Z)
- MidMed: Towards Mixed-Type Dialogues for Medical Consultation [12.676937863407542]
Most medical dialogue systems assume that patients have clear goals (medicine querying, surgical operation querying, etc.) before medical consultation.
Due to the lack of medical knowledge, it is usually difficult for patients to determine clear goals with all necessary slots.
We propose a novel task and create a human-to-human mixed-type medical consultation dialogue corpus, termed MidMed.
arXiv Detail & Related papers (2023-06-05T14:36:31Z)
- CDialog: A Multi-turn Covid-19 Conversation Dataset for Entity-Aware Dialog Generation [18.047064216849204]
We release a high-quality multi-turn Medical Dialog dataset relating to Covid-19 disease named CDialog.
We propose a novel neural medical dialog system based on the CDialog dataset to advance future research on developing automated medical dialog systems.
arXiv Detail & Related papers (2022-11-16T11:07:34Z)
- MMDialog: A Large-scale Multi-turn Dialogue Dataset Towards Multi-modal Open-domain Conversation [68.53133207668856]
We introduce the MMDialog dataset to better facilitate multi-modal conversation.
MMDialog is composed of a curated set of 1.08 million real-world dialogues with 1.53 million unique images across 4,184 topics.
To build an engaging dialogue system with this dataset, we propose and normalize two response-producing tasks.
arXiv Detail & Related papers (2022-11-10T17:37:04Z)
- A Benchmark for Automatic Medical Consultation System: Frameworks, Tasks and Datasets [70.32630628211803]
We propose two frameworks to support automatic medical consultation, namely doctor-patient dialogue understanding and task-oriented interaction.
A new large medical dialogue dataset with multi-level fine-grained annotations is introduced.
We report a set of benchmark results for each task, which shows the usability of the dataset and sets a baseline for future studies.
arXiv Detail & Related papers (2022-04-19T16:43:21Z)
- DialMed: A Dataset for Dialogue-based Medication Recommendation [20.08110449216702]
We make the first attempt to recommend medications with the conversations between doctors and patients.
We construct DialMed, the first high-quality dataset for medical dialogue-based medication recommendation task.
arXiv Detail & Related papers (2022-02-22T05:12:29Z)
- M^2-MedDialog: A Dataset and Benchmarks for Multi-domain Multi-service Medical Dialogues [25.58066103487436]
Medical dialogue systems (MDSs) aim to assist doctors and patients with a range of professional medical services.
No existing large-scale dialogue dataset contains both multiple medical services and fine-grained medical labels.
We first build a Multiple-domain Multiple-service medical dialogue (M2-MedDialog) dataset, which contains 1,557 conversations between doctors and patients.
arXiv Detail & Related papers (2021-09-01T15:24:54Z)
- MedDG: An Entity-Centric Medical Consultation Dataset for Entity-Aware Medical Dialogue Generation [86.38736781043109]
We build and release a large-scale high-quality Medical Dialogue dataset related to 12 types of common Gastrointestinal diseases named MedDG.
We propose two medical dialogue tasks based on the MedDG dataset: next entity prediction and doctor response generation.
Experimental results show that pre-trained language models and other baselines perform poorly on both tasks on our dataset.
arXiv Detail & Related papers (2020-10-15T03:34:33Z)
- On the Generation of Medical Dialogues for COVID-19 [60.63485429268256]
People experiencing COVID19-related symptoms or exposed to risk factors have a pressing need to consult doctors.
Because of the shortage of medical professionals, many people cannot receive online consultations in a timely manner.
We aim to develop a medical dialogue system that can provide COVID19-related consultations.
arXiv Detail & Related papers (2020-05-11T21:23:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.