Related papers: JingFang: A Traditional Chinese Medicine Large Language Model of Expert-Level Medical Diagnosis and Syndrome Differentiation-Based Treatment

MAM: Modular Multi-Agent Framework for Multi-Modal Medical Diagnosis via Role-Specialized Collaboration [57.98393950821579]
We introduce the Modular Multi-Agent Framework for Multi-Modal Medical Diagnosis (MAM). Inspired by our empirical findings, MAM decomposes the medical diagnostic process into specialized roles: a General Practitioner, Specialist Team, Radiologist, Medical Assistant, and Director. This modular and collaborative framework enables efficient knowledge updates and leverages existing medical LLMs and knowledge bases.
arXiv Detail & Related papers (2025-06-24T17:52:43Z)
Silence is Not Consensus: Disrupting Agreement Bias in Multi-Agent LLMs via Catfish Agent for Clinical Decision Making [80.94208848596215]
We present a new concept called Catfish Agent, a role-specialized LLM designed to inject structured dissent and counter silent agreement. Inspired by the "catfish effect" in organizational psychology, the Catfish Agent is designed to challenge emerging consensus to stimulate deeper reasoning.
arXiv Detail & Related papers (2025-05-27T17:59:50Z)
Tianyi: A Traditional Chinese Medicine all-rounder language model and its Real-World Clinical Practice [15.020917068333237]
Tianyi is designed to assimilate interconnected and systematic TCM knowledge in a progressive learning manner. Extensive evaluations demonstrate the significant potential of Tianyi as an AI assistant in TCM clinical practice and research.
arXiv Detail & Related papers (2025-05-19T14:17:37Z)
MedCT: A Clinical Terminology Graph for Generative AI Applications in Healthcare [5.253666246682483]
We introduce the world's first clinical terminology for the Chinese healthcare community, namely MedCT. The MedCT system enables standardized and programmable representation of Chinese clinical data. We present our approach in sufficient engineering detail, such that implementing a clinical terminology for other non-English societies should be readily reproducible.
arXiv Detail & Related papers (2025-01-11T07:35:51Z)
RareAgents: Autonomous Multi-disciplinary Team for Rare Disease Diagnosis and Treatment [13.330661181655493]
Rare diseases collectively impact around 300 million people worldwide due to the huge number of diseases. Recently, agents powered by large language models (LLMs) have demonstrated notable improvements across various domains. RareAgents integrates advanced planning capabilities, memory mechanisms, and medical tools utilization, leveraging Llama-3.1-8B/70B as the base model.
arXiv Detail & Related papers (2024-12-17T02:22:24Z)
MedChain: Bridging the Gap Between LLM Agents and Clinical Practice through Interactive Sequential Benchmarking [58.25862290294702]
We present MedChain, a dataset of 12,163 clinical cases that covers five key stages of clinical workflow. We also propose MedChain-Agent, an AI system that integrates a feedback mechanism and an MCase-RAG module to learn from previous cases and adapt its responses.
arXiv Detail & Related papers (2024-12-02T15:25:02Z)
BianCang: A Traditional Chinese Medicine Large Language Model [22.582027277167047]
BianCang is a TCM-specific large language model (LLM) that first injects domain-specific knowledge and then aligns it through targeted stimulation. We constructed pre-training corpora, instruction-aligned datasets based on real hospital records, and the ChP-TCM dataset derived from the Pharmacopoeia of the People's Republic of China. We compiled extensive TCM and medical corpora for continuous pre-training and supervised fine-tuning, building a comprehensive dataset to refine the model's understanding of TCM.
arXiv Detail & Related papers (2024-11-17T10:17:01Z)
Intelligent Understanding of Large Language Models in Traditional Chinese Medicine Based on Prompt Engineering Framework [3.990633038739491]
We propose TCM-Prompt, a framework that integrates various pre-trained language models (PLMs), templates, tokenization, and verbalization methods. We conducted experiments on disease classification, syndrome identification, herbal medicine recommendation, and general NLP tasks.
arXiv Detail & Related papers (2024-10-25T10:24:30Z)
CliMedBench: A Large-Scale Chinese Benchmark for Evaluating Medical Large Language Models in Clinical Scenarios [50.032101237019205]
CliMedBench is a comprehensive benchmark with 14 expert-guided core clinical scenarios. The reliability of this benchmark has been confirmed in several ways.
arXiv Detail & Related papers (2024-10-04T15:15:36Z)
Building a Chinese Medical Dialogue System: Integrating Large-scale Corpora and Novel Models [2.04367431902848]
The COVID-19 pandemic underscored major deficiencies in traditional healthcare systems, hastening the advancement of online medical services. Existing studies face two main challenges. First, large-scale, publicly available, domain-specific medical datasets are scarce due to privacy concerns. Second, existing methods lack medical knowledge and struggle to accurately understand professional terms and expressions in patient-doctor consultations.
arXiv Detail & Related papers (2024-09-27T00:01:32Z)
RuleAlign: Making Large Language Models Better Physicians with Diagnostic Rule Alignment [54.91736546490813]
We introduce the RuleAlign framework, designed to align Large Language Models with specific diagnostic rules. We develop a medical dialogue dataset comprising rule-based communications between patients and physicians. Experimental results demonstrate the effectiveness of the proposed approach.
arXiv Detail & Related papers (2024-08-22T17:44:40Z)
CoMT: Chain-of-Medical-Thought Reduces Hallucination in Medical Report Generation [20.59298361626719]
We propose a chain-of-medical-thought approach (CoMT) to mitigate hallucinations in medical report generation. CoMT intends to imitate the cognitive process of human doctors by decomposing diagnostic procedures.
arXiv Detail & Related papers (2024-06-17T12:03:32Z)
CliBench: A Multifaceted and Multigranular Evaluation of Large Language Models for Clinical Decision Making [16.310913127940857]
We introduce CliBench, a novel benchmark developed from the MIMIC IV dataset. This benchmark offers a comprehensive and realistic assessment of LLMs' capabilities in clinical diagnosis. We conduct a zero-shot evaluation of leading LLMs to assess their proficiency in clinical decision-making.
arXiv Detail & Related papers (2024-06-14T11:10:17Z)
MedKP: Medical Dialogue with Knowledge Enhancement and Clinical Pathway Encoding [48.348511646407026]
We introduce the Medical dialogue with Knowledge enhancement and clinical Pathway encoding (MedKP) framework. The framework integrates an external knowledge enhancement module through a medical knowledge graph and an internal clinical pathway encoding via medical entities and physician actions.
arXiv Detail & Related papers (2024-03-11T10:57:45Z)
Guiding Clinical Reasoning with Large Language Models via Knowledge Seeds [32.99251005719732]
Clinical reasoning refers to the cognitive process that physicians employ in evaluating and managing patients. In this study, we introduce a novel framework, In-Context Padding (ICP), designed to enhance LLMs with medical knowledge.
arXiv Detail & Related papers (2024-03-11T10:53:20Z)
A Spectrum Evaluation Benchmark for Medical Multi-Modal Large Language Models [57.88111980149541]
We introduce Asclepius, a novel Med-MLLM benchmark that assesses Med-MLLMs in terms of distinct medical specialties and different diagnostic capacities. Grounded in 3 proposed core principles, Asclepius ensures a comprehensive evaluation by encompassing 15 medical specialties. We also provide an in-depth analysis of 6 Med-MLLMs and compare them with 3 human specialists.
arXiv Detail & Related papers (2024-02-17T08:04:23Z)
AI Hospital: Benchmarking Large Language Models in a Multi-agent Medical Interaction Simulator [69.51568871044454]
We introduce AI Hospital, a framework simulating dynamic medical interactions between Doctor as player and NPCs. This setup allows for realistic assessments of LLMs in clinical scenarios. We develop the Multi-View Medical Evaluation benchmark, utilizing high-quality Chinese medical records and NPCs.
arXiv Detail & Related papers (2024-02-15T06:46:48Z)
Beyond Direct Diagnosis: LLM-based Multi-Specialist Agent Consultation for Automatic Diagnosis [30.943705201552643]
We propose a framework to model the diagnosis process in the real world by adaptively fusing probability distributions of agents over potential diseases. Our approach requires significantly less parameter updating and training time, enhancing efficiency and practical utility.
arXiv Detail & Related papers (2024-01-29T12:25:30Z)
MedBench: A Large-Scale Chinese Benchmark for Evaluating Medical Large Language Models [56.36916128631784]
We introduce MedBench, a comprehensive benchmark for the Chinese medical domain. This benchmark is composed of four key components: the Chinese Medical Licensing Examination, the Resident Standardization Training Examination, the Doctor In-Charge Qualification Examination, and real-world clinic cases. We perform extensive experiments and conduct an in-depth analysis from diverse perspectives.
arXiv Detail & Related papers (2023-12-20T07:01:49Z)
ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human Preferences [51.66185471742271]
We propose ChiMed-GPT, a benchmark LLM designed explicitly for the Chinese medical domain. ChiMed-GPT undergoes a comprehensive training regime with pre-training, SFT, and RLHF. We analyze possible biases by prompting ChiMed-GPT to complete attitude scales regarding discrimination against patients.
arXiv Detail & Related papers (2023-11-10T12:25:32Z)

This list is automatically generated from the titles and abstracts of the papers on this site.