Scorecards for Synthetic Medical Data Evaluation and Reporting
- URL: http://arxiv.org/abs/2406.11143v1
- Date: Mon, 17 Jun 2024 02:11:59 GMT
- Title: Scorecards for Synthetic Medical Data Evaluation and Reporting
- Authors: Ghada Zamzmi, Adarsh Subbaswamy, Elena Sizikova, Edward Margerrison, Jana Delfino, Aldo Badano
- Abstract summary: The growing utilization of synthetic medical data (SMD) in training and testing AI-driven tools in healthcare requires a systematic framework for assessing its quality.
Here, we outline an evaluation framework designed to meet the unique requirements of medical applications.
We introduce the concept of scorecards, which can serve as comprehensive reports that accompany artificially generated datasets.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The growing utilization of synthetic medical data (SMD) in training and testing AI-driven tools in healthcare necessitates a systematic framework for assessing SMD quality. The current lack of a standardized methodology to evaluate SMD, particularly in terms of its applicability in various medical scenarios, is a significant hindrance to its broader acceptance and utilization in healthcare applications. Here, we outline an evaluation framework designed to meet the unique requirements of medical applications, and introduce the concept of SMD scorecards, which can serve as comprehensive reports that accompany artificially generated datasets. This can help standardize evaluation and enable SMD developers to assess and further enhance the quality of SMDs by identifying areas in need of attention and ensuring that the synthetic data more accurately approximate patient data.
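The paper defines scorecards conceptually, as comprehensive reports accompanying generated datasets, rather than as a fixed schema. As a minimal sketch of how such a report could be serialized (every field name below is an illustrative assumption, not the authors' specification):

```python
from dataclasses import dataclass, field, asdict
import json

@dataclass
class SMDScorecard:
    """Illustrative scorecard accompanying a synthetic medical dataset.

    Field names are hypothetical; the paper describes scorecards as
    comprehensive reports, not as a concrete schema.
    """
    dataset_name: str
    generator: str                    # model/method used to synthesize the data
    intended_use: str                 # medical scenario the data targets
    fidelity_metrics: dict = field(default_factory=dict)   # e.g. distribution distances
    utility_metrics: dict = field(default_factory=dict)    # e.g. downstream task scores
    privacy_metrics: dict = field(default_factory=dict)    # e.g. re-identification risk
    known_limitations: list = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize the scorecard so it can ship alongside the dataset."""
        return json.dumps(asdict(self), indent=2)

card = SMDScorecard(
    dataset_name="synthetic-cxr-demo",
    generator="diffusion-model-v1",
    intended_use="classifier pretraining",
    fidelity_metrics={"fid": 12.3},
    known_limitations=["rare pathologies under-represented"],
)
print(card.to_json())
```

A structured report like this would let SMD developers spot areas in need of attention (e.g. an empty `privacy_metrics` section) before the dataset is used downstream.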
Related papers
- Scoring Verifiers: Evaluating Synthetic Verification in Code and Reasoning
We introduce benchmarks designed to evaluate the impact of synthetic verification methods on assessing solution correctness.
We analyze synthetic verification methods in standard, reasoning-based, and reward-based LLMs.
Our results show that recent reasoning models significantly improve test case generation and that scaling test cases enhances verification accuracy.
arXiv Detail & Related papers (2025-02-19T15:32:11Z)
- An Integrated Approach to AI-Generated Content in e-health
We propose an end-to-end class-conditioned framework to generate synthetic medical images and text data.
Our framework integrates Diffusion and Large Language Models (LLMs) to generate data that closely match real-world patterns.
arXiv Detail & Related papers (2025-01-18T14:35:29Z)
- RaTEScore: A Metric for Radiology Report Generation
This paper introduces a novel entity-aware metric, Radiological Report (Text) Evaluation (RaTEScore).
RaTEScore emphasizes crucial medical entities such as diagnostic outcomes and anatomical details, and is robust against complex medical synonyms and sensitive to negation expressions.
Our evaluations demonstrate that RaTEScore aligns more closely with human preference than existing metrics, validated both on established public benchmarks and our newly proposed RaTE-Eval benchmark.
arXiv Detail & Related papers (2024-06-24T17:49:28Z)
- A Multi-Faceted Evaluation Framework for Assessing Synthetic Data Generated by Large Language Models
Generative AI and large language models (LLMs) have opened up new avenues for producing synthetic data.
Despite the potential benefits, concerns regarding privacy leakage have surfaced.
We introduce SynEval, an open-source evaluation framework designed to assess the fidelity, utility, and privacy preservation of synthetically generated tabular data.
arXiv Detail & Related papers (2024-04-20T08:08:28Z)
- The METRIC-framework for assessing data quality for trustworthy AI in medicine: a systematic review
Development of trustworthy AI is especially important in medicine.
We focus on the importance of data quality (training/test) in deep learning (DL)
We propose the METRIC-framework, a specialised data quality framework for medical training data.
arXiv Detail & Related papers (2024-02-21T09:15:46Z)
- Can I trust my fake data -- A comprehensive quality assessment framework for synthetic tabular data in healthcare
In response to privacy concerns and regulatory requirements, using synthetic data has been suggested.
We present a conceptual framework for quality assurance of SD for AI applications in healthcare.
We propose stages necessary to support real-life applications.
arXiv Detail & Related papers (2024-01-24T08:14:20Z)
- TREEMENT: Interpretable Patient-Trial Matching via Personalized Dynamic Tree-Based Memory Network
Clinical trials are critical for drug development but often suffer from expensive and inefficient patient recruitment.
In recent years, machine learning models have been proposed for speeding up patient recruitment via automatically matching patients with clinical trials.
We introduce a dynamic tree-based memory network model named TREEMENT to provide accurate and interpretable patient-trial matching.
arXiv Detail & Related papers (2023-07-19T12:35:09Z)
- Evaluation of the Synthetic Electronic Health Records
This work outlines two metrics called Similarity and Uniqueness for sample-wise assessment of synthetic datasets.
We demonstrate the proposed notions with several state-of-the-art generative models to synthesise Cystic Fibrosis (CF) patients' electronic health records.
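The paper's exact Similarity and Uniqueness formulas are not reproduced in this summary; as a rough illustration of sample-wise scoring in that spirit (the distance choice and function names below are assumptions, not the authors' definitions), one might compute nearest-neighbour distances:

```python
import math

def sample_scores(synthetic, real):
    """Per-sample proxies for Similarity and Uniqueness.

    For each synthetic record: distance to the nearest real record
    (lower = more similar to real data) and distance to the nearest
    *other* synthetic record (higher = more unique, less duplication).
    Euclidean distance is an illustrative choice, not the paper's metric;
    assumes at least two synthetic records and one real record.
    """
    sims, uniqs = [], []
    for i, s in enumerate(synthetic):
        sims.append(min(math.dist(s, r) for r in real))
        uniqs.append(min(math.dist(s, t) for j, t in enumerate(synthetic) if j != i))
    return sims, uniqs

sims, uniqs = sample_scores([(0.0, 0.0), (3.0, 4.0)], [(0.0, 1.0)])
```

A synthetic sample that sits very close to a real record while far from other synthetic records may signal memorization, which is the kind of per-sample statistic a scorecard could surface.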
arXiv Detail & Related papers (2022-10-16T22:46:08Z)
- MedPerf: Open Benchmarking Platform for Medical Artificial Intelligence using Federated Evaluation
We argue that unlocking this potential requires a systematic way to measure the performance of medical AI models on large-scale heterogeneous data.
We are building MedPerf, an open framework for benchmarking machine learning in the medical domain.
arXiv Detail & Related papers (2021-09-29T18:09:41Z)
- Privacy-preserving medical image analysis
We present PriMIA, a software framework designed for privacy-preserving machine learning (PPML) in medical imaging.
We show significantly better classification performance of a securely aggregated federated learning model compared to human experts on unseen datasets.
We empirically evaluate the framework's security against a gradient-based model inversion attack.
arXiv Detail & Related papers (2020-12-10T13:56:00Z)
- Semi-supervised Medical Image Classification with Relation-driven Self-ensembling Model
We present a relation-driven semi-supervised framework for medical image classification.
It exploits the unlabeled data by encouraging the prediction consistency of given input under perturbations.
Our method outperforms many state-of-the-art semi-supervised learning methods on both single-label and multi-label image classification scenarios.
arXiv Detail & Related papers (2020-05-15T06:57:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.