The Ethics of ChatGPT in Medicine and Healthcare: A Systematic Review on Large Language Models (LLMs)
- URL: http://arxiv.org/abs/2403.14473v1
- Date: Thu, 21 Mar 2024 15:20:07 GMT
- Title: The Ethics of ChatGPT in Medicine and Healthcare: A Systematic Review on Large Language Models (LLMs)
- Authors: Joschka Haltaufderheide, Robert Ranisch,
- Abstract summary: ChatGPT, Large Language Models (LLMs) have received enormous attention in healthcare.
Despite their potential benefits, researchers have underscored various ethical implications.
This work aims to map the ethical landscape surrounding the current stage of deployment of LLMs in medicine and healthcare.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the introduction of ChatGPT, Large Language Models (LLMs) have received enormous attention in healthcare. Despite their potential benefits, researchers have underscored various ethical implications. While individual instances have drawn much attention, the debate lacks a systematic overview of practical applications currently researched and ethical issues connected to them. Against this background, this work aims to map the ethical landscape surrounding the current stage of deployment of LLMs in medicine and healthcare. Electronic databases and preprint servers were queried using a comprehensive search strategy. Studies were screened and extracted following a modified rapid review approach. Methodological quality was assessed using a hybrid approach. For 53 records, a meta-aggregative synthesis was performed. Four fields of applications emerged and testify to a vivid exploration phase. Advantages of using LLMs are attributed to their capacity in data analysis, personalized information provisioning, support in decision-making, mitigating information loss and enhancing information accessibility. However, we also identifies recurrent ethical concerns connected to fairness, bias, non-maleficence, transparency, and privacy. A distinctive concern is the tendency to produce harmful misinformation or convincingly but inaccurate content. A recurrent plea for ethical guidance and human oversight is evident. Given the variety of use cases, it is suggested that the ethical guidance debate be reframed to focus on defining what constitutes acceptable human oversight across the spectrum of applications. This involves considering diverse settings, varying potentials for harm, and different acceptable thresholds for performance and certainty in healthcare. In addition, a critical inquiry is necessary to determine the extent to which the current experimental use of LLMs is necessary and justified.
Related papers
- "It's a conversation, not a quiz": A Risk Taxonomy and Reflection Tool for LLM Adoption in Public Health [16.418366314356184]
We conduct focus groups with health professionals and health issue experiencers to unpack their concerns.
We synthesize participants' perspectives into a risk taxonomy.
This taxonomy highlights four dimensions of risk in individual behaviors, human-centered care, information ecosystem, and technology accountability.
arXiv Detail & Related papers (2024-11-04T20:35:10Z) - Towards Fairer Health Recommendations: finding informative unbiased samples via Word Sense Disambiguation [3.328297368052458]
We tackle bias detection in medical curricula using NLP models, including LLMs.
We evaluate them on a gold standard dataset containing 4,105 excerpts annotated by medical experts for bias from a large corpus.
arXiv Detail & Related papers (2024-09-11T17:10:20Z) - Eagle: Ethical Dataset Given from Real Interactions [74.7319697510621]
We create datasets extracted from real interactions between ChatGPT and users that exhibit social biases, toxicity, and immoral problems.
Our experiments show that Eagle captures complementary aspects, not covered by existing datasets proposed for evaluation and mitigation of such ethical challenges.
arXiv Detail & Related papers (2024-02-22T03:46:02Z) - A Comprehensive Survey of Hallucination Mitigation Techniques in Large
Language Models [7.705767540805267]
Large Language Models (LLMs) continue to advance in their ability to write human-like text.
A key challenge remains around their tendency to hallucinate generating content that appears factual but is ungrounded.
This paper presents a survey of over 32 techniques developed to mitigate hallucination in LLMs.
arXiv Detail & Related papers (2024-01-02T17:56:30Z) - Clairvoyance: A Pipeline Toolkit for Medical Time Series [95.22483029602921]
Time-series learning is the bread and butter of data-driven *clinical decision support*
Clairvoyance proposes a unified, end-to-end, autoML-friendly pipeline that serves as a software toolkit.
Clairvoyance is the first to demonstrate viability of a comprehensive and automatable pipeline for clinical time-series ML.
arXiv Detail & Related papers (2023-10-28T12:08:03Z) - Bias and Fairness in Large Language Models: A Survey [73.87651986156006]
We present a comprehensive survey of bias evaluation and mitigation techniques for large language models (LLMs)
We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing.
We then unify the literature by proposing three intuitive, two for bias evaluation, and one for mitigation.
arXiv Detail & Related papers (2023-09-02T00:32:55Z) - Self-Diagnosis and Large Language Models: A New Front for Medical
Misinformation [8.738092015092207]
We evaluate the capabilities of large language models (LLMs) from the lens of a general user self-diagnosing.
We develop a testing methodology which can be used to evaluate responses to open-ended questions mimicking real-world use cases.
We reveal that a) these models perform worse than previously known, and b) they exhibit peculiar behaviours, including overconfidence when stating incorrect recommendations.
arXiv Detail & Related papers (2023-07-10T21:28:26Z) - Self-Verification Improves Few-Shot Clinical Information Extraction [73.6905567014859]
Large language models (LLMs) have shown the potential to accelerate clinical curation via few-shot in-context learning.
They still struggle with issues regarding accuracy and interpretability, especially in mission-critical domains such as health.
Here, we explore a general mitigation framework using self-verification, which leverages the LLM to provide provenance for its own extraction and check its own outputs.
arXiv Detail & Related papers (2023-05-30T22:05:11Z) - Privacy-preserving medical image analysis [53.4844489668116]
We present PriMIA, a software framework designed for privacy-preserving machine learning (PPML) in medical imaging.
We show significantly better classification performance of a securely aggregated federated learning model compared to human experts on unseen datasets.
We empirically evaluate the framework's security against a gradient-based model inversion attack.
arXiv Detail & Related papers (2020-12-10T13:56:00Z) - Towards Ethics by Design in Online Abusive Content Detection [7.163723138100273]
The research effort has spread out across several closely related sub-areas, such as detection of hate speech, toxicity, cyberbullying, etc.
We bring ethical issues to forefront and propose a unified framework as a two-step process.
The novel framework is guided by the Ethics by Design principle and is a step towards building more accurate and trusted models.
arXiv Detail & Related papers (2020-10-28T13:10:24Z) - Assessing the Severity of Health States based on Social Media Posts [62.52087340582502]
We propose a multiview learning framework that models both the textual content as well as contextual-information to assess the severity of the user's health state.
The diverse NLU views demonstrate its effectiveness on both the tasks and as well as on the individual disease to assess a user's health.
arXiv Detail & Related papers (2020-09-21T03:45:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.