Performance Gains of LLMs With Humans in a World of LLMs Versus Humans
- URL: http://arxiv.org/abs/2505.08902v1
- Date: Tue, 13 May 2025 18:44:22 GMT
- Title: Performance Gains of LLMs With Humans in a World of LLMs Versus Humans
- Authors: Lucas McCullum, Pelagie Ami Agassi, Leo Anthony Celi, Daniel K. Ebner, Chrystinne Oliveira Fernandes, Rachel S. Hicklen, Mkliwa Koumbia, Lisa Soleymani Lehmann, David Restrepo, et al.
- Abstract summary: Currently, a considerable research effort is devoted to comparing LLMs to a group of human experts. Without proper safeguards in place, LLMs threaten to harm the established structures for the safe delivery of patient care.
- Score: 1.12376792916275
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Currently, a considerable research effort is devoted to comparing LLMs to a group of human experts, where the term "expert" is often ill-defined or, at best, variable amid constantly updating LLM releases. Without proper safeguards in place, LLMs threaten to harm the established structures for the safe delivery of patient care, which have been carefully developed throughout history to keep patient safety at the forefront. A key driver of LLM innovation is community research effort which, if it continues to operate under "humans versus LLMs" principles, will expedite this trend. Research efforts moving forward must therefore focus on characterizing the safe use of LLMs in clinical settings in ways that persist across the rapid development of novel LLM models. In this communication, we demonstrate that rather than comparing LLMs to humans, there is a need to develop strategies enabling humans to work with LLMs efficiently, in an almost symbiotic manner.
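To ground the abstract's call for symbiotic human-LLM workflows, here is a minimal hypothetical sketch of one such strategy: the LLM drafts first, a clinician always signs off, and low-confidence drafts are withheld so a weak draft cannot anchor the human's judgment. Every name, the confidence field, and the threshold below are our own assumptions, not a mechanism from the paper.

```python
# Hypothetical "humans with LLMs" workflow sketch: the model drafts,
# a clinician always reviews, and uncertain drafts are withheld so they
# cannot anchor human judgment. All names and values are assumptions.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Draft:
    text: str
    confidence: float  # model's self-reported confidence in [0, 1]

def human_with_llm(generate: Callable[[str], Draft],
                   review: Callable[[str, str], str],
                   task: str,
                   threshold: float = 0.8) -> str:
    """Return the clinician-approved answer for `task`."""
    draft = generate(task)
    if draft.confidence >= threshold:
        # Confident draft: clinician edits and signs off on it.
        return review(task, draft.text)
    # Uncertain draft: clinician authors the answer unaided.
    return review(task, "")
```

Withholding weak drafts is one simple guard against the anchoring and deskilling effects raised in the related work below.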
Related papers
- Corrupted by Reasoning: Reasoning Language Models Become Free-Riders in Public Goods Games [87.5673042805229]
How large language models balance self-interest and collective well-being is a critical challenge for ensuring alignment, robustness, and safe deployment. We adapt a public goods game with institutional choice from behavioral economics, allowing us to observe how different LLMs navigate social dilemmas (a toy payoff sketch follows this entry). Surprisingly, we find that reasoning LLMs, such as the o1 series, struggle significantly with cooperation.
arXiv Detail & Related papers (2025-06-29T15:02:47Z)
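To make the game above concrete, here is a minimal sketch of one round of a linear public goods game with an optional sanctioning institution. The endowment, multiplier, and sanction values are illustrative assumptions, not parameters from the paper.

```python
def public_goods_round(contributions, endowment=10.0, multiplier=1.6,
                       sanction=0.0):
    """Return each player's payoff for one round of a linear public
    goods game.

    Each player keeps (endowment - contribution) and receives an equal
    share of the multiplied common pool. Under a sanctioning
    institution, players who contribute nothing pay a flat `sanction`.
    """
    n = len(contributions)
    pool_share = multiplier * sum(contributions) / n
    return [endowment - c + pool_share - (sanction if c == 0 else 0.0)
            for c in contributions]

# One-shot incentives: with multiplier < group size, the free-rider
# out-earns cooperators unless sanctions flip the ordering.
print(public_goods_round([10, 10, 10, 0]))                 # [12.0, 12.0, 12.0, 22.0]
print(public_goods_round([10, 10, 10, 0], sanction=12.0))  # [12.0, 12.0, 12.0, 10.0]
```

With a multiplier below the group size, defection dominates unless sanctions flip the incentive, which is exactly the dilemma the game poses to the models.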
- Bayesian Teaching Enables Probabilistic Reasoning in Large Language Models [50.16340812031201]
We show that large language models (LLMs) do not update their beliefs as expected from the Bayesian framework. We teach the LLMs to reason in a Bayesian manner by training them to mimic the predictions of an optimal Bayesian model (a toy version of such a model follows this entry).
arXiv Detail & Related papers (2025-03-21T20:13:04Z)
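As a hedged illustration of the "optimal Bayesian model" mentioned above, a Beta-Bernoulli updater is about the simplest possible case; the prior and example counts below are our own, not the paper's.

```python
def beta_bernoulli_update(successes: int, failures: int,
                          alpha: float = 1.0, beta: float = 1.0):
    """Return the posterior (alpha, beta) of a Beta-Bernoulli model and
    the posterior predictive probability that the next trial succeeds."""
    a_post, b_post = alpha + successes, beta + failures
    return (a_post, b_post), a_post / (a_post + b_post)

# An ideal Bayesian reporter who saw 7 successes in 10 trials should
# answer ~0.667; (observation, prediction) pairs like this one could
# serve as imitation targets when teaching a model to update correctly.
posterior, p_next = beta_bernoulli_update(7, 3)
print(posterior, round(p_next, 3))  # (8.0, 4.0) 0.667
```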
- Large Language Model Supply Chain: Open Problems From the Security Perspective [25.320736806895976]
Large language models (LLMs) are changing the software development paradigm and have gained huge attention from both academia and industry.
We take the first step toward discussing the potential security risks in each component of the LLM supply chain (SC), as well as in the integration between components (an illustrative integrity-check sketch follows this entry).
arXiv Detail & Related papers (2024-11-03T15:20:21Z)
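The paper surveys risks rather than prescribing fixes; as one hypothetical mitigation for the model-artifact link of the chain, a consumer can pin and verify a digest before loading weights. The file path and expected digest below are placeholders.

```python
# Pin-and-verify sketch for a downloaded model artifact; one possible
# supply-chain safeguard, not a mitigation taken from the paper.
import hashlib
from pathlib import Path

def verify_artifact(path: Path, expected_sha256: str,
                    chunk_size: int = 1 << 20) -> None:
    """Raise if the artifact on disk does not match the pinned digest."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    if digest.hexdigest() != expected_sha256:
        raise RuntimeError(f"integrity check failed for {path}")

# verify_artifact(Path("model.safetensors"), "e3b0c442...")  # pinned hash
```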
- Improve Temporal Awareness of LLMs for Sequential Recommendation [61.723928508200196]
Large language models (LLMs) have demonstrated impressive zero-shot abilities in solving a wide range of general-purpose tasks.
However, LLMs fall short in recognizing and utilizing temporal information, resulting in poor performance on tasks that require an understanding of sequential data.
We propose three prompting strategies to exploit temporal information within historical interactions for LLM-based sequential recommendation (a sketch in this spirit follows this entry).
arXiv Detail & Related papers (2024-05-05T00:21:26Z)
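The abstract does not spell out the three strategies, so the following is only a sketch in their spirit: make timestamps and recency ordering explicit in the prompt. The template wording and example items are our own.

```python
# Temporal-prompting sketch: surface explicit dates and oldest-to-newest
# ordering so the model can exploit recency when recommending.
from datetime import datetime

def temporal_prompt(history: list[tuple[str, datetime]]) -> str:
    """Render (item, timestamp) interactions newest-last, then ask for
    the next recommendation."""
    lines = [
        f"{i + 1}. {item} (interacted on {ts:%Y-%m-%d})"
        for i, (item, ts) in enumerate(sorted(history, key=lambda x: x[1]))
    ]
    return ("The user's interactions, oldest to newest:\n"
            + "\n".join(lines)
            + "\nConsidering how recently each item was used, "
              "recommend the next item.")

history = [("wireless mouse", datetime(2024, 1, 5)),
           ("mechanical keyboard", datetime(2024, 3, 2))]
print(temporal_prompt(history))
```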
- Large Language Models and User Trust: Consequence of Self-Referential Learning Loop and the Deskilling of Healthcare Professionals [1.6574413179773761]
This paper explores the evolving relationship between clinician trust in LLMs and the impact of the shift in training data from predominantly human-generated to AI-generated content.
One of the primary concerns identified is the potential feedback loop that arises as LLMs become more reliant on their own outputs for learning.
A key takeaway from our investigation is the critical role of user expertise and the necessity for a discerning approach to trusting and validating LLM outputs.
arXiv Detail & Related papers (2024-03-15T04:04:45Z)
- When Do LLMs Need Retrieval Augmentation? Mitigating LLMs' Overconfidence Helps Retrieval Augmentation [66.01754585188739]
Large Language Models (LLMs) have been found to have difficulty recognizing when they lack certain knowledge.
Retrieval Augmentation (RA) has been extensively studied to mitigate LLMs' hallucinations.
We propose several methods to enhance LLMs' perception of knowledge boundaries and show that they are effective in reducing overconfidence (a confidence-gating sketch follows this entry).
arXiv Detail & Related papers (2024-02-18T04:57:19Z)
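The abstract does not name the proposed methods, so the following only sketches the general pattern of confidence-gated retrieval augmentation; the `llm` and `retrieve` callables and the threshold are hypothetical stand-ins.

```python
# Confidence-gated retrieval sketch: answer from parametric knowledge
# only when the stated confidence clears a bar, otherwise retrieve first.
from typing import Callable

def answer_with_gate(question: str,
                     llm: Callable[[str], tuple[str, float]],
                     retrieve: Callable[[str], str],
                     threshold: float = 0.7) -> str:
    """Mitigating overconfidence amounts to calibrating the confidence
    this gate reads, so retrieval triggers when it should."""
    answer, confidence = llm(question)
    if confidence >= threshold:
        return answer
    # Low confidence: ground the answer in retrieved evidence.
    context = retrieve(question)
    grounded, _ = llm(f"Context:\n{context}\n\nQuestion: {question}")
    return grounded
```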
- Prevalence and prevention of large language model use in crowd work [11.554258761785512]
We show that the use of large language models (LLMs) is prevalent among crowd workers.
We show that targeted mitigation strategies can significantly reduce, but not eliminate, LLM use.
arXiv Detail & Related papers (2023-10-24T09:52:09Z)
- Deception Abilities Emerged in Large Language Models [0.0]
Large language models (LLMs) are currently at the forefront of intertwining artificial intelligence (AI) systems with human communication and everyday life.
This study reveals that deception strategies emerged in state-of-the-art LLMs, such as GPT-4, but were non-existent in earlier LLMs.
We conduct a series of experiments showing that state-of-the-art LLMs are able to understand and induce false beliefs in other agents.
arXiv Detail & Related papers (2023-07-31T09:27:01Z)
- AI Transparency in the Age of LLMs: A Human-Centered Research Roadmap [46.98582021477066]
The rise of powerful large language models (LLMs) brings about tremendous opportunities for innovation but also looming risks for individuals and society at large.
We have reached a pivotal moment for ensuring that LLMs and LLM-infused applications are developed and deployed responsibly.
It is paramount to pursue new approaches to provide transparency for LLMs.
arXiv Detail & Related papers (2023-06-02T22:51:26Z)
- On Learning to Summarize with Large Language Models as References [101.79795027550959]
Summaries generated by large language models (LLMs) are favored by human annotators over the original reference summaries in commonly used summarization datasets.
We study an LLM-as-reference learning setting for smaller text summarization models to investigate whether their performance can be substantially improved (a schematic training loop follows this entry).
arXiv Detail & Related papers (2023-05-23T16:56:04Z)
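A minimal schematic of the LLM-as-reference setup, assuming the large model's summaries simply replace the dataset's human-written references as supervision; every callable below is a hypothetical placeholder rather than the paper's pipeline.

```python
# LLM-as-reference schematic: the fine-tuning target for a small
# summarizer is the teacher LLM's summary, not the dataset reference.
from typing import Callable, Iterable

def llm_as_reference_pass(documents: Iterable[str],
                          teacher_llm: Callable[[str], str],
                          train_step: Callable[[str, str], float]) -> float:
    """Run one pass of (document, LLM summary) training pairs and
    return the mean per-example loss."""
    losses = []
    for doc in documents:
        # The teacher's summary replaces the human reference as the
        # supervision signal for the small model.
        reference = teacher_llm(f"Summarize:\n{doc}")
        losses.append(train_step(doc, reference))
    return sum(losses) / max(len(losses), 1)
```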
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.