Walking a Tightrope -- Evaluating Large Language Models in High-Risk
Domains
- URL: http://arxiv.org/abs/2311.14966v1
- Date: Sat, 25 Nov 2023 08:58:07 GMT
- Title: Walking a Tightrope -- Evaluating Large Language Models in High-Risk
Domains
- Authors: Chia-Chien Hung, Wiem Ben Rim, Lindsay Frost, Lars Bruckner, Carolin
Lawrence
- Abstract summary: High-risk domains pose unique challenges that require language models to provide accurate and safe responses.
Despite the great success of large language models (LLMs), their performance in high-risk domains remains unclear.
- Score: 15.320563604087246
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: High-risk domains pose unique challenges that require language models to
provide accurate and safe responses. Despite the great success of large
language models (LLMs), such as ChatGPT and its variants, their performance in
high-risk domains remains unclear. Our study delves into an in-depth analysis
of the performance of instruction-tuned LLMs, focusing on factual accuracy and
safety adherence. To comprehensively assess the capabilities of LLMs, we
conduct experiments on six NLP datasets including question answering and
summarization tasks within two high-risk domains: legal and medical. Further
qualitative analysis highlights the existing limitations inherent in current
LLMs when evaluating in high-risk domains. This underscores the essential
nature of not only improving LLM capabilities but also prioritizing the
refinement of domain-specific metrics, and embracing a more human-centric
approach to enhance safety and factual reliability. Our findings advance the
field toward the concerns of properly evaluating LLMs in high-risk domains,
aiming to steer the adaptability of LLMs in fulfilling societal obligations and
aligning with forthcoming regulations, such as the EU AI Act.
Related papers
- A Survey on Responsible LLMs: Inherent Risk, Malicious Use, and Mitigation Strategy [31.839815402460918]
Large language models (LLMs) present significant potential for supporting numerous real-world applications.
They still face significant challenges in terms of the inherent risk of privacy leakage, hallucinated outputs, and value misalignment.
arXiv Detail & Related papers (2025-01-16T09:59:45Z) - Large Language Model Safety: A Holistic Survey [35.42419096859496]
The rapid development and deployment of large language models (LLMs) have introduced a new frontier in artificial intelligence.
This survey provides a comprehensive overview of the current landscape of LLM safety, covering four major categories: value misalignment, robustness to adversarial attacks, misuse, and autonomous AI risks.
arXiv Detail & Related papers (2024-12-23T16:11:27Z) - UAlign: Leveraging Uncertainty Estimations for Factuality Alignment on Large Language Models [41.67393607081513]
Large Language Models (LLMs) often struggle to accurately express the factual knowledge they possess.
We propose the UAlign framework, which leverages Uncertainty estimations to represent knowledge boundaries.
We show that the proposed UAlign can significantly enhance the LLMs' capacities to confidently answer known questions.
arXiv Detail & Related papers (2024-12-16T14:14:27Z) - Navigating the Risks: A Survey of Security, Privacy, and Ethics Threats in LLM-Based Agents [67.07177243654485]
This survey collects and analyzes the different threats faced by large language models-based agents.
We identify six key features of LLM-based agents, based on which we summarize the current research progress.
We select four representative agents as case studies to analyze the risks they may face in practical use.
arXiv Detail & Related papers (2024-11-14T15:40:04Z) - Quantifying Risk Propensities of Large Language Models: Ethical Focus and Bias Detection through Role-Play [0.43512163406552007]
As Large Language Models (LLMs) become more prevalent, concerns about their safety, ethics, and potential biases have risen.
This study innovatively applies the Domain-Specific Risk-Taking (DOSPERT) scale from cognitive science to LLMs.
We propose a novel Ethical Decision-Making Risk Attitude Scale (EDRAS) to assess LLMs' ethical risk attitudes in depth.
arXiv Detail & Related papers (2024-10-26T15:55:21Z) - SafeBench: A Safety Evaluation Framework for Multimodal Large Language Models [75.67623347512368]
We propose toolns, a comprehensive framework designed for conducting safety evaluations of MLLMs.
Our framework consists of a comprehensive harmful query dataset and an automated evaluation protocol.
Based on our framework, we conducted large-scale experiments on 15 widely-used open-source MLLMs and 6 commercial MLLMs.
arXiv Detail & Related papers (2024-10-24T17:14:40Z) - A Survey on Large Language Models for Critical Societal Domains: Finance, Healthcare, and Law [65.87885628115946]
Large language models (LLMs) are revolutionizing the landscapes of finance, healthcare, and law.
We highlight the instrumental role of LLMs in enhancing diagnostic and treatment methodologies in healthcare, innovating financial analytics, and refining legal interpretation and compliance strategies.
We critically examine the ethics for LLM applications in these fields, pointing out the existing ethical concerns and the need for transparent, fair, and robust AI systems.
arXiv Detail & Related papers (2024-05-02T22:43:02Z) - Unveiling the Misuse Potential of Base Large Language Models via In-Context Learning [61.2224355547598]
Open-sourcing of large language models (LLMs) accelerates application development, innovation, and scientific progress.
Our investigation exposes a critical oversight in this belief.
By deploying carefully designed demonstrations, our research demonstrates that base LLMs could effectively interpret and execute malicious instructions.
arXiv Detail & Related papers (2024-04-16T13:22:54Z) - Prioritizing Safeguarding Over Autonomy: Risks of LLM Agents for Science [65.77763092833348]
Intelligent agents powered by large language models (LLMs) have demonstrated substantial promise in autonomously conducting experiments and facilitating scientific discoveries across various disciplines.
While their capabilities are promising, these agents also introduce novel vulnerabilities that demand careful consideration for safety.
This paper conducts a thorough examination of vulnerabilities in LLM-based agents within scientific domains, shedding light on potential risks associated with their misuse and emphasizing the need for safety measures.
arXiv Detail & Related papers (2024-02-06T18:54:07Z) - A Formalism and Approach for Improving Robustness of Large Language
Models Using Risk-Adjusted Confidence Scores [4.043005183192123]
Large Language Models (LLMs) have achieved impressive milestones in natural language processing (NLP)
Despite their impressive performance, the models are known to pose important risks.
We define and formalize two distinct types of risk: decision risk and composite risk.
arXiv Detail & Related papers (2023-10-05T03:20:41Z) - Safety Assessment of Chinese Large Language Models [51.83369778259149]
Large language models (LLMs) may generate insulting and discriminatory content, reflect incorrect social values, and may be used for malicious purposes.
To promote the deployment of safe, responsible, and ethical AI, we release SafetyPrompts including 100k augmented prompts and responses by LLMs.
arXiv Detail & Related papers (2023-04-20T16:27:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.