Auditing large language models: a three-layered approach
- URL: http://arxiv.org/abs/2302.08500v2
- Date: Tue, 27 Jun 2023 07:40:15 GMT
- Title: Auditing large language models: a three-layered approach
- Authors: Jakob Mökander, Jonas Schuett, Hannah Rose Kirk, Luciano Floridi
- Abstract summary: Large language models (LLMs) represent a major advance in artificial intelligence (AI) research.
LLMs are also coupled with significant ethical and social challenges.
Previous research has pointed towards auditing as a promising governance mechanism.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) represent a major advance in artificial
intelligence (AI) research. However, the widespread use of LLMs is also coupled
with significant ethical and social challenges. Previous research has pointed
towards auditing as a promising governance mechanism to help ensure that AI
systems are designed and deployed in ways that are ethical, legal, and
technically robust. However, existing auditing procedures fail to address the
governance challenges posed by LLMs, which display emergent capabilities and
are adaptable to a wide range of downstream tasks. In this article, we address
that gap by outlining a novel blueprint for how to audit LLMs. Specifically, we
propose a three-layered approach, whereby governance audits (of technology
providers that design and disseminate LLMs), model audits (of LLMs after
pre-training but prior to their release), and application audits (of
applications based on LLMs) complement and inform each other. We show how
audits, when conducted in a structured and coordinated manner on all three
levels, can be a feasible and effective mechanism for identifying and managing
some of the ethical and social risks posed by LLMs. However, it is important to
remain realistic about what auditing can reasonably be expected to achieve.
Therefore, we discuss the limitations not only of our three-layered approach
but also of the prospect of auditing LLMs at all. Ultimately, this article
seeks to expand the methodological toolkit available to technology providers
and policymakers who wish to analyse and evaluate LLMs from technical, ethical,
and legal perspectives.
Related papers
- A Blueprint for Auditing Generative AI [0.9999629695552196]
Generative AI systems display emergent capabilities and are adaptable to a wide range of downstream tasks.
Existing auditing procedures fail to address the governance challenges posed by generative AI systems.
We propose a three-layered approach comprising governance audits of technology providers that design and disseminate generative AI systems, model audits of generative AI systems after pre-training but prior to their release, and application audits of applications built on top of generative AI systems.
arXiv Detail & Related papers (2024-07-07T11:56:54Z)
- Efficient Prompting for LLM-based Generative Internet of Things [88.84327500311464]
Large language models (LLMs) have demonstrated remarkable capacities on various tasks.
We propose a text-based generative IoT (GIoT) system deployed in the local network setting.
arXiv Detail & Related papers (2024-06-14T19:24:00Z)
- A Reality check of the benefits of LLM in business [1.9181612035055007]
Large language models (LLMs) have achieved remarkable performance in language understanding and generation tasks.
This paper thoroughly examines the usefulness and readiness of LLMs for business processes.
arXiv Detail & Related papers (2024-06-09T02:36:00Z)
- Large Language Model in Financial Regulatory Interpretation [0.276240219662896]
This study explores the innovative use of Large Language Models (LLMs) as analytical tools for interpreting complex financial regulations.
The primary objective is to design effective prompts that guide LLMs in distilling verbose and intricate regulatory texts.
This novel approach aims to streamline the implementation of regulatory mandates within the financial reporting and risk management systems of global banking institutions.
arXiv Detail & Related papers (2024-05-10T20:45:40Z)
- A Survey on Large Language Models for Critical Societal Domains: Finance, Healthcare, and Law [65.87885628115946]
Large language models (LLMs) are revolutionizing the landscapes of finance, healthcare, and law.
We highlight the instrumental role of LLMs in enhancing diagnostic and treatment methodologies in healthcare, innovating financial analytics, and refining legal interpretation and compliance strategies.
We critically examine the ethics for LLM applications in these fields, pointing out the existing ethical concerns and the need for transparent, fair, and robust AI systems.
arXiv Detail & Related papers (2024-05-02T22:43:02Z)
- Analyzing and Adapting Large Language Models for Few-Shot Multilingual NLU: Are We There Yet? [82.02076369811402]
Supervised fine-tuning (SFT), supervised instruction tuning (SIT) and in-context learning (ICL) are three alternative, de facto standard approaches to few-shot learning.
We present an extensive and systematic comparison of the three approaches, testing them on 6 high- and low-resource languages, three different NLU tasks, and a myriad of language and domain setups.
Our observations show that supervised instruction tuning has the best trade-off between performance and resource requirements.
arXiv Detail & Related papers (2024-03-04T10:48:13Z)
- LLM Inference Unveiled: Survey and Roofline Model Insights [62.92811060490876]
Large Language Model (LLM) inference is rapidly evolving, presenting a unique blend of opportunities and challenges.
Our survey stands out from traditional literature reviews by not only summarizing the current state of research but also introducing a framework based on the roofline model.
This framework identifies the bottlenecks when deploying LLMs on hardware devices and provides a clear understanding of practical problems.
arXiv Detail & Related papers (2024-02-26T07:33:05Z)
- A Comprehensive Evaluation of Large Language Models on Legal Judgment Prediction [60.70089334782383]
Large language models (LLMs) have demonstrated great potential for domain-specific applications.
Recent disputes over GPT-4's law evaluation raise questions concerning their performance in real-world legal tasks.
We design practical baseline solutions based on LLMs and test on the task of legal judgment prediction.
arXiv Detail & Related papers (2023-10-18T07:38:04Z)
- PRISMA-DFLLM: An Extension of PRISMA for Systematic Literature Reviews using Domain-specific Finetuned Large Language Models [0.0]
This paper proposes an AI-enabled methodological framework that combines the power of Large Language Models (LLMs) with the rigorous reporting guidelines of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA).
By finetuning LLMs on domain-specific academic papers that have been selected as a result of a rigorous SLR process, the proposed PRISMA-DFLLM reporting guidelines offer the potential to achieve greater efficiency, reusability and scalability.
arXiv Detail & Related papers (2023-06-15T02:52:50Z)
- How Can Recommender Systems Benefit from Large Language Models: A Survey [82.06729592294322]
Large language models (LLMs) have shown impressive general intelligence and human-like capabilities.
We conduct a comprehensive survey on this research direction from the perspective of the whole pipeline in real-world recommender systems.
arXiv Detail & Related papers (2023-06-09T11:31:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.