Auditing large language models: a three-layered approach
- URL: http://arxiv.org/abs/2302.08500v2
- Date: Tue, 27 Jun 2023 07:40:15 GMT
- Title: Auditing large language models: a three-layered approach
- Authors: Jakob Mökander, Jonas Schuett, Hannah Rose Kirk, Luciano Floridi
- Abstract summary: Large language models (LLMs) represent a major advance in artificial intelligence (AI) research.
LLMs are also coupled with significant ethical and social challenges.
Previous research has pointed towards auditing as a promising governance mechanism.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) represent a major advance in artificial
intelligence (AI) research. However, the widespread use of LLMs is also coupled
with significant ethical and social challenges. Previous research has pointed
towards auditing as a promising governance mechanism to help ensure that AI
systems are designed and deployed in ways that are ethical, legal, and
technically robust. However, existing auditing procedures fail to address the
governance challenges posed by LLMs, which display emergent capabilities and
are adaptable to a wide range of downstream tasks. In this article, we address
that gap by outlining a novel blueprint for how to audit LLMs. Specifically, we
propose a three-layered approach, whereby governance audits (of technology
providers that design and disseminate LLMs), model audits (of LLMs after
pre-training but prior to their release), and application audits (of
applications based on LLMs) complement and inform each other. We show how
audits, when conducted in a structured and coordinated manner on all three
levels, can be a feasible and effective mechanism for identifying and managing
some of the ethical and social risks posed by LLMs. However, it is important to
remain realistic about what auditing can reasonably be expected to achieve.
Therefore, we discuss the limitations not only of our three-layered approach
but also of the prospect of auditing LLMs at all. Ultimately, this article
seeks to expand the methodological toolkit available to technology providers
and policymakers who wish to analyse and evaluate LLMs from technical, ethical,
and legal perspectives.
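The three-layered approach can be schematized as a simple mapping from each audit layer to its object and timing, as described in the abstract. This is only an illustrative data-structure sketch of the framework; the field names and the "timing" labels for the governance and application layers are our own assumptions, not terms from the article.

```python
# Illustrative schematization of the article's three audit layers.
# "object" follows the abstract's wording; "timing" labels are assumptions.
AUDIT_LAYERS = {
    "governance": {
        "object": "technology providers that design and disseminate LLMs",
        "timing": "ongoing, at the organisational level",  # assumed label
    },
    "model": {
        "object": "LLMs themselves",
        "timing": "after pre-training but prior to release",  # per the abstract
    },
    "application": {
        "object": "applications based on LLMs",
        "timing": "once the LLM is embedded in a downstream product",  # assumed label
    },
}

def audit_target(layer: str) -> str:
    """Look up what a given audit layer examines."""
    return AUDIT_LAYERS[layer]["object"]
```

The point of the structure is that the three layers are complementary: each targets a different object at a different stage, and findings at one layer inform the others.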
Related papers
- AuditWen: An Open-Source Large Language Model for Audit [20.173039073935907]
This study introduces AuditWen, an open-source audit LLM built by fine-tuning Qwen on a 28k-instruction dataset constructed from 15 audit tasks across 3 layers.
For evaluation, we propose a benchmark of 3k instructions covering a set of critical audit tasks derived from real application scenarios.
The experimental results demonstrate AuditWen's superior performance in both question understanding and answer generation, making it an immediately valuable tool for audit practice.
arXiv Detail & Related papers (2024-10-09T02:28:55Z) - Control Large Language Models via Divide and Conquer [94.48784966256463]
This paper investigates controllable generation for large language models (LLMs) with prompt-based control, focusing on Lexically Constrained Generation (LCG).
We evaluate the performance of LLMs on satisfying lexical constraints with prompt-based control, as well as their efficacy in downstream applications.
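The core criterion in lexically constrained generation is whether a model's output actually contains every required keyword. The sketch below is a generic, case-insensitive version of that check, not the paper's evaluation code; the function names are our own.

```python
# Generic constraint-satisfaction check for lexically constrained generation (LCG).
# Illustrative only: the paper's exact evaluation protocol may differ.

def satisfies_lexical_constraints(text: str, keywords: list) -> bool:
    """Return True if every required keyword appears in the text (case-insensitive)."""
    lowered = text.lower()
    return all(kw.lower() in lowered for kw in keywords)

def constraint_coverage(text: str, keywords: list) -> float:
    """Fraction of required keywords the text contains (1.0 if none are required)."""
    if not keywords:
        return 1.0
    lowered = text.lower()
    hits = sum(kw.lower() in lowered for kw in keywords)
    return hits / len(keywords)
```

A coverage score rather than a pass/fail check is useful when comparing prompt-based control strategies that satisfy constraints only partially.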
arXiv Detail & Related papers (2024-10-06T21:20:06Z) - A Blueprint for Auditing Generative AI [0.9999629695552196]
Generative AI systems display emergent capabilities and are adaptable to a wide range of downstream tasks.
Existing auditing procedures fail to address the governance challenges posed by generative AI systems.
We propose a three-layered approach, whereby governance audits of technology providers that design and disseminate generative AI systems, model audits of generative AI systems after pre-training but prior to their release, and application audits of applications built on top of generative AI systems complement and inform each other.
arXiv Detail & Related papers (2024-07-07T11:56:54Z) - A Reality check of the benefits of LLM in business [1.9181612035055007]
Large language models (LLMs) have achieved remarkable performance in language understanding and generation tasks.
This paper thoroughly examines the usefulness and readiness of LLMs for business processes.
arXiv Detail & Related papers (2024-06-09T02:36:00Z) - Large Language Model in Financial Regulatory Interpretation [0.276240219662896]
This study explores the innovative use of Large Language Models (LLMs) as analytical tools for interpreting complex financial regulations.
The primary objective is to design effective prompts that guide LLMs in distilling verbose and intricate regulatory texts.
This novel approach aims to streamline the implementation of regulatory mandates within the financial reporting and risk management systems of global banking institutions.
arXiv Detail & Related papers (2024-05-10T20:45:40Z) - A Survey on Large Language Models for Critical Societal Domains: Finance, Healthcare, and Law [65.87885628115946]
Large language models (LLMs) are revolutionizing the landscapes of finance, healthcare, and law.
We highlight the instrumental role of LLMs in enhancing diagnostic and treatment methodologies in healthcare, innovating financial analytics, and refining legal interpretation and compliance strategies.
We critically examine the ethics of LLM applications in these fields, pointing out existing ethical concerns and the need for transparent, fair, and robust AI systems.
arXiv Detail & Related papers (2024-05-02T22:43:02Z) - Analyzing and Adapting Large Language Models for Few-Shot Multilingual NLU: Are We There Yet? [82.02076369811402]
Supervised fine-tuning (SFT), supervised instruction tuning (SIT) and in-context learning (ICL) are three alternative, de facto standard approaches to few-shot learning.
We present an extensive and systematic comparison of the three approaches, testing them on six high- and low-resource languages, three NLU tasks, and a myriad of language and domain setups.
Our observations show that supervised instruction tuning has the best trade-off between performance and resource requirements.
arXiv Detail & Related papers (2024-03-04T10:48:13Z) - LLM Inference Unveiled: Survey and Roofline Model Insights [62.92811060490876]
Large Language Model (LLM) inference is rapidly evolving, presenting a unique blend of opportunities and challenges.
Our survey stands out from traditional literature reviews by not only summarizing the current state of research but also by introducing a framework based on the roofline model.
This framework identifies the bottlenecks when deploying LLMs on hardware devices and provides a clear understanding of practical problems.
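The roofline model bounds attainable throughput by whichever is lower: peak compute, or memory bandwidth multiplied by a kernel's arithmetic intensity (FLOPs per byte moved). The sketch below shows that standard calculation; the hardware numbers in the example are illustrative assumptions, not measurements of any real device.

```python
# Minimal roofline-model sketch. Attainable GFLOP/s is capped either by
# peak compute (compute-bound) or by bandwidth * arithmetic intensity
# (memory-bound). Hardware figures below are made-up for illustration.

def attainable_gflops(arithmetic_intensity: float,
                      peak_gflops: float,
                      bandwidth_gbs: float) -> float:
    """Attainable GFLOP/s = min(peak, bandwidth * FLOPs-per-byte)."""
    return min(peak_gflops, bandwidth_gbs * arithmetic_intensity)

# Hypothetical accelerator: 300 GFLOP/s peak, 100 GB/s memory bandwidth.
print(attainable_gflops(1.0, 300.0, 100.0))   # low intensity: memory-bound, 100.0
print(attainable_gflops(10.0, 300.0, 100.0))  # high intensity: compute-bound, 300.0
```

This is why the survey's framework is useful for LLM inference: decode-phase kernels typically have low arithmetic intensity and sit on the memory-bound side of the roofline.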
arXiv Detail & Related papers (2024-02-26T07:33:05Z) - A Comprehensive Evaluation of Large Language Models on Legal Judgment Prediction [60.70089334782383]
Large language models (LLMs) have demonstrated great potential for domain-specific applications.
Recent disputes over GPT-4's law evaluation raise questions concerning their performance in real-world legal tasks.
We design practical baseline solutions based on LLMs and test on the task of legal judgment prediction.
arXiv Detail & Related papers (2023-10-18T07:38:04Z) - How Can Recommender Systems Benefit from Large Language Models: A Survey [82.06729592294322]
Large language models (LLMs) have shown impressive general intelligence and human-like capabilities.
We conduct a comprehensive survey on this research direction from the perspective of the whole pipeline in real-world recommender systems.
arXiv Detail & Related papers (2023-06-09T11:31:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.