Large Language Model Alignment: A Survey
- URL: http://arxiv.org/abs/2309.15025v1
- Date: Tue, 26 Sep 2023 15:49:23 GMT
- Title: Large Language Model Alignment: A Survey
- Authors: Tianhao Shen, Renren Jin, Yufei Huang, Chuang Liu, Weilong Dong,
Zishan Guo, Xinwei Wu, Yan Liu, Deyi Xiong
- Abstract summary: The potential of large language models (LLMs) is undeniably vast; however, they may yield texts that are imprecise, misleading, or even detrimental.
This survey endeavors to furnish an extensive exploration of alignment methodologies designed for LLMs.
We also probe into salient issues, including model interpretability and potential vulnerabilities to adversarial attacks.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Recent years have witnessed remarkable progress made in large language models
(LLMs). Such advancements, while garnering significant attention, have
concurrently elicited various concerns. The potential of these models is
undeniably vast; however, they may yield texts that are imprecise, misleading,
or even detrimental. Consequently, it becomes paramount to employ alignment
techniques to ensure that these models exhibit behaviors consistent with human
values.
This survey endeavors to furnish an extensive exploration of alignment
methodologies designed for LLMs, in conjunction with the extant capability
research in this domain. Adopting the lens of AI alignment, we categorize the
prevailing methods and emergent proposals for the alignment of LLMs into outer
and inner alignment. We also probe into salient issues, including model
interpretability and potential vulnerabilities to adversarial attacks. To
assess LLM alignment, we present a wide variety of benchmarks and evaluation
methodologies. After discussing the state of alignment research for LLMs, we
finally cast a vision toward the future, contemplating the promising avenues of
research that lie ahead.
Our aspiration for this survey extends beyond merely spurring research
interest in this realm. We also envision bridging the gap between the AI
alignment research community and researchers focused on the capability
exploration of LLMs, in pursuit of LLMs that are both capable and safe.
Related papers
- From Linguistic Giants to Sensory Maestros: A Survey on Cross-Modal Reasoning with Large Language Models [56.9134620424985]
Cross-modal reasoning (CMR) is increasingly recognized as a crucial capability in the progression toward more sophisticated artificial intelligence systems.
The recent trend of deploying Large Language Models (LLMs) to tackle CMR tasks has marked a new mainstream of approaches for enhancing their effectiveness.
This survey offers a nuanced exposition of current methodologies applied in CMR using LLMs, classifying them into a detailed three-tiered taxonomy.
arXiv Detail & Related papers (2024-09-19T02:51:54Z)
- Large Language Models for Anomaly and Out-of-Distribution Detection: A Survey [18.570066068280212]
Large Language Models (LLMs) have demonstrated their effectiveness not only in natural language processing but also in broader applications.
This survey focuses on the problem of anomaly and out-of-distribution (OOD) detection in the context of LLMs.
We propose a new taxonomy to categorize existing approaches into two classes based on the role played by LLMs.
arXiv Detail & Related papers (2024-09-03T15:22:41Z)
- AI Safety in Generative AI Large Language Models: A Survey [14.737084887928408]
Large Language Models (LLMs) that exhibit generative AI capabilities are facing accelerated adoption and innovation.
Generative AI (GAI) inevitably raises concerns about the risks and safety associated with these models.
This article provides an up-to-date survey of recent trends in AI safety research of GAI-LLMs from a computer scientist's perspective.
arXiv Detail & Related papers (2024-07-06T09:00:18Z)
- A Survey on Large Language Models for Critical Societal Domains: Finance, Healthcare, and Law [65.87885628115946]
Large language models (LLMs) are revolutionizing the landscapes of finance, healthcare, and law.
We highlight the instrumental role of LLMs in enhancing diagnostic and treatment methodologies in healthcare, innovating financial analytics, and refining legal interpretation and compliance strategies.
We critically examine the ethics of LLM applications in these fields, pointing out existing ethical concerns and the need for transparent, fair, and robust AI systems.
arXiv Detail & Related papers (2024-05-02T22:43:02Z)
- A Survey of Confidence Estimation and Calibration in Large Language Models [86.692994151323]
Large language models (LLMs) have demonstrated remarkable capabilities across a wide range of tasks in various domains.
Despite their impressive performance, they can be unreliable due to factual errors in their generations.
Assessing their confidence and calibrating them across different tasks can help mitigate risks and enable LLMs to produce better generations.
arXiv Detail & Related papers (2023-11-14T16:43:29Z)
- Aligning Large Language Models with Human: A Survey [53.6014921995006]
Large Language Models (LLMs) trained on extensive textual corpora have emerged as leading solutions for a broad array of Natural Language Processing (NLP) tasks.
Despite their notable performance, these models are prone to certain limitations, such as misunderstanding human instructions, generating potentially biased content, or producing factually incorrect information.
This survey presents a comprehensive overview of alignment technologies that address these limitations.
arXiv Detail & Related papers (2023-07-24T17:44:58Z)
- A Comprehensive Overview of Large Language Models [68.22178313875618]
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks.
This article provides an overview of the existing literature on a broad range of LLM-related concepts.
arXiv Detail & Related papers (2023-07-12T20:01:52Z)