Related papers: On the Essence and Prospect: An Investigation of Alignment Approaches for Big Models

On the Essence and Prospect: An Investigation of Alignment Approaches for Big Models

URL: http://arxiv.org/abs/2403.04204v1
Date: Thu, 7 Mar 2024 04:19:13 GMT
Title: On the Essence and Prospect: An Investigation of Alignment Approaches for Big Models
Authors: Xinpeng Wang, Shitong Duan, Xiaoyuan Yi, Jing Yao, Shanlin Zhou, Zhihua Wei, Peng Zhang, Dongkuan Xu, Maosong Sun, Xing Xie
Abstract summary: Big models have achieved revolutionary breakthroughs in the field of AI, but they might also pose potential concerns. Addressing such concerns, alignment technologies were introduced to make these models conform to human preferences and values. Despite considerable advancements in the past year, various challenges lie in establishing the optimal alignment strategy.
Score: 77.86952307745763
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Big models have achieved revolutionary breakthroughs in the field of AI, but they might also pose potential concerns. Addressing such concerns, alignment technologies were introduced to make these models conform to human preferences and values. Despite considerable advancements in the past year, various challenges lie in establishing the optimal alignment strategy, such as data cost and scalable oversight, and how to align remains an open question. In this survey paper, we comprehensively investigate value alignment approaches. We first unpack the historical context of alignment tracing back to the 1920s (where it comes from), then delve into the mathematical essence of alignment (what it is), shedding light on the inherent challenges. Following this foundation, we provide a detailed examination of existing alignment methods, which fall into three categories: Reinforcement Learning, Supervised Fine-Tuning, and In-context Learning, and demonstrate their intrinsic connections, strengths, and limitations, helping readers better understand this research area. In addition, two emerging topics, personal alignment, and multimodal alignment, are also discussed as novel frontiers in this field. Looking forward, we discuss potential alignment paradigms and how they could handle remaining challenges, prospecting where future alignment will go.

Related papers

The Road to Artificial SuperIntelligence: A Comprehensive Survey of Superalignment [33.27140396561271]
The emergence of large language models (LLMs) has sparked the possibility of about Artificial Superintelligence (ASI) Superalignment aims to address two primary goals -- scalability in supervision to provide high-quality guidance signals and robust governance to ensure alignment with human values. Specifically, we explore the concept of ASI, the challenges it poses, and the limitations of current alignment paradigms in addressing the superalignment problem.
arXiv Detail & Related papers (2024-12-21T03:51:04Z)
The Superalignment of Superhuman Intelligence with Large Language Models [63.96120398355404]
We discuss the concept of superalignment from the learning perspective to answer this question. We highlight some key research problems in superalignment, namely, weak-to-strong generalization, scalable oversight, and evaluation. We present a conceptual framework for superalignment, which consists of three modules: an attacker which generates adversary queries trying to expose the weaknesses of a learner model; a learner which will refine itself by learning from scalable feedbacks generated by a critic model along with minimal human experts; and a critic which generates critics or explanations for a given query-response pair, with a target of improving the learner by criticizing.
arXiv Detail & Related papers (2024-12-15T10:34:06Z)
Deep Learning-Based Object Pose Estimation: A Comprehensive Survey [73.74933379151419]
We discuss the recent advances in deep learning-based object pose estimation. Our survey also covers multiple input data modalities, degrees-of-freedom of output poses, object properties, and downstream tasks.
arXiv Detail & Related papers (2024-05-13T14:44:22Z)
Fine-Grained Zero-Shot Learning: Advances, Challenges, and Prospects [84.36935309169567]
We present a broad review of recent advances for fine-grained analysis in zero-shot learning (ZSL) We first provide a taxonomy of existing methods and techniques with a thorough analysis of each category. Then, we summarize the benchmark, covering publicly available datasets, models, implementations, and some more details as a library.
arXiv Detail & Related papers (2024-01-31T11:51:24Z)
Federated Learning for Generalization, Robustness, Fairness: A Survey and Benchmark [55.898771405172155]
Federated learning has emerged as a promising paradigm for privacy-preserving collaboration among different parties. We provide a systematic overview of the important and recent developments of research on federated learning.
arXiv Detail & Related papers (2023-11-12T06:32:30Z)
Large Language Model Alignment: A Survey [42.03229317132863]
The potential of large language models (LLMs) is undeniably vast; however, they may yield texts that are imprecise, misleading, or even detrimental. This survey endeavors to furnish an extensive exploration of alignment methodologies designed for LLMs. We also probe into salient issues including the models' interpretability, and potential vulnerabilities to adversarial attacks.
arXiv Detail & Related papers (2023-09-26T15:49:23Z)
Towards Trustworthy and Aligned Machine Learning: A Data-centric Survey with Causality Perspectives [11.63431725146897]
The trustworthiness of machine learning has emerged as a critical topic in the field. This survey presents the background of trustworthy machine learning development using a unified set of concepts. We provide a unified language with mathematical vocabulary to link these methods across robustness, adversarial robustness, interpretability, and fairness.
arXiv Detail & Related papers (2023-07-31T17:11:35Z)
Recent Advances in Monocular 2D and 3D Human Pose Estimation: A Deep Learning Perspective [69.44384540002358]
We provide a comprehensive and holistic 2D-to-3D perspective to tackle this problem. We categorize the mainstream and milestone approaches since the year 2014 under unified frameworks. We also summarize the pose representation styles, benchmarks, evaluation metrics, and the quantitative performance of popular approaches.
arXiv Detail & Related papers (2021-04-23T11:07:07Z)
Weakly Supervised Object Localization and Detection: A Survey [145.5041117184952]
weakly supervised object localization and detection plays an important role for developing new generation computer vision systems. We review (1) classic models, (2) approaches with feature representations from off-the-shelf deep networks, (3) approaches solely based on deep learning, and (4) publicly available datasets and standard evaluation metrics that are widely used in this field. We discuss the key challenges in this field, development history of this field, advantages/disadvantages of the methods in each category, relationships between methods in different categories, applications of the weakly supervised object localization and detection methods, and potential future directions to further promote the development of this research field
arXiv Detail & Related papers (2021-04-16T06:44:50Z)

This list is automatically generated from the titles and abstracts of the papers in this site.