Related papers: The Empty Signifier Problem: Towards Clearer Paradigms for Operationalising "Alignment" in Large Language Models

URL: http://arxiv.org/abs/2310.02457v2
Date: Wed, 15 Nov 2023 18:02:03 GMT
Title: The Empty Signifier Problem: Towards Clearer Paradigms for Operationalising "Alignment" in Large Language Models
Authors: Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale
Abstract summary: We address the concept of "alignment" in large language models (LLMs) through the lens of post-structuralist socio-political theory. We propose a framework that demarcates: 1) which dimensions of model behaviour are considered important, then 2) how meanings and definitions are ascribed to these dimensions. We aim to foster a culture of transparency and critical evaluation, aiding the community in navigating the complexities of aligning LLMs with human populations.
Score: 18.16062736448993
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In this paper, we address the concept of "alignment" in large language models (LLMs) through the lens of post-structuralist socio-political theory, specifically examining its parallels to empty signifiers. To establish a shared vocabulary around how abstract concepts of alignment are operationalised in empirical datasets, we propose a framework that demarcates: 1) which dimensions of model behaviour are considered important, then 2) how meanings and definitions are ascribed to these dimensions, and by whom. We situate existing empirical literature and provide guidance on deciding which paradigm to follow. Through this framework, we aim to foster a culture of transparency and critical evaluation, aiding the community in navigating the complexities of aligning LLMs with human populations.

Related papers

Escaping Plato's Cave: JAM for Aligning Independently Trained Vision and Language Models [29.59537209390697]
We introduce a framework that trains modality-specific autoencoders on latent representations of single modality models. By analogy, this framework serves as a method to escape Plato's Cave, enabling the emergence of shared structure from disjoint inputs.
arXiv Detail & Related papers (2025-07-01T21:43:50Z)
Large Language Models as Quasi-crystals: Coherence Without Repetition in Generative Text [0.0]
This essay proposes an analogy between large language models (LLMs) and quasicrystals, systems that exhibit global coherence without periodic repetition, generated through local constraints. Drawing on the history of quasicrystals, it highlights an alternative mode of coherence in generative language: constraint-based organization without repetition or symbolic intent. This essay aims to reframe the current discussion around large language models, not by rejecting existing methods, but by suggesting an additional axis of interpretation grounded in structure rather than semantics.
arXiv Detail & Related papers (2025-04-16T11:27:47Z)
Scopes of Alignment [38.65920343856857]
Much of the research focus on AI alignment seeks to align large language models to generic values of helpfulness, harmlessness, and honesty. In this paper, we motivate why we need to move beyond such a limited conception and propose three dimensions for doing so.
arXiv Detail & Related papers (2025-01-15T03:06:59Z)
Generative Emergent Communication: Large Language Model is a Collective World Model [11.224401802231707]
Large Language Models (LLMs) have demonstrated a remarkable ability to capture extensive world knowledge. This study proposes a novel theoretical solution by introducing the Collective World Model hypothesis.
arXiv Detail & Related papers (2024-12-31T02:23:10Z)
Do Large Language Models Advocate for Inferentialism? [0.0]
The emergence of large language models (LLMs) such as ChatGPT and Claude presents new challenges for philosophy of language. This paper explores Robert Brandom's inferential semantics as an alternative foundational framework for understanding these systems.
arXiv Detail & Related papers (2024-12-19T03:48:40Z)
Discriminative Fine-tuning of LVLMs [67.14293827774827]
Contrastively-trained Vision-Language Models (VLMs) like CLIP have become the de facto approach for discriminative vision-language representation learning. We propose to combine "the best of both worlds": a new training approach for discriminative fine-tuning of LVLMs.
arXiv Detail & Related papers (2024-12-05T17:54:27Z)
Language Models as Semiotic Machines: Reconceptualizing AI Language Systems through Structuralist and Post-Structuralist Theories of Language [0.0]
This paper proposes a novel framework for understanding large language models (LLMs). I argue that LLMs should be understood as models of language itself, aligning with Jacques Derrida's concept of 'writing' (l'écriture). I apply Derrida's critique of Saussure to position 'writing' as the object modeled by LLMs, offering a view of the machine's 'mind' as a statistical approximation of sign behavior.
arXiv Detail & Related papers (2024-10-16T21:45:54Z)
Towards an Analysis of Discourse and Interactional Pragmatic Reasoning Capabilities of Large Language Models [0.0]
We discuss the scope of the field of pragmatics and suggest a subdivision into discourse pragmatics and interactional pragmatics. We consider the resulting heterogeneous set of phenomena and methods as a starting point for our survey of work on discourse pragmatics and interactional pragmatics in the context of LLMs.
arXiv Detail & Related papers (2024-08-06T10:02:05Z)
A Theory of LLM Sampling: Part Descriptive and Part Prescriptive [53.08398658452411]
Large Language Models (LLMs) are increasingly utilized in autonomous decision-making. We show that this sampling behavior resembles that of human decision-making. We show that this deviation of a sample from the statistical norm towards a prescriptive component consistently appears in concepts across diverse real-world domains.
arXiv Detail & Related papers (2024-02-16T18:28:43Z)
Sparsity-Guided Holistic Explanation for LLMs with Interpretable Inference-Time Intervention [53.896974148579346]
Large Language Models (LLMs) have achieved unprecedented breakthroughs in various natural language processing domains. The enigmatic "black-box" nature of LLMs remains a significant challenge for interpretability, hampering transparent and accountable applications. We propose a novel methodology anchored in sparsity-guided techniques, aiming to provide a holistic interpretation of LLMs.
arXiv Detail & Related papers (2023-12-22T19:55:58Z)
"We Demand Justice!": Towards Social Context Grounding of Political Texts [19.58924256275583]
Social media discourse frequently consists of 'seemingly similar language used by opposing sides of the political spectrum'. This paper defines the context required to fully understand such ambiguous statements in a computational setting. We propose two challenging datasets that require an understanding of the real-world context of the text.
arXiv Detail & Related papers (2023-11-15T16:53:35Z)
Simple Linguistic Inferences of Large Language Models (LLMs): Blind Spots and Blinds [59.71218039095155]
We evaluate language understanding capacities on simple inference tasks that most humans find trivial. We target (i) grammatically-specified entailments, (ii) premises with evidential adverbs of uncertainty, and (iii) monotonicity entailments. The models exhibit moderate to low performance on these evaluation sets.
arXiv Detail & Related papers (2023-05-24T06:41:09Z)
Linear Spaces of Meanings: Compositional Structures in Vision-Language Models [110.00434385712786]
We investigate compositional structures in data embeddings from pre-trained vision-language models (VLMs). We first present a framework for understanding compositional structures from a geometric perspective. We then explain what these structures entail probabilistically in the case of VLM embeddings, providing intuitions for why they arise in practice.
arXiv Detail & Related papers (2023-02-28T08:11:56Z)
Guiding the PLMs with Semantic Anchors as Intermediate Supervision: Towards Interpretable Semantic Parsing [57.11806632758607]
We propose to incorporate the current pretrained language models with a hierarchical decoder network. By taking the first-principle structures as the semantic anchors, we propose two novel intermediate supervision tasks. We conduct intensive experiments on several semantic parsing benchmarks and demonstrate that our approach can consistently outperform the baselines.
arXiv Detail & Related papers (2022-10-04T07:27:29Z)
Which Mutual-Information Representation Learning Objectives are Sufficient for Control? [80.2534918595143]
Mutual information provides an appealing formalism for learning representations of data. This paper formalizes the sufficiency of a state representation for learning and representing the optimal policy. Surprisingly, we find that two of these objectives can yield insufficient representations given mild and common assumptions on the structure of the MDP.
arXiv Detail & Related papers (2021-06-14T10:12:34Z)
VCDM: Leveraging Variational Bi-encoding and Deep Contextualized Word Representations for Improved Definition Modeling [24.775371434410328]
We tackle the task of definition modeling, where the goal is to learn to generate definitions of words and phrases. Existing approaches for this task are discriminative, combining distributional and lexical semantics in an implicit rather than direct way. We propose a generative model for the task, introducing a continuous latent variable to explicitly model the underlying relationship between a phrase used within a context and its definition.
arXiv Detail & Related papers (2020-10-07T02:48:44Z)
Concept Learners for Few-Shot Learning [76.08585517480807]
We propose COMET, a meta-learning method that improves generalization ability by learning to learn along human-interpretable concept dimensions. We evaluate our model on few-shot tasks from diverse domains, including fine-grained image classification, document categorization and cell type annotation.
arXiv Detail & Related papers (2020-07-14T22:04:17Z)

This list is automatically generated from the titles and abstracts of the papers on this site.