The Empty Signifier Problem: Towards Clearer Paradigms for
Operationalising "Alignment" in Large Language Models
- URL: http://arxiv.org/abs/2310.02457v2
- Date: Wed, 15 Nov 2023 18:02:03 GMT
- Title: The Empty Signifier Problem: Towards Clearer Paradigms for
Operationalising "Alignment" in Large Language Models
- Authors: Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale
- Abstract summary: We address the concept of "alignment" in large language models (LLMs) through the lens of post-structuralist socio-political theory.
We propose a framework that demarcates: 1) which dimensions of model behaviour are considered important, then 2) how meanings and definitions are ascribed to these dimensions.
We aim to foster a culture of transparency and critical evaluation, aiding the community in navigating the complexities of aligning LLMs with human populations.
- Score: 18.16062736448993
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we address the concept of "alignment" in large language models
(LLMs) through the lens of post-structuralist socio-political theory,
specifically examining its parallels to empty signifiers. To establish a shared
vocabulary around how abstract concepts of alignment are operationalised in
empirical datasets, we propose a framework that demarcates: 1) which dimensions
of model behaviour are considered important, then 2) how meanings and
definitions are ascribed to these dimensions, and by whom. We situate existing
empirical literature and provide guidance on deciding which paradigm to follow.
Through this framework, we aim to foster a culture of transparency and critical
evaluation, aiding the community in navigating the complexities of aligning
LLMs with human populations.
Related papers
- Realizing Disentanglement in LM Latent Space via Vocabulary-Defined Semantics [32.178931149612644]
We introduce a pioneering approach called vocabulary-defined semantics, which establishes a reference frame grounded in LM vocabulary.
We perform semantical clustering on data representations as a novel way of LM adaptation.
Our approach outperforms state-of-the-art methods of retrieval-augmented generation and parameter-efficient finetuning.
arXiv Detail & Related papers (2024-01-29T14:29:48Z)
- Sparsity-Guided Holistic Explanation for LLMs with Interpretable Inference-Time Intervention [53.896974148579346]
Large Language Models (LLMs) have achieved unprecedented breakthroughs in various natural language processing domains.
The enigmatic "black-box" nature of LLMs remains a significant challenge for interpretability, hampering transparent and accountable applications.
We propose a novel methodology anchored in sparsity-guided techniques, aiming to provide a holistic interpretation of LLMs.
arXiv Detail & Related papers (2023-12-22T19:55:58Z)
- "We Demand Justice!": Towards Social Context Grounding of Political Texts [22.016345507132808]
Social media discourse frequently consists of 'seemingly similar language used by opposing sides of the political spectrum'.
This paper defines the context required to fully understand such ambiguous statements in a computational setting.
We propose two challenging datasets that require an understanding of the real-world context of the text.
arXiv Detail & Related papers (2023-11-15T16:53:35Z)
- Simple Linguistic Inferences of Large Language Models (LLMs): Blind Spots and Blinds [59.71218039095155]
We evaluate language understanding capacities on simple inference tasks that most humans find trivial.
We target (i) grammatically-specified entailments, (ii) premises with evidential adverbs of uncertainty, and (iii) monotonicity entailments.
The models exhibit moderate to low performance on these evaluation sets.
arXiv Detail & Related papers (2023-05-24T06:41:09Z)
- Text-Video Retrieval with Disentangled Conceptualization and Set-to-Set Alignment [17.423361070781876]
We propose the Disentangled Conceptualization and Set-to-set Alignment (DiCoSA) to simulate the conceptualizing and reasoning process of human beings.
For disentangled conceptualization, we divide the coarse feature into multiple latent factors related to semantic concepts.
For set-to-set alignment, where a set of visual concepts correspond to a set of textual concepts, we propose an adaptive pooling method to aggregate semantic concepts.
arXiv Detail & Related papers (2023-05-20T15:48:47Z)
- Linear Spaces of Meanings: Compositional Structures in Vision-Language Models [110.00434385712786]
We investigate compositional structures in data embeddings from pre-trained vision-language models (VLMs).
We first present a framework for understanding compositional structures from a geometric perspective.
We then explain what these structures entail probabilistically in the case of VLM embeddings, providing intuitions for why they arise in practice.
arXiv Detail & Related papers (2023-02-28T08:11:56Z)
- Guiding the PLMs with Semantic Anchors as Intermediate Supervision: Towards Interpretable Semantic Parsing [57.11806632758607]
We propose to incorporate the current pretrained language models with a hierarchical decoder network.
By taking the first-principle structures as the semantic anchors, we propose two novel intermediate supervision tasks.
We conduct intensive experiments on several semantic parsing benchmarks and demonstrate that our approach can consistently outperform the baselines.
arXiv Detail & Related papers (2022-10-04T07:27:29Z)
- Which Mutual-Information Representation Learning Objectives are Sufficient for Control? [80.2534918595143]
Mutual information provides an appealing formalism for learning representations of data.
This paper formalizes the sufficiency of a state representation for learning and representing the optimal policy.
Surprisingly, we find that two of these objectives can yield insufficient representations given mild and common assumptions on the structure of the MDP.
arXiv Detail & Related papers (2021-06-14T10:12:34Z)
- VCDM: Leveraging Variational Bi-encoding and Deep Contextualized Word Representations for Improved Definition Modeling [24.775371434410328]
We tackle the task of definition modeling, where the goal is to learn to generate definitions of words and phrases.
Existing approaches for this task are discriminative, combining distributional and lexical semantics in an implicit rather than direct way.
We propose a generative model for the task, introducing a continuous latent variable to explicitly model the underlying relationship between a phrase used within a context and its definition.
arXiv Detail & Related papers (2020-10-07T02:48:44Z)
- Concept Learners for Few-Shot Learning [76.08585517480807]
We propose COMET, a meta-learning method that improves generalization ability by learning to learn along human-interpretable concept dimensions.
We evaluate our model on few-shot tasks from diverse domains, including fine-grained image classification, document categorization and cell type annotation.
arXiv Detail & Related papers (2020-07-14T22:04:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.