Position: General Alignment Has Hit a Ceiling; Edge Alignment Must Be Taken Seriously
- URL: http://arxiv.org/abs/2602.20042v1
- Date: Mon, 23 Feb 2026 16:51:43 GMT
- Authors: Han Bao, Yue Huang, Xiaoda Wang, Zheyuan Zhang, Yujun Zhou, Carl Yang, Xiangliang Zhang, Yanfang Ye
- Abstract summary: We take the position that the dominant paradigm of General Alignment reaches a structural ceiling in settings with conflicting values. We introduce Edge Alignment as a distinct approach in which systems preserve multidimensional value structure.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models are being deployed in complex socio-technical systems, which exposes limits in current alignment practice. We take the position that the dominant paradigm of General Alignment, which compresses diverse human values into a single scalar reward, reaches a structural ceiling in settings with conflicting values, plural stakeholders, and irreducible uncertainty. These failures follow from the mathematics and incentives of scalarization and lead to structural value flattening, normative representation loss, and cognitive uncertainty blindness. We introduce Edge Alignment as a distinct approach in which systems preserve multidimensional value structure, support plural and democratic representation, and incorporate epistemic mechanisms for interaction and clarification. To make this approach practical, we propose seven interdependent pillars organized into three phases. We identify key challenges in data collection, training objectives, and evaluation, outlining complementary technical and governance directions. Taken together, these measures reframe alignment as a lifecycle problem of dynamic normative governance rather than as a single-instance optimization task.
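The abstract's claim that scalarization flattens value structure can be illustrated with a minimal sketch. It is not taken from the paper: the two value dimensions, the example scores, the equal weights, and the helper names `scalarize` and `dominates` are all assumptions chosen for illustration. Two responses with opposite value profiles receive the same scalar reward, although neither dominates the other when the dimensions are kept separate.

```python
from typing import Sequence


def scalarize(values: Sequence[float], weights: Sequence[float]) -> float:
    """Collapse a value vector into one scalar reward via a weighted sum."""
    return sum(v * w for v, w in zip(values, weights))


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


# Hypothetical scores on two assumed dimensions: (helpfulness, harmlessness).
response_a = (0.75, 0.25)
response_b = (0.25, 0.75)
weights = (0.5, 0.5)  # assumed equal weighting

score_a = scalarize(response_a, weights)
score_b = scalarize(response_b, weights)

# The scalar reward cannot tell the two responses apart.
print(score_a, score_b)

# Kept as vectors, the trade-off stays visible: neither dominates the other.
print(dominates(response_a, response_b), dominates(response_b, response_a))
```

Under these assumed numbers, the weighted sum ranks the two responses as equal, while the vector view shows that they make opposite trade-offs.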
Related papers
- Domain Expansion: A Latent Space Construction Framework for Multi-Task Learning [26.322513515274764]
Training a single network with multiple objectives often leads to conflicting gradients that degrade shared representations. We introduce Domain Expansion, a framework that prevents these conflicts by restructuring the latent space itself.
arXiv Detail & Related papers (2026-01-27T21:30:21Z)
- PRISM: Purified Representation and Integrated Semantic Modeling for Generative Sequential Recommendation [28.629759086187352]
We propose a novel generative recommendation framework, PRISM, with Purified Representation and Integrated Semantic Modeling. PRISM consistently outperforms state-of-the-art baselines across four real-world datasets.
arXiv Detail & Related papers (2026-01-23T08:50:16Z)
- MAESTRO: Meta-learning Adaptive Estimation of Scalarization Trade-offs for Reward Optimization [56.074760766965085]
Group-Relative Policy Optimization has emerged as an efficient paradigm for aligning Large Language Models (LLMs). We propose MAESTRO, which treats reward scalarization as a dynamic latent policy, leveraging the model's terminal hidden states as a semantic bottleneck. We formulate this as a contextual bandit problem within a bi-level optimization framework, where a lightweight Conductor network co-evolves with the policy by utilizing group-relative advantages as a meta-reward signal.
arXiv Detail & Related papers (2026-01-12T05:02:48Z)
- CAMO: Causality-Guided Adversarial Multimodal Domain Generalization for Crisis Classification [16.165585394745786]
Crisis classification in social media aims to extract actionable disaster-related information from posts. Existing approaches primarily leverage deep learning to fuse textual and visual cues for crisis classification. We introduce a causality-guided multimodal domain generalization framework that combines adversarial disentanglement with unified representation learning.
arXiv Detail & Related papers (2025-12-08T22:12:27Z)
- ERIS: An Energy-Guided Feature Disentanglement Framework for Out-of-Distribution Time Series Classification [51.07970070817353]
An ideal time series classification (TSC) should be able to capture invariant representations. Current methods are largely unguided, lacking the semantic direction required to isolate truly universal features. We propose an end-to-end Energy-Regularized Information for Shift-Robustness framework to enable guided and reliable feature disentanglement.
arXiv Detail & Related papers (2025-08-19T12:13:41Z)
- Principled Multimodal Representation Learning [99.53621521696051]
Multimodal representation learning seeks to create a unified representation space by integrating diverse data modalities. Recent advances have investigated the simultaneous alignment of multiple modalities, yet several challenges remain. We propose Principled Multimodal Representation Learning (PMRL), a novel framework that achieves simultaneous alignment of multiple modalities.
arXiv Detail & Related papers (2025-07-23T09:12:25Z)
- Escaping Plato's Cave: JAM for Aligning Independently Trained Vision and Language Models [30.07172193932125]
We show that the Joint Autoencoder Modulator (JAM) induces alignment even across independently trained representations. Our findings offer theoretical insight into the structure of shared semantics and practical guidance for transforming generalist unimodal foundations into specialist multimodal models.
arXiv Detail & Related papers (2025-07-01T21:43:50Z)
- Embracing Contradiction: Theoretical Inconsistency Will Not Impede the Road of Building Responsible AI Systems [1.634867961895661]
This position paper argues that the theoretical inconsistency often observed among Responsible AI (RAI) metrics should be embraced as a valuable feature rather than a flaw to be eliminated. We contend that navigating these inconsistencies, by treating metrics as divergent objectives, yields three key benefits.
arXiv Detail & Related papers (2025-05-23T17:48:09Z)
- Toward Adaptive Categories: Dimensional Governance for Agentic AI [0.0]
Dimensional governance is a framework that tracks how decision authority, process autonomy, and accountability (the 3As) distribute dynamically across human-AI relationships. A critical advantage of this approach is its ability to explicitly monitor system movement toward and across key governance thresholds. We outline key dimensions, critical trust thresholds, and practical examples illustrating where rigid categorical frameworks fail.
arXiv Detail & Related papers (2025-05-16T14:43:12Z)
- Hierarchical Context Alignment with Disentangled Geometric and Temporal Modeling for Semantic Occupancy Prediction [61.484280369655536]
Camera-based 3D Semantic Occupancy Prediction (SOP) is crucial for understanding complex 3D scenes from limited 2D image observations. Existing SOP methods typically aggregate contextual features to assist the occupancy representation learning. We introduce a new Hierarchical context alignment paradigm for a more accurate SOP (Hi-SOP).
arXiv Detail & Related papers (2024-12-11T09:53:10Z)
- Understanding and Constructing Latent Modality Structures in Multi-modal Representation Learning [53.68371566336254]
We argue that the key to better performance lies in meaningful latent modality structures instead of perfect modality alignment.
Specifically, we design 1) a deep feature separation loss for intra-modality regularization; 2) a Brownian-bridge loss for inter-modality regularization; and 3) a geometric consistency loss for both intra- and inter-modality regularization.
arXiv Detail & Related papers (2023-03-10T14:38:49Z)
- Target-Embedding Autoencoders for Supervised Representation Learning [111.07204912245841]
This paper analyzes a framework for improving generalization in a purely supervised setting, where the target space is high-dimensional.
We motivate and formalize the general framework of target-embedding autoencoders (TEA) for supervised prediction, learning intermediate latent representations jointly optimized to be both predictable from features as well as predictive of targets.
arXiv Detail & Related papers (2020-01-23T02:37:10Z)