Related papers: LIVS: A Pluralistic Alignment Dataset for Inclusive Public Spaces

Related papers

Direct Preference Optimization with Unobserved Preference Heterogeneity: The Necessity of Ternary Preferences [14.686788596611246]
Reinforcement Learning from Human Feedback (RLHF) has become central to aligning large language models with human values. Recent alternatives such as Direct Preference Optimization (DPO) simplify this pipeline by directly optimizing on preferences. We propose a theoretical and algorithmic framework for fairness and personalization for diverse users in generative model alignment.
arXiv Detail & Related papers (2025-10-17T15:00:40Z)
Spatially Grounded Explanations in Vision Language Models for Document Visual Question Answering [7.981907917890143]
We introduce EaGERS, a fully training-free and model-agnostic pipeline that generates natural language rationales via a vision language model. We show that our best configuration outperforms the base model on exact match accuracy and Average Normalized Levenshtein Similarity metrics.
arXiv Detail & Related papers (2025-07-15T20:05:25Z)
Why Settle for Mid: A Probabilistic Viewpoint to Spatial Relationship Alignment in Text-to-image Models [3.5999252362400993]
A prevalent issue in compositional generation is the misalignment of spatial relationships. We introduce a novel evaluation metric designed to assess the alignment of 2D and 3D spatial relationships between text and image. We also propose PoS-based Generation, an inference-time method that improves the alignment of 2D and 3D spatial relationships in T2I models without requiring fine-tuning.
arXiv Detail & Related papers (2025-06-29T22:41:27Z)
OptiScene: LLM-driven Indoor Scene Layout Generation via Scaled Human-aligned Data Synthesis and Multi-Stage Preference Optimization [54.60030826635478]
Existing indoor layout generation methods fall into two categories: prompt-driven and learning-based. We present 3D-SynthPlace, a large-scale dataset that combines synthetic layouts generated via a 'GPT synthesize, Human inspect' pipeline. We introduce OptiScene, a strong open-source LLM optimized for indoor layout generation.
arXiv Detail & Related papers (2025-06-09T09:13:06Z)
Bounded Rationality for LLMs: Satisficing Alignment at Inference-Time [52.230936493691985]
We propose SITAlign, an inference framework that addresses the multifaceted nature of alignment by maximizing a primary objective while satisfying threshold-based constraints on secondary criteria. We provide theoretical insights by deriving sub-optimality bounds of our satisficing-based inference alignment approach.
arXiv Detail & Related papers (2025-05-29T17:56:05Z)
Split Matching for Inductive Zero-shot Semantic Segmentation [52.90218623214213]
Zero-shot Semantic Segmentation (ZSS) aims to segment categories that are not annotated during training. We propose Split Matching (SM), a novel assignment strategy that decouples Hungarian matching into two components. SM is the first to introduce decoupled Hungarian matching under the inductive ZSS setting, and achieves state-of-the-art performance on two standard benchmarks.
arXiv Detail & Related papers (2025-05-08T07:56:30Z)
Latent Preference Coding: Aligning Large Language Models via Discrete Latent Codes [54.93980123979578]
We introduce Latent Preference Coding (LPC), a novel framework that models the implicit factors as well as their combinations behind holistic preferences. LPC seamlessly integrates with various offline alignment algorithms, automatically inferring the underlying factors and their importance from data.
arXiv Detail & Related papers (2025-05-08T06:59:06Z)
Beyond Relevance: An Adaptive Exploration-Based Framework for Personalized Recommendations [0.0]
This paper introduces an exploration-based recommendation framework to promote diversity and novelty without compromising relevance. A user-controlled exploration mechanism enhances diversity by selectively sampling from under-explored clusters. Experiments on the MovieLens dataset show that enabling exploration reduces intra-list similarity from 0.34 to 0.26 and increases unexpectedness to 0.73.
arXiv Detail & Related papers (2025-03-25T10:27:32Z)
From 1,000,000 Users to Every User: Scaling Up Personalized Preference for User-level Alignment [41.96246165999026]
Large language models (LLMs) have traditionally been aligned through one-size-fits-all approaches. This paper introduces a comprehensive framework for scalable personalized alignment of LLMs.
arXiv Detail & Related papers (2025-03-19T17:41:46Z)
RankPO: Preference Optimization for Job-Talent Matching [7.385902340910447]
We propose a two-stage training framework for large language models (LLMs). In the first stage, a contrastive learning approach is used to train the model on a dataset constructed from real-world matching rules. In the second stage, we introduce a novel preference-based fine-tuning method inspired by Direct Preference Optimization (DPO) to align the model with AI-curated pairwise preferences.
arXiv Detail & Related papers (2025-03-13T10:14:37Z)
Group Preference Alignment: Customized LLM Response Generation from In-Situ Conversations [36.29709573877113]
Group Preference Alignment identifies context-specific variations in conversational preferences across user groups. Our framework significantly improves alignment of the output with respect to user preferences and outperforms baseline methods.
arXiv Detail & Related papers (2025-03-11T04:32:54Z)
A RankNet-Inspired Surrogate-Assisted Hybrid Metaheuristic for Expensive Coverage Optimization [5.757318591302855]
We propose a RankNet-Inspired Surrogate-assisted Hybrid Metaheuristic to handle large-scale coverage optimization tasks. Our algorithm consistently outperforms state-of-the-art algorithms for EMVOPs.
arXiv Detail & Related papers (2025-01-13T14:49:05Z)
No Preference Left Behind: Group Distributional Preference Optimization [46.98320272443297]
Group Distribution Preference Optimization (GDPO) is a novel framework that aligns language models with the distribution of preferences within a group. GDPO calibrates a language model using statistical estimation of the group's belief distribution. GDPO consistently reduces this alignment gap during training.
arXiv Detail & Related papers (2024-12-28T23:30:47Z)
Few-shot Steerable Alignment: Adapting Rewards and LLM Policies with Neural Processes [50.544186914115045]
Large language models (LLMs) are increasingly embedded in everyday applications. Ensuring their alignment with the diverse preferences of individual users has become a critical challenge. We present a novel framework for few-shot steerable alignment.
arXiv Detail & Related papers (2024-12-18T16:14:59Z)
Preference Alignment Improves Language Model-Based TTS [76.70693823683091]
Preference alignment algorithms adjust LMs to align with the preferences of reward models, enhancing the desirability of the generated content. With a 1.15B parameter LM-based TTS model, we demonstrate that preference alignment consistently improves intelligibility, speaker similarity, and proxy subjective evaluation scores.
arXiv Detail & Related papers (2024-09-19T01:58:19Z)
An incremental preference elicitation-based approach to learning potentially non-monotonic preferences in multi-criteria sorting [53.36437745983783]
We first construct a max-margin optimization-based model to model potentially non-monotonic preferences. We devise information amount measurement methods and question selection strategies to pinpoint the most informative alternative in each iteration. Two incremental preference elicitation-based algorithms are developed to learn potentially non-monotonic preferences.
arXiv Detail & Related papers (2024-09-04T14:36:20Z)
ABC Align: Large Language Model Alignment for Safety & Accuracy [0.0]
We present ABC Align, a novel alignment methodology for Large Language Models (LLMs). We combine a set of data and methods that build on recent breakthroughs in synthetic data generation, preference optimisation, and post-training model quantisation. Our unified approach mitigates bias and improves accuracy, while preserving reasoning capability, as measured against standard benchmarks.
arXiv Detail & Related papers (2024-08-01T06:06:25Z)
Improving Context-Aware Preference Modeling for Language Models [62.32080105403915]
We consider the two-step preference modeling procedure that first resolves the under-specification by selecting a context, and then evaluates preference with respect to the chosen context. We contribute context-conditioned preference datasets and experiments that investigate the ability of language models to evaluate context-specific preference.
arXiv Detail & Related papers (2024-07-20T16:05:17Z)
Direct Preference Optimization With Unobserved Preference Heterogeneity: The Necessity of Ternary Preferences [14.686788596611246]
Reinforcement Learning from Human Feedback (RLHF) has become central to aligning large language models with human values. Recent alternatives such as Direct Preference Optimization (DPO) simplify this pipeline by directly optimizing on preferences. We propose a theoretical and algorithmic framework for fairness and personalization for diverse users in generative model alignment.
arXiv Detail & Related papers (2024-05-23T21:25:20Z)
Personalized Collaborative Fine-Tuning for On-Device Large Language Models [33.68104398807581]
We explore on-device self-supervised collaborative fine-tuning of large language models with limited local data availability. We introduce three distinct trust-weighted gradient aggregation schemes: weight similarity-based, prediction similarity-based and validation performance-based. Our protocols, driven by prediction and performance metrics, surpass both FedAvg and local fine-tuning methods.
arXiv Detail & Related papers (2024-04-15T12:54:31Z)
Latent Semantic Consensus For Deterministic Geometric Model Fitting [109.44565542031384]
We propose an effective method called Latent Semantic Consensus (LSC). LSC formulates the model fitting problem into two latent semantic spaces based on data points and model hypotheses. LSC is able to provide consistent and reliable solutions within only a few milliseconds for general multi-structural model fitting.
arXiv Detail & Related papers (2024-03-11T05:35:38Z)
Controllable Preference Optimization: Toward Controllable Multi-Objective Alignment [103.12563033438715]
Alignment in artificial intelligence pursues consistency between model responses and human preferences as well as values. Existing alignment techniques are mostly unidirectional, leading to suboptimal trade-offs and poor flexibility over various objectives. We introduce controllable preference optimization (CPO), which explicitly specifies preference scores for different objectives.
arXiv Detail & Related papers (2024-02-29T12:12:30Z)
Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback [70.32795295142648]
Linear alignment is a novel algorithm that aligns language models with human preferences in one single inference step. Experiments on both general and personalized preference datasets demonstrate that linear alignment significantly enhances the performance and efficiency of LLM alignment.
arXiv Detail & Related papers (2024-01-21T10:46:23Z)
HSVA: Hierarchical Semantic-Visual Adaptation for Zero-Shot Learning [74.76431541169342]
Zero-shot learning (ZSL) tackles the unseen class recognition problem, transferring semantic knowledge from seen classes to unseen ones. We propose a novel hierarchical semantic-visual adaptation (HSVA) framework to align semantic and visual domains. Experiments on four benchmark datasets demonstrate HSVA achieves superior performance on both conventional and generalized ZSL.
arXiv Detail & Related papers (2021-09-30T14:27:50Z)

This list is automatically generated from the titles and abstracts of the papers on this site.