DPO Kernels: A Semantically-Aware, Kernel-Enhanced, and Divergence-Rich Paradigm for Direct Preference Optimization
- URL: http://arxiv.org/abs/2501.03271v3
- Date: Mon, 20 Jan 2025 04:24:56 GMT
- Title: DPO Kernels: A Semantically-Aware, Kernel-Enhanced, and Divergence-Rich Paradigm for Direct Preference Optimization
- Authors: Amitava Das, Suranjana Trivedy, Danush Khanna, Rajarshi Roy, Gurpreet Singh, Basab Ghosh, Yaswanth Narsupalli, Vinija Jain, Vasu Sharma, Aishwarya Naresh Reganti, Aman Chadha
- Abstract summary: The rapid rise of large language models (LLMs) has unlocked many applications but also underscores the challenge of aligning them with diverse values and preferences. Direct Preference Optimization (DPO) is central to alignment but constrained by fixed divergences and limited feature transformations.
- Score: 6.303144414273044
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rapid rise of large language models (LLMs) has unlocked many applications but also underscores the challenge of aligning them with diverse values and preferences. Direct Preference Optimization (DPO) is central to alignment but constrained by fixed divergences and limited feature transformations. We propose DPO-Kernels, which integrates kernel methods to address these issues through four key contributions: (i) Kernelized Representations with polynomial, RBF, Mahalanobis, and spectral kernels for richer transformations, plus a hybrid loss combining embedding-based and probability-based objectives; (ii) Divergence Alternatives (Jensen-Shannon, Hellinger, Rényi, Bhattacharyya, Wasserstein, and f-divergences) for greater stability; (iii) Data-Driven Selection metrics that automatically choose the best kernel-divergence pair; and (iv) a Hierarchical Mixture of Kernels for both local precision and global modeling. Evaluations on 12 datasets demonstrate state-of-the-art performance in factuality, safety, reasoning, and instruction following. Grounded in Heavy-Tailed Self-Regularization, DPO-Kernels maintains robust generalization for LLMs, offering a comprehensive resource for further alignment research.
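The abstract lists the ingredients (kernelized representations, a hybrid loss, divergence alternatives) without giving formulas. Purely as an illustration, here is a minimal sketch of what a hybrid probability-plus-embedding DPO loss with an RBF kernel could look like; the function names, the prompt-anchored kernel margin, and the hyperparameters `beta`, `alpha`, and `gamma` are assumptions made for this sketch, not the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def rbf_kernel(x, y, gamma=1.0):
    # RBF similarity between paired rows of two embedding batches, shape (B,).
    return torch.exp(-gamma * (x - y).pow(2).sum(dim=-1))

def hybrid_dpo_loss(logp_chosen, logp_rejected,            # policy log-probs, (B,)
                    ref_chosen, ref_rejected,              # reference log-probs, (B,)
                    emb_prompt, emb_chosen, emb_rejected,  # embeddings, (B, D)
                    beta=0.1, alpha=0.5, gamma=1.0):
    # Probability-based term: the standard DPO log-ratio margin.
    margin = beta * ((logp_chosen - ref_chosen) - (logp_rejected - ref_rejected))
    prob_term = -F.logsigmoid(margin)
    # Embedding-based term: a kernelized margin favoring higher
    # prompt/chosen than prompt/rejected similarity (illustrative choice).
    k_margin = (rbf_kernel(emb_prompt, emb_chosen, gamma)
                - rbf_kernel(emb_prompt, emb_rejected, gamma))
    emb_term = -F.logsigmoid(beta * k_margin)
    # Hybrid loss: convex combination of the two objectives.
    return (alpha * prob_term + (1 - alpha) * emb_term).mean()
```

Swapping `rbf_kernel` for a polynomial, Mahalanobis, or spectral kernel, or replacing the log-sigmoid with another divergence-induced loss, would mirror the kernel and divergence alternatives the abstract enumerates.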
Related papers
- MIK: Modified Isolation Kernel for Biological Sequence Visualization, Classification, and Clustering [3.9146761527401424]
This research proposes a novel approach called the Modified Isolation Kernel (MIK) as an alternative to the Gaussian kernel.
MIK uses adaptive density estimation to capture local structures more accurately and integrates robustness measures.
It exhibits improved preservation of the local and global structure and enables better visualization of clusters and subclusters in the embedded space.
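The summary credits adaptive density estimation for MIK's improved local structure without giving the construction. As a rough illustration only, a self-tuning Gaussian kernel with k-NN bandwidths captures the same adaptive-density idea; the function name and bandwidth rule below are assumptions, not MIK's definition.

```python
import numpy as np
from scipy.spatial.distance import cdist

def adaptive_gaussian_kernel(X, k=7):
    # Gaussian kernel whose bandwidth adapts to local density:
    # each point uses its distance to the k-th nearest neighbor.
    D = cdist(X, X)                      # (n, n) pairwise Euclidean distances
    sigma = np.sort(D, axis=1)[:, k]     # k-NN distance per point (col 0 is self)
    # Product of per-point bandwidths keeps the matrix symmetric.
    return np.exp(-D**2 / (sigma[:, None] * sigma[None, :] + 1e-12))
```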
arXiv Detail & Related papers (2024-10-21T06:57:09Z)
- Multiple Kernel Clustering via Local Regression Integration [4.856913393644719]
Existing multiple kernel methods largely overlook the intrinsic manifold structure of multiple kernel data.
This paper first presents a clustering method via kernelized local regression (CKLR).
It then extends this to clustering via multiple kernel local regression (CMKLR).
arXiv Detail & Related papers (2024-10-20T06:26:29Z) - Optimal Kernel Choice for Score Function-based Causal Discovery [92.65034439889872]
We propose a kernel selection method within the generalized score function that automatically selects the kernel that best fits the data.
We conduct experiments on both synthetic data and real-world benchmarks, and the results demonstrate that our proposed method outperforms existing kernel selection methods.
arXiv Detail & Related papers (2024-07-14T09:32:20Z)
- Adaptive Preference Scaling for Reinforcement Learning with Human Feedback [103.36048042664768]
Reinforcement learning from human feedback (RLHF) is a prevalent approach to align AI systems with human values.
We propose a novel adaptive preference loss, underpinned by distributionally robust optimization (DRO).
Our method is versatile and can be readily adapted to various preference optimization frameworks.
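The summary names DRO as the foundation without stating the loss. As a generic stand-in (not the paper's formulation), the sketch below applies the exponential-tilting weights that arise in KL-constrained DRO to per-pair preference losses; the function name and temperature `tau` are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def dro_preference_loss(margins, tau=1.0):
    # margins: beta * (policy/reference log-ratio gap), shape (B,)
    per_pair = -F.logsigmoid(margins)        # standard DPO-style pair loss
    # KL-constrained DRO dual: softly upweight the hardest pairs.
    weights = torch.softmax(per_pair.detach() / tau, dim=0)
    return (weights * per_pair).sum()
```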
arXiv Detail & Related papers (2024-06-04T20:33:22Z)
- Sparsity-Aware Distributed Learning for Gaussian Processes with Linear Multiple Kernel [22.23550794664218]
This paper presents a novel GP linear multiple kernel (LMK) and a generic sparsity-aware distributed learning framework.
The framework incorporates a quantized alternating direction method of multipliers (ADMM) for collaborative learning among multiple agents.
Experiments on diverse datasets demonstrate the superior prediction performance and efficiency of our proposed methods.
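For orientation, the core of a GP linear multiple kernel is just a nonnegative combination of base Gram matrices, as in the minimal sketch below; the sparsity mechanism and the quantized ADMM updates in the paper are not reproduced here.

```python
import numpy as np

def linear_multiple_kernel(base_kernels, theta):
    # K(theta) = sum_m theta_m * K_m; nonnegative weights keep K
    # positive semidefinite since each K_m is PSD.
    theta = np.clip(np.asarray(theta, dtype=float), 0.0, None)
    return sum(t * K for t, K in zip(theta, base_kernels))
```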
arXiv Detail & Related papers (2023-09-15T07:05:33Z)
- Local Sample-weighted Multiple Kernel Clustering with Consensus Discriminative Graph [73.68184322526338]
Multiple kernel clustering (MKC) aims to achieve optimal information fusion from a set of base kernels.
This paper proposes a novel local sample-weighted multiple kernel clustering model.
Experimental results demonstrate that our LSWMKC achieves better local manifold representation and outperforms existing kernel- and graph-based clustering algorithms.
arXiv Detail & Related papers (2022-07-05T05:00:38Z)
- Supervising the Multi-Fidelity Race of Hyperparameter Configurations [22.408069485293666]
We introduce DyHPO, a Bayesian Optimization method that learns to decide which hyperparameter configuration to train further in a race among all feasible configurations.
We demonstrate the significant superiority of DyHPO over state-of-the-art hyperparameter optimization methods through large-scale experiments.
arXiv Detail & Related papers (2022-02-20T10:28:02Z)
- Kernel Selection for Stein Variational Gradient Descent [19.16800190883095]
We propose a combination of multiple kernels to approximate the optimal kernel instead of a single one.
The proposed method removes the dependence on a single optimal kernel while maintaining computational efficiency.
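To make the idea concrete, a minimal SVGD step that averages several RBF bandwidths instead of committing to one is sketched below; the equal mixture weights and the bandwidth grid are assumptions for illustration, not the paper's selection rule.

```python
import numpy as np

def svgd_step(X, grad_logp, bandwidths=(0.5, 1.0, 2.0), eps=0.1):
    # X: particles (n, d); grad_logp: score at each particle (n, d).
    n, _ = X.shape
    diffs = X[:, None, :] - X[None, :, :]        # x_i - x_j, shape (n, n, d)
    sq = (diffs ** 2).sum(-1)                    # squared distances (n, n)
    phi = np.zeros_like(X)
    for h in bandwidths:
        K = np.exp(-sq / (2 * h ** 2))
        # Driving term (kernel-weighted scores) + repulsive term (grad of K).
        phi += K @ grad_logp + (K[:, :, None] * diffs).sum(axis=1) / h ** 2
    phi /= n * len(bandwidths)
    return X + eps * phi
```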
arXiv Detail & Related papers (2021-07-20T08:48:42Z)
- Fast Batch Nuclear-norm Maximization and Minimization for Robust Domain Adaptation [154.2195491708548]
We analyze prediction discriminability and diversity through the structure of the classification output matrix of a randomly selected data batch.
We propose Batch Nuclear-norm Maximization and Minimization, which applies the nuclear norm to the target output matrix to enhance the target prediction ability.
Experiments show that our method could boost the adaptation accuracy and robustness under three typical domain adaptation scenarios.
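The core objective is compact enough to sketch: maximize the nuclear norm of the batch prediction matrix (negated here so it can be minimized with a standard optimizer). This shows only the plain objective, not the fast approximation the title refers to.

```python
import torch

def bnm_loss(logits):
    # logits: (batch, num_classes). A larger nuclear norm of the softmax
    # matrix encourages predictions that are both confident and diverse.
    probs = torch.softmax(logits, dim=1)
    return -torch.linalg.matrix_norm(probs, ord='nuc')
```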
arXiv Detail & Related papers (2021-07-13T15:08:32Z)
- Kernel Identification Through Transformers [54.3795894579111]
Kernel selection plays a central role in determining the performance of Gaussian Process (GP) models.
This work addresses the challenge of constructing custom kernel functions for high-dimensional GP regression models.
We introduce a novel approach named KITT: Kernel Identification Through Transformers.
arXiv Detail & Related papers (2021-06-15T14:32:38Z)
- Fast Distributionally Robust Learning with Variance Reduced Min-Max Optimization [85.84019017587477]
Distributionally robust supervised learning is emerging as a key paradigm for building reliable machine learning systems for real-world applications.
Existing algorithms for solving Wasserstein DRSL involve solving complex subproblems or fail to make use of gradients.
We revisit Wasserstein DRSL through the lens of min-max optimization and derive scalable and efficiently implementable extra-gradient algorithms.
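For reference, a single extra-gradient update for a generic min-max problem is sketched below; the paper's contribution layers variance reduction on top, which this plain version omits.

```python
import numpy as np

def extragradient_step(x, y, grad_x, grad_y, eta=0.1):
    # One extra-gradient update for min_x max_y f(x, y).
    # grad_x, grad_y: callables returning df/dx and df/dy at (x, y).
    x_mid = x - eta * grad_x(x, y)           # lookahead (prediction) step
    y_mid = y + eta * grad_y(x, y)
    x_new = x - eta * grad_x(x_mid, y_mid)   # correction step at midpoint
    y_new = y + eta * grad_y(x_mid, y_mid)
    return x_new, y_new
```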
arXiv Detail & Related papers (2021-04-27T16:56:09Z)
- Flow-based Kernel Prior with Application to Blind Super-Resolution [143.21527713002354]
Kernel estimation is generally one of the key problems for blind image super-resolution (SR).
This paper proposes a normalizing flow-based kernel prior (FKP) for kernel modeling.
Experiments on synthetic and real-world images demonstrate that the proposed FKP can significantly improve the kernel estimation accuracy.
arXiv Detail & Related papers (2021-03-29T22:37:06Z)
- Kernel k-Means, By All Means: Algorithms and Strong Consistency [21.013169939337583]
Kernel $k$-means clustering is a powerful tool for unsupervised learning of non-linearly separable data.
In this paper, we generalize kernel $k$-means by leveraging a general family of means to combat sub-optimal local solutions.
Our algorithm makes use of majorization-minimization (MM) to better solve this non-linear separation problem.
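As a baseline for the setting the paper generalizes, plain kernel k-means over a Gram matrix is sketched below; the paper replaces the hard assignment with an MM scheme over a general family of means, which this sketch does not implement.

```python
import numpy as np

def kernel_kmeans(K, k, n_iter=100, seed=0):
    # K: (n, n) PSD Gram matrix; k: number of clusters.
    n = K.shape[0]
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, k, size=n)
    diag = np.diag(K)
    for _ in range(n_iter):
        dist = np.full((n, k), np.inf)
        for c in range(k):
            idx = labels == c
            m = idx.sum()
            if m == 0:
                continue
            # ||phi(x_i) - mu_c||^2 expanded with the kernel trick.
            dist[:, c] = (diag - 2 * K[:, idx].sum(1) / m
                          + K[np.ix_(idx, idx)].sum() / m**2)
        new_labels = dist.argmin(1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels
```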
arXiv Detail & Related papers (2020-11-12T16:07:18Z)