Alignment Entropy Regularization
- URL: http://arxiv.org/abs/2212.12442v1
- Date: Thu, 22 Dec 2022 18:51:02 GMT
- Title: Alignment Entropy Regularization
- Authors: Ehsan Variani, Ke Wu, David Rybach, Cyril Allauzen, Michael Riley
- Abstract summary: We use entropy to measure a model's uncertainty.
We evaluate the effect of entropy regularization in encouraging the model to distribute the probability mass only on a smaller subset of allowed alignments.
- Score: 13.904347165738491
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing training criteria in automatic speech recognition (ASR) permit the
model to freely explore more than one time alignment between the feature and
label sequences. In this paper, we use entropy to measure a model's
uncertainty, i.e. how it chooses to distribute the probability mass over the
set of allowed alignments. Furthermore, we evaluate the effect of entropy
regularization in encouraging the model to distribute the probability mass only
on a smaller subset of allowed alignments. Experiments show that entropy
regularization enables a much simpler decoding method without sacrificing word
error rate, and provides better time alignment quality.
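As a minimal, hedged sketch of the idea (not the paper's implementation), the entropy of the model's distribution over allowed alignments can be computed from per-alignment log-scores and added as a penalty to the usual negative log-likelihood; the function names and the weight `lam` are illustrative:

```python
import math

def alignment_entropy(log_scores):
    """Entropy of the distribution over allowed alignments.

    log_scores: unnormalized log-scores, one per allowed alignment.
    """
    # Normalize into a probability distribution (log-sum-exp for stability).
    m = max(log_scores)
    z = sum(math.exp(s - m) for s in log_scores)
    probs = [math.exp(s - m) / z for s in log_scores]
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def regularized_loss(log_scores, lam=0.1):
    """NLL of the label sequence plus an entropy penalty on alignments."""
    # log P(y|x) is the log-sum over all allowed alignments.
    m = max(log_scores)
    nll = -(m + math.log(sum(math.exp(s - m) for s in log_scores)))
    # Penalizing entropy pushes probability mass onto fewer alignments.
    return nll + lam * alignment_entropy(log_scores)
```

With `lam = 0` this reduces to the usual marginal NLL; increasing `lam` encourages a peaked alignment distribution, which is what enables the simpler decoding reported in the abstract.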
Related papers
- Learnable Chernoff Baselines for Inference-Time Alignment [64.81256817158851]
We introduce Learnable Chernoff Baselines as a method for efficiently and approximately sampling from exponentially tilted kernels.
We establish total-variation guarantees to the ideal aligned model, and demonstrate in both continuous and discrete diffusion settings that LCB sampling closely matches ideal rejection sampling.
arXiv Detail & Related papers (2026-02-08T00:09:40Z) - Rethinking Test-Time Training: Tilting The Latent Distribution For Few-Shot Source-Free Adaptation [3.5808917363708743]
We study test-time adaptation of foundation models for few-shot classification under a completely frozen-model regime.
We propose arguably the first training-free inference method that adapts predictions to the new task by performing a change of measure over the latent embedding distribution induced by the encoder.
arXiv Detail & Related papers (2026-02-02T18:17:29Z) - Probability-Entropy Calibration: An Elastic Indicator for Adaptive Fine-tuning [55.2818264614932]
RankTuner introduces a probability-entropy calibration signal, the Relative Rank Indicator, which compares the rank of the ground-truth token with its expected rank under the prediction distribution.
The inverse indicator is used as a token-wise Relative Scale to reweight the fine-tuning objective, focusing updates on truly under-learned tokens.
arXiv Detail & Related papers (2026-02-02T07:27:19Z) - Distributional Properties of Subword Regularization [25.824110425757198]
BPE and MaxMatch, two popular subword tokenization schemes, have dropout regularization variants.
We show that these variants are heavily biased towards a small set of tokenizations per word.
We propose an algorithm to uniformly sample tokenizations, which we use as a drop-in replacement for the stochastic aspects of existing tokenizers.
arXiv Detail & Related papers (2024-08-21T08:53:35Z) - Decoding-time Realignment of Language Models [44.54462397717971]
We propose a simple method to explore and evaluate different regularization strengths in aligned models without retraining.
DeRa enables control over the degree of alignment, allowing users to smoothly transition between unaligned and aligned models.
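The snippet below is a hedged sketch of the interpolation idea behind decoding-time realignment, assuming a geometric mixture of the unaligned (reference) and aligned next-token distributions, which is linear interpolation in logit space; the function name and the convention that `lam=0` recovers the reference model are illustrative, not the paper's API:

```python
import math

def dera_logprobs(ref_logits, aligned_logits, lam):
    """Log-probabilities of a geometric mixture of two models.

    lam=0 -> reference (unaligned) model, lam=1 -> aligned model;
    intermediate values trade off alignment strength without retraining.
    """
    # Linear interpolation of logits = geometric mixture of distributions.
    mixed = [(1.0 - lam) * r + lam * a
             for r, a in zip(ref_logits, aligned_logits)]
    # Renormalize via log-sum-exp.
    m = max(mixed)
    logz = m + math.log(sum(math.exp(x - m) for x in mixed))
    return [x - logz for x in mixed]
```

At decode time one would apply this per step to the two models' logits, sweeping `lam` to explore different effective regularization strengths.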
arXiv Detail & Related papers (2024-02-05T13:31:28Z) - Revisiting the Entropy Semiring for Neural Speech Recognition [17.408741279118857]
We show how alignment entropy can be used to supervise models through regularization or distillation.
We also contribute an open-source implementation of CTC and RNN-T in the semiring framework.
arXiv Detail & Related papers (2023-12-13T05:22:45Z) - Hard Nominal Example-aware Template Mutual Matching for Industrial Anomaly Detection [74.9262846410559]
We propose Hard Nominal Example-aware Template Mutual Matching (HETMM).
HETMM aims to construct a robust prototype-based decision boundary, which can precisely distinguish between hard-nominal examples and anomalies.
arXiv Detail & Related papers (2023-03-28T17:54:56Z) - Regularized Vector Quantization for Tokenized Image Synthesis [126.96880843754066]
Quantizing images into discrete representations has been a fundamental problem in unified generative modeling.
Deterministic quantization suffers from severe codebook collapse and misalignment with the inference stage, while stochastic quantization suffers from low codebook utilization and a perturbed reconstruction objective.
This paper presents a regularized vector quantization framework that mitigates the above issues effectively by applying regularization from two perspectives.
arXiv Detail & Related papers (2023-03-11T15:20:54Z) - Debiased Fine-Tuning for Vision-language Models by Prompt Regularization [50.41984119504716]
We present a new paradigm for fine-tuning large-scale vision pre-trained models on downstream tasks, dubbed Prompt Regularization (ProReg).
ProReg uses the prediction by prompting the pretrained model to regularize the fine-tuning.
We show the consistently strong performance of ProReg compared with conventional fine-tuning, zero-shot prompt, prompt tuning, and other state-of-the-art methods.
arXiv Detail & Related papers (2023-01-29T11:53:55Z) - Self-Supervised Training with Autoencoders for Visual Anomaly Detection [61.62861063776813]
We focus on a specific use case in anomaly detection where the distribution of normal samples is supported by a lower-dimensional manifold.
We adapt a self-supervised learning regime that exploits discriminative information during training but focuses on the submanifold of normal examples.
We achieve a new state-of-the-art result on the MVTec AD dataset -- a challenging benchmark for visual anomaly detection in the manufacturing domain.
arXiv Detail & Related papers (2022-06-23T14:16:30Z) - Training on Test Data with Bayesian Adaptation for Covariate Shift [96.3250517412545]
Deep neural networks often make inaccurate predictions with unreliable uncertainty estimates.
We derive a Bayesian model that provides for a well-defined relationship between unlabeled inputs under distributional shift and model parameters.
We show that our method improves both accuracy and uncertainty estimation.
arXiv Detail & Related papers (2021-09-27T01:09:08Z) - Causally-motivated Shortcut Removal Using Auxiliary Labels [63.686580185674195]
A key challenge to learning such risk-invariant predictors is shortcut learning.
We propose a flexible, causally-motivated approach to address this challenge.
We show both theoretically and empirically that this causally-motivated regularization scheme yields robust predictors.
arXiv Detail & Related papers (2021-05-13T16:58:45Z) - Calibrated Language Model Fine-Tuning for In- and Out-of-Distribution Data [42.58055728867802]
Fine-tuned pre-trained language models can suffer from severe miscalibration for both in-distribution and out-of-distribution data.
We propose a regularized fine-tuning method to mitigate this issue.
Our method outperforms existing calibration methods for text classification in terms of expectation calibration error, misclassification detection, and OOD detection on six datasets.
arXiv Detail & Related papers (2020-10-22T07:48:38Z) - Generalized Entropy Regularization or: There's Nothing Special about Label Smoothing [83.78668073898001]
We introduce a family of entropy regularizers, which includes label smoothing as a special case.
We find that variance in model performance can be explained largely by the resulting entropy of the model.
We advise the use of other entropy regularization methods in its place.
arXiv Detail & Related papers (2020-05-02T12:46:28Z)
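As a small illustrative sketch of the special case that paper generalizes, standard label smoothing mixes the one-hot target with a uniform distribution inside the cross-entropy; the function name and `eps` convention here are mine, not the paper's:

```python
import math

def smoothed_cross_entropy(logits, target, eps=0.1):
    """Cross-entropy against a label-smoothed target distribution.

    The one-hot target gets weight (1 - eps); the remaining eps mass is
    spread uniformly over all classes, penalizing over-confident outputs.
    """
    # Log-softmax via log-sum-exp for numerical stability.
    m = max(logits)
    logz = m + math.log(sum(math.exp(x - m) for x in logits))
    logp = [x - logz for x in logits]
    v = len(logits)
    return -sum(((1.0 - eps) * float(i == target) + eps / v) * lp
                for i, lp in enumerate(logp))
```

With `eps = 0` this reduces to the ordinary negative log-likelihood; any `eps > 0` adds an entropy-raising penalty on confident predictions, which is the sense in which label smoothing is one member of the entropy-regularizer family.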
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.