Related papers: Token-Level Uncertainty-Aware Objective for Language Model Post-Training

Token-Level Uncertainty-Aware Objective for Language Model Post-Training

URL: http://arxiv.org/abs/2503.16511v1
Date: Sat, 15 Mar 2025 00:32:14 GMT
Title: Token-Level Uncertainty-Aware Objective for Language Model Post-Training
Authors: Tingkai Liu, Ari S. Benjamin, Anthony M. Zador,
Abstract summary: We connect token-level uncertainty in causal language modeling to two types of training objectives: 1) masked maximum likelihood (MLE), 2) self-distillation.<n>We show that masked MLE is effective in reducing epistemic uncertainty, and serve as an effective token-level automatic curriculum learning technique.<n>However, masked MLE is prone to overfitting and requires self-distillation regularization to improve or maintain performance on out-of-distribution tasks.
Score: 2.5671111123644894
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In the current work, we connect token-level uncertainty in causal language modeling to two types of training objectives: 1) masked maximum likelihood (MLE), 2) self-distillation. We show that masked MLE is effective in reducing epistemic uncertainty, and serve as an effective token-level automatic curriculum learning technique. However, masked MLE is prone to overfitting and requires self-distillation regularization to improve or maintain performance on out-of-distribution tasks. We demonstrate significant performance gain via the proposed training objective - combined masked MLE and self-distillation - across multiple architectures (Gemma, LLaMA, Phi) and datasets (Alpaca, ShareGPT, GSM8K), mitigating overfitting while maintaining adaptability during post-training. Our findings suggest that uncertainty-aware training provides an effective mechanism for enhancing language model training.

Related papers

Boosting Medical Vision-Language Pretraining via Momentum Self-Distillation under Limited Computing Resources [0.0]
In medical healthcare, obtaining detailed annotations is challenging, highlighting the need for robust Vision-Language Models (VLMs)<n>We focus on leveraging the momentum method combined with distillation to simultaneously address computational efficiency and knowledge exploitation.<n>Our method attains competitive performance with state-of-the-art (SOTA) approaches in zero-shot classification, while providing a substantial boost in the few-shot adaption.
arXiv Detail & Related papers (2025-12-02T05:53:51Z)
Reliable Unlearning Harmful Information in LLMs with Metamorphosis Representation Projection [17.369869625390894]
We propose a Metamorphosis Representation Projection (MRP) approach to machine unlearning.<n>By implementing projective transformations in the hidden state space of specific network layers, our method effectively eliminates harmful information while preserving useful knowledge.<n> Experimental results demonstrate that our approach enables effective continuous unlearning and successfully defends against relearning attacks.
arXiv Detail & Related papers (2025-08-21T11:12:09Z)
From Generator to Embedder: Harnessing Innate Abilities of Multimodal LLMs via Building Zero-Shot Discriminative Embedding Model [29.879983760203256]
Multimodal Large Language Models (MLLMs) have emerged as a promising solution for universal embedding tasks.<n>But adapting their generative nature for discriminative representation learning remains a significant challenge.<n>We propose an efficient framework for universal multimodal embeddings, which bridges the gap by centering on two synergistic components.
arXiv Detail & Related papers (2025-08-01T07:31:24Z)
ESLM: Risk-Averse Selective Language Modeling for Efficient Pretraining [53.893792844055106]
Large language model pretraining is compute-intensive, yet many tokens contribute marginally to learning, resulting in inefficiency.<n>We introduce Selective Efficient Language Modeling, a risk-aware algorithm that improves training efficiency and distributional robustness by performing online token-level batch selection.<n> Experiments on GPT-2 pretraining show that ESLM significantly reduces training FLOPs while maintaining or improving both perplexity and downstream performance compared to baselines.
arXiv Detail & Related papers (2025-05-26T12:23:26Z)
UniErase: Unlearning Token as a Universal Erasure Primitive for Language Models [54.75551043657238]
We introduce UniErase, a novel unlearning paradigm that employs learnable parametric suffix (unlearning token) to steer language models toward targeted forgetting behaviors.<n>UniErase achieves state-of-the-art (SOTA) performance across batch, sequential, and precise unlearning under fictitious and real-world knowledge settings.
arXiv Detail & Related papers (2025-05-21T15:53:28Z)
Tokens for Learning, Tokens for Unlearning: Mitigating Membership Inference Attacks in Large Language Models via Dual-Purpose Training [13.680205342714412]
Large language models (LLMs) have become the backbone of modern natural language processing but pose privacy concerns about leaking sensitive training data.<n>We propose a lightweight yet effective empirical privacy defense for protecting training data of language modeling by leveraging the token-specific characteristics.
arXiv Detail & Related papers (2025-02-27T03:37:45Z)
Protecting Privacy Through Approximating Optimal Parameters for Sequence Unlearning in Language Models [37.172662930947446]
Language models (LMs) are potentially vulnerable to extraction attacks, which represent a significant privacy risk. We propose Privacy Protection via Optimal Parameters (POP), a novel unlearning method that effectively forgets the target token sequences from the pretrained LM. POP exhibits remarkable retention performance post-unlearning across 9 classification and 4 dialogue benchmarks, outperforming the state-of-the-art by a large margin.
arXiv Detail & Related papers (2024-06-20T08:12:49Z)
Towards Effective Evaluations and Comparisons for LLM Unlearning Methods [97.2995389188179]
This paper seeks to refine the evaluation of machine unlearning for large language models.<n>It addresses two key challenges -- the robustness of evaluation metrics and the trade-offs between competing goals.
arXiv Detail & Related papers (2024-06-13T14:41:00Z)
Efficient Adversarial Training in LLMs with Continuous Attacks [99.5882845458567]
Large language models (LLMs) are vulnerable to adversarial attacks that can bypass their safety guardrails. We propose a fast adversarial training algorithm (C-AdvUL) composed of two losses. C-AdvIPO is an adversarial variant of IPO that does not require utility data for adversarially robust alignment.
arXiv Detail & Related papers (2024-05-24T14:20:09Z)
Pre-training Language Model as a Multi-perspective Course Learner [103.17674402415582]
This study proposes a multi-perspective course learning (MCL) method for sample-efficient pre-training. In this study, three self-supervision courses are designed to alleviate inherent flaws of "tug-of-war" dynamics. Our method significantly improves ELECTRA's average performance by 2.8% and 3.2% absolute points respectively on GLUE and SQuAD 2.0 benchmarks.
arXiv Detail & Related papers (2023-05-06T09:02:10Z)
Meta-Learning with Self-Improving Momentum Target [72.98879709228981]
We propose Self-improving Momentum Target (SiMT) to improve the performance of a meta-learner. SiMT generates the target model by adapting from the temporal ensemble of the meta-learner. We show that SiMT brings a significant performance gain when combined with a wide range of meta-learning methods.
arXiv Detail & Related papers (2022-10-11T06:45:15Z)
On Fast Adversarial Robustness Adaptation in Model-Agnostic Meta-Learning [100.14809391594109]
Model-agnostic meta-learning (MAML) has emerged as one of the most successful meta-learning techniques in few-shot learning. Despite the generalization power of the meta-model, it remains elusive that how adversarial robustness can be maintained by MAML in few-shot learning. We propose a general but easily-optimized robustness-regularized meta-learning framework, which allows the use of unlabeled data augmentation, fast adversarial attack generation, and computationally-light fine-tuning.
arXiv Detail & Related papers (2021-02-20T22:03:04Z)
Effective Unsupervised Domain Adaptation with Adversarially Trained Language Models [54.569004548170824]
We show that careful masking strategies can bridge the knowledge gap of masked language models. We propose an effective training strategy by adversarially masking out those tokens which are harder to adversarial by the underlying.
arXiv Detail & Related papers (2020-10-05T01:49:47Z)

This list is automatically generated from the titles and abstracts of the papers in this site.