How much do LLMs learn from negative examples?
- URL: http://arxiv.org/abs/2503.14391v1
- Date: Tue, 18 Mar 2025 16:26:29 GMT
- Title: How much do LLMs learn from negative examples?
- Authors: Shadi Hamdan, Deniz Yuret
- Abstract summary: It is during the final phase that large language models are exposed to negative examples. This paper delves into the role of negative examples in the training of LLMs.
- Score: 2.510320860734906
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) undergo a three-phase training process: unsupervised pre-training, supervised fine-tuning (SFT), and learning from human feedback (RLHF/DPO). Notably, it is during the final phase that these models are exposed to negative examples -- incorrect, rejected, or suboptimal responses to queries. This paper delves into the role of negative examples in the training of LLMs, using a likelihood-ratio (Likra) model on multiple-choice question answering benchmarks to precisely manage the influence and the volume of negative examples. Our findings reveal three key insights: (1) During a critical phase in training, Likra with negative examples demonstrates a significantly larger improvement per training example compared to SFT using only positive examples. This leads to a sharp jump in the learning curve for Likra unlike the smooth and gradual improvement of SFT; (2) negative examples that are plausible but incorrect (near-misses) exert a greater influence; and (3) while training with positive examples fails to significantly decrease the likelihood of plausible but incorrect answers, training with negative examples more accurately identifies them. These results indicate a potentially significant role for negative examples in improving accuracy and reducing hallucinations for LLMs.
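The scoring idea behind a likelihood-ratio model can be sketched directly. The following is a minimal, illustrative sketch, not the authors' implementation: it assumes two causal language models, a hypothetical `pos_model` fine-tuned on correct answers and a `neg_model` fine-tuned on negative examples, and ranks multiple-choice candidates by the difference of their answer log-likelihoods. All names (`answer_logprob`, `likelihood_ratio_score`) are illustrative; the paper's exact parameterization is not reproduced here.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def answer_logprob(model, tokenizer, question: str, answer: str) -> float:
    """Log-probability of the answer tokens conditioned on the question.

    Assumes the tokenization of `question` is a prefix of the tokenization
    of `question + answer`; production code should align tokens carefully.
    """
    q_len = tokenizer(question, return_tensors="pt").input_ids.shape[1]
    full = tokenizer(question + answer, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full).logits
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)  # predicts tokens 1..T-1
    targets = full[0, 1:]
    # Sum log-probabilities over the answer positions only.
    return sum(log_probs[t, targets[t]].item()
               for t in range(q_len - 1, full.shape[1] - 1))

def likelihood_ratio_score(pos_model, neg_model, tokenizer, question, choice):
    # High when the positively-trained model finds the choice likely and the
    # negatively-trained model does not -- penalizing plausible near-misses.
    return (answer_logprob(pos_model, tokenizer, question, choice)
            - answer_logprob(neg_model, tokenizer, question, choice))
```

Ranking choices by this ratio, rather than by the positive model's likelihood alone, is what lets negative examples actively push down plausible-but-incorrect answers instead of merely boosting the correct one.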
Related papers
- Dissecting Long Reasoning Models: An Empirical Study [94.31064312707211]
We systematically analyze the roles of positive and negative samples in reinforcement learning (RL).
We identify substantial data inefficiency in group relative policy optimization, where over half of the samples yield zero advantage.
We investigate unstable performance across various reasoning models and benchmarks, attributing instability to uncertain problems with ambiguous outcomes.
arXiv Detail & Related papers (2025-06-05T11:47:10Z)
- TinyV: Reducing False Negatives in Verification Improves RL for LLM Reasoning [11.573904453859098]
Reinforcement Learning (RL) has become a powerful tool for enhancing the reasoning abilities of large language models (LLMs).
Yet, RL's success relies on the reliability of rewards, which are provided by verifiers.
In this paper, we expose and analyze a widespread problem -- false negatives -- where verifiers wrongly reject correct model outputs.
We propose TinyV, a lightweight LLM-based verifier that augments existing rule-based methods.
arXiv Detail & Related papers (2025-05-20T17:16:44Z)
- Contrastive Learning with Negative Sampling Correction [52.990001829393506]
We propose a novel contrastive learning method named Positive-Unlabeled Contrastive Learning (PUCL)
PUCL treats the generated negative samples as unlabeled samples and uses information from positive samples to correct bias in contrastive loss.
PUCL can be applied to general contrastive learning problems and outperforms state-of-the-art methods on various image and graph classification tasks.
arXiv Detail & Related papers (2024-01-13T11:18:18Z)
- Turning Dust into Gold: Distilling Complex Reasoning Capabilities from LLMs by Leveraging Negative Data [15.088675135566646]
Large Language Models (LLMs) have performed well on various reasoning tasks, but their inaccessibility and numerous parameters hinder wide application in practice.
We propose a model specialization framework to distill LLMs with negative samples besides positive ones.
We conduct extensive experiments across arithmetic reasoning tasks to demonstrate the role of negative data in distillation from LLMs.
arXiv Detail & Related papers (2023-12-20T08:28:36Z)
- Your Negative May not Be True Negative: Boosting Image-Text Matching with False Negative Elimination [62.18768931714238]
We propose a novel False Negative Elimination (FNE) strategy to select negatives via sampling.
The results demonstrate the superiority of our proposed false negative elimination strategy.
arXiv Detail & Related papers (2023-08-08T16:31:43Z)
- Language Model Pre-training on True Negatives [109.73819321246062]
Discriminative pre-trained language models (PLMs) learn to predict original texts from intentionally corrupted ones.
Existing PLMs simply treat all corrupted texts as equal negatives without any examination.
We design enhanced pre-training methods to counteract false negative predictions and encourage pre-training language models on true negatives.
arXiv Detail & Related papers (2022-12-01T12:24:19Z)
- Investigating the Role of Negatives in Contrastive Representation Learning [59.30700308648194]
Noise contrastive learning is a popular technique for unsupervised representation learning.
We focus on disambiguating the role of one of these parameters: the number of negative examples.
We find that the results broadly agree with our theory, while our vision experiments are murkier, with performance sometimes even insensitive to the number of negatives.
arXiv Detail & Related papers (2021-06-18T06:44:16Z)
- Rethinking InfoNCE: How Many Negative Samples Do You Need? [54.146208195806636]
We study how many negative samples are optimal for InfoNCE in different scenarios via a semi-quantitative theoretical framework.
We estimate the optimal negative sampling ratio using the $K$ value that maximizes the training effectiveness function (a generic InfoNCE sketch appears after this list).
arXiv Detail & Related papers (2021-05-27T08:38:29Z)
- NPCFace: Negative-Positive Collaborative Training for Large-scale Face Recognition [78.21084529159577]
We study how to make better use of hard samples to improve training.
The correlation between hard positives and hard negatives is overlooked, as is the relation between the margins in positive and negative logits.
We propose a novel Negative-Positive Collaboration loss, named NPCFace, which emphasizes the training on both negative and positive hard cases.
arXiv Detail & Related papers (2020-07-20T14:52:29Z)
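For reference, the InfoNCE objective analyzed in the Rethinking InfoNCE entry above is standard. The sketch below is generic, not that paper's code, and its training-effectiveness function for choosing $K$ is not reproduced; it only shows where the number of negatives $K$ enters the loss.

```python
import torch
import torch.nn.functional as F

def info_nce(query, positive, negatives, temperature=0.07):
    """InfoNCE loss for a single anchor with K negatives.

    query:     (d,)   anchor embedding
    positive:  (d,)   embedding of the positive example
    negatives: (K, d) embeddings of the K negative examples
    """
    q = F.normalize(query, dim=0)
    pos = F.normalize(positive, dim=0)
    negs = F.normalize(negatives, dim=1)
    # One positive logit followed by K negative logits, temperature-scaled.
    logits = torch.cat([(q @ pos).view(1), negs @ q]) / temperature
    # Cross-entropy against index 0: the anchor must rank its positive
    # above all K negatives; K is the knob that paper studies.
    return F.cross_entropy(logits.view(1, -1), torch.zeros(1, dtype=torch.long))
```

Called as `info_nce(z_q, z_pos, z_negs)` on encoder embeddings; the paper's contribution is a principled way to set $K$, not the loss itself.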
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.