Mining the Explainability and Generalization: Fact Verification Based on Self-Instruction
- URL: http://arxiv.org/abs/2405.12579v2
- Date: Thu, 23 May 2024 08:02:37 GMT
- Title: Mining the Explainability and Generalization: Fact Verification Based on Self-Instruction
- Authors: Guangyao Lu, Yulin Liu
- Abstract summary: We propose a self-instruction based fine-tuning approach for fact-checking that balances accuracy and explainability.
We fine-tune the smallest-scale LLaMA-7B model and evaluate it on the challenging fact-checking datasets FEVEROUS and HOVER.
Our method is the first to leverage self-supervised learning for fact-checking and innovatively combines contrastive learning and improved DPO in fine-tuning LLMs.
- Score: 0.7673339435080445
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Fact-checking based on commercial LLMs has become mainstream. Although these methods offer high explainability, they fall short of traditional fine-tuning approaches in accuracy, and data security is also a significant concern. In this paper, we propose a self-instruction based fine-tuning approach for fact-checking that balances accuracy and explainability. Our method consists of Data Augmentation and Improved DPO fine-tuning. The former instructs the model to generate both positive and negative explanations from claim-evidence pairs and labels, then samples the dataset according to our customized difficulty standards. The latter employs our proposed improved DPO to fine-tune the model on the generated samples. We fine-tune the smallest-scale LLaMA-7B model and evaluate it on the challenging fact-checking datasets FEVEROUS and HOVER, using four fine-tuning methods and three few-shot learning methods for comparison. The experiments demonstrate that our approach not only achieves accuracy comparable to, or even surpassing, traditional fine-tuning methods, but also generates fluent explanation text. Moreover, it exhibits strong generalization performance. Our method is the first to leverage self-supervised learning for fact-checking and innovatively combines contrastive learning and improved DPO in fine-tuning LLMs, as shown in the experiments.
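The two-stage pipeline described in the abstract (self-instructed positive/negative explanations, then preference optimization) can be illustrated with a minimal sketch. The snippet below uses the standard DPO objective as a stand-in for the authors' "improved DPO", whose exact modification is not detailed here; the `generate` callable, the `build_pair` and `dpo_loss` helpers, and the `beta` default are hypothetical illustrations, not the paper's implementation.

```python
import torch.nn.functional as F

def build_pair(claim, evidence, label, generate):
    """Self-instruction step (illustrative): prompt the base model for a
    label-consistent (positive) and a label-contradicting (negative)
    explanation for one claim-evidence pair."""
    prompt = f"Claim: {claim}\nEvidence: {evidence}\nVerdict: {label}\n"
    chosen = generate(prompt + "Explain why this verdict is correct.")
    rejected = generate(prompt + "Write a plausible explanation for the opposite verdict.")
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Standard DPO objective over (chosen, rejected) explanation pairs;
    the paper fine-tunes with an improved variant of this loss."""
    chosen_reward = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_reward = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the margin between the positive and negative explanation.
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()
```

Per the abstract, the generated pairs would then be filtered by the customized difficulty standards before fine-tuning.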
Related papers
- A Bayesian Approach to Data Point Selection [24.98069363998565]
Data point selection (DPS) is becoming a critical topic in deep learning.
Existing approaches to DPS are predominantly based on a bi-level optimisation (BLO) formulation.
We propose a novel Bayesian approach to DPS.
arXiv Detail & Related papers (2024-11-06T09:04:13Z) - Enhancing Training Data Attribution for Large Language Models with Fitting Error Consideration [74.09687562334682]
We introduce a novel training data attribution method called Debias and Denoise Attribution (DDA)
Our method significantly outperforms existing approaches, achieving an averaged AUC of 91.64%.
DDA exhibits strong generality and scalability across various sources and different-scale models like LLaMA2, QWEN2, and Mistral.
arXiv Detail & Related papers (2024-10-02T07:14:26Z) - Data Adaptive Traceback for Vision-Language Foundation Models in Image Classification [34.37262622415682]
We propose a new adaptation framework called Data Adaptive Traceback.
Specifically, we utilize a zero-shot-based method to extract the most downstream task-related subset of the pre-training data.
We adopt a pseudo-label-based semi-supervised technique to reuse the pre-training images and a vision-language contrastive learning method to address the confirmation bias issue in semi-supervised learning.
arXiv Detail & Related papers (2024-07-11T18:01:58Z) - Knowledge Editing in Language Models via Adapted Direct Preference Optimization [50.616875565173274]
Large Language Models (LLMs) can become outdated over time.
Knowledge Editing aims to overcome this challenge using weight updates that do not require expensive retraining.
arXiv Detail & Related papers (2024-06-14T11:02:21Z) - Uncertainty Aware Learning for Language Model Alignment [97.36361196793929]
We propose uncertainty-aware learning (UAL) to improve model alignment across different task scenarios.
We implement UAL in a simple fashion: adaptively setting the label smoothing value during training according to the uncertainty of individual samples.
Experiments on widely used benchmarks demonstrate that our UAL significantly and consistently outperforms standard supervised fine-tuning.
arXiv Detail & Related papers (2024-06-07T11:37:45Z) - Preference Learning Algorithms Do Not Learn Preference Rankings [62.335733662381884]
We study the conventional wisdom that preference learning trains models to assign higher likelihoods to more preferred outputs than less preferred outputs.
We find that most state-of-the-art preference-tuned models achieve a ranking accuracy of less than 60% on common preference datasets.
arXiv Detail & Related papers (2024-05-29T21:29:44Z) - Low-rank finetuning for LLMs: A fairness perspective [54.13240282850982]
Low-rank approximation techniques have become the de facto standard for fine-tuning Large Language Models.
This paper investigates the effectiveness of these methods in capturing the shift of fine-tuning datasets from the initial pre-trained data distribution.
We show that low-rank fine-tuning inadvertently preserves undesirable biases and toxic behaviors.
arXiv Detail & Related papers (2024-05-28T20:43:53Z) - Negative Preference Optimization: From Catastrophic Collapse to Effective Unlearning [28.059563581973432]
Large Language Models (LLMs) are often pre-trained on sensitive, private, or copyrighted data.
LLM unlearning aims to eliminate the influence of undesirable data from the pre-trained model.
We propose Negative Preference Optimization (NPO) as a simple alignment-inspired method that can efficiently unlearn a target dataset (see the loss sketch after this list).
arXiv Detail & Related papers (2024-04-08T21:05:42Z) - Improving the Adversarial Robustness of NLP Models by Information Bottleneck [112.44039792098579]
Non-robust features can be easily manipulated by adversaries to fool NLP models.
In this study, we explore the feasibility of capturing task-specific robust features, while eliminating the non-robust ones by using the information bottleneck theory.
We show that the models trained with our information bottleneck-based method are able to achieve a significant improvement in robust accuracy.
arXiv Detail & Related papers (2022-06-11T12:12:20Z)
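For the Negative Preference Optimization entry above, here is a minimal sketch of the NPO objective as commonly stated in that line of work: it keeps only the "rejected" half of the DPO loss, so unlearning on the forget set stays bounded rather than diverging as plain gradient ascent does. The helper name and the `beta` default are illustrative assumptions.

```python
import torch.nn.functional as F

def npo_loss(policy_forget_logps, ref_forget_logps, beta=0.1):
    """NPO on the forget set: minimize (2/beta) * log(1 + (pi_theta/pi_ref)^beta),
    pushing probability off the unwanted targets while keeping the loss bounded."""
    log_ratio = policy_forget_logps - ref_forget_logps
    # log(1 + exp(beta * log_ratio)) == softplus(beta * log_ratio)
    return (2.0 / beta) * F.softplus(beta * log_ratio).mean()
```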