Hallucination Detox: Sensitive Neuron Dropout (SeND) for Large Language Model Training
- URL: http://arxiv.org/abs/2410.15460v1
- Date: Sun, 20 Oct 2024 18:18:23 GMT
- Title: Hallucination Detox: Sensitive Neuron Dropout (SeND) for Large Language Model Training
- Authors: Shahrad Mohammadzadeh, Juan David Guerra, Marco Bonizzato, Reihaneh Rabbany, Golnoosh Farnadi,
- Abstract summary: We introduce SEnsitive Neuron Dropout (SeND), a novel training protocol designed to mitigate hallucinations by reducing variance during training.
In addition, we develop an unsupervised hallucination detection metric, Efficient EigenScore (EES), which approximates the traditional EigenScore in 2x speed.
- Score: 7.726825072908519
- License:
- Abstract: As large language models (LLMs) become increasingly deployed across various industries, concerns regarding their reliability, particularly due to hallucinations-outputs that are factually inaccurate or irrelevant to user input-have grown. Our research investigates the relationship between the training process and the emergence of hallucinations to address a key gap in existing research that focuses primarily on post hoc detection and mitigation strategies. Using models from the Pythia suite (70M-12B parameters) and several hallucination detection metrics, we analyze hallucination trends throughout training and explore LLM internal dynamics. We introduce SEnsitive Neuron Dropout (SeND), a novel training protocol designed to mitigate hallucinations by reducing variance during training. SeND achieves this by deterministically dropping neurons with significant variability on a dataset, referred to as Sensitive Neurons. In addition, we develop an unsupervised hallucination detection metric, Efficient EigenScore (EES), which approximates the traditional EigenScore in 2x speed. This efficient metric is integrated into our protocol, allowing SeND to be both computationally scalable and effective at reducing hallucinations. Our empirical evaluation demonstrates that our approach improves LLM reliability at test time by up to 40% compared to normal training while also providing an efficient method to improve factual accuracy when adapting LLMs to domains such as Wikipedia and Medical datasets.
Related papers
- Iter-AHMCL: Alleviate Hallucination for Large Language Model via Iterative Model-level Contrastive Learning [16.883679810267342]
Iterative Model-level Contrastive Learning (Iter-AHMCL) to address hallucination.
This paper introduces a novel approach called Iterative Model-level Contrastive Learning (Iter-AHMCL) to address hallucination.
arXiv Detail & Related papers (2024-10-16T00:15:40Z) - ReDeEP: Detecting Hallucination in Retrieval-Augmented Generation via Mechanistic Interpretability [27.325766792146936]
hallucinations caused by insufficient parametric (internal) knowledge.
Detecting such hallucinations requires disentangling how Large Language Models (LLMs) utilize external and parametric knowledge.
We propose ReDeEP, a novel method that detects hallucinations by decoupling LLM's utilization of external context and parametric knowledge.
arXiv Detail & Related papers (2024-10-15T09:02:09Z) - Discovering Long-Term Effects on Parameter Efficient Fine-tuning [36.83255498301937]
Pre-trained Artificial Neural Networks (Annns) exhibit robust pattern recognition capabilities.
Annns and BNNs share extensive similarities with the human brain, specifically Biological Neural Networks (BNNs)
Annns can acquire new knowledge through fine-tuning.
arXiv Detail & Related papers (2024-08-24T03:27:29Z) - Training Language Models on the Knowledge Graph: Insights on Hallucinations and Their Detectability [83.0884072598828]
Hallucinations come in many forms, and there is no universally accepted definition.
We focus on studying only those hallucinations where a correct answer appears verbatim in the training set.
We find that for a fixed dataset, larger and longer-trained LMs hallucinate less.
While we see detector size improves performance on fixed LM's outputs, we find an inverse relationship between the scale of the LM and the detectability of its hallucinations.
arXiv Detail & Related papers (2024-08-14T23:34:28Z) - Self-Supervised Pretext Tasks for Alzheimer's Disease Classification using 3D Convolutional Neural Networks on Large-Scale Synthetic Neuroimaging Dataset [11.173478552040441]
Alzheimer's Disease (AD) induces both localised and widespread neural degenerative changes throughout the brain.
In this work, we evaluated several unsupervised methods to train a feature extractor for downstream AD vs. CN classification.
arXiv Detail & Related papers (2024-06-20T11:26:32Z) - Retrieve Only When It Needs: Adaptive Retrieval Augmentation for Hallucination Mitigation in Large Language Models [68.91592125175787]
Hallucinations pose a significant challenge for the practical implementation of large language models (LLMs)
We present Rowen, a novel approach that enhances LLMs with a selective retrieval augmentation process tailored to address hallucinations.
arXiv Detail & Related papers (2024-02-16T11:55:40Z) - Reducing LLM Hallucinations using Epistemic Neural Networks [0.0]
We train an ENN on top of the Llama-2 7B model combined with a contrastive decoding feature enhancement technique.
We are the first to train an ENN for the next token prediction task and explore the efficacy of this method in reducing hallucinations on the TruthfulQA dataset.
arXiv Detail & Related papers (2023-12-25T01:17:01Z) - HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data [102.56792377624927]
hallucinations inherent in machine-generated data remain under-explored.
We present a novel hallucination detection and elimination framework, HalluciDoctor, based on the cross-checking paradigm.
Our method successfully mitigates 44.6% hallucinations relatively and maintains competitive performance compared to LLaVA.
arXiv Detail & Related papers (2023-11-22T04:52:58Z) - A New Benchmark and Reverse Validation Method for Passage-level
Hallucination Detection [63.56136319976554]
Large Language Models (LLMs) generate hallucinations, which can cause significant damage when deployed for mission-critical tasks.
We propose a self-check approach based on reverse validation to detect factual errors automatically in a zero-resource fashion.
We empirically evaluate our method and existing zero-resource detection methods on two datasets.
arXiv Detail & Related papers (2023-10-10T10:14:59Z) - Contrastive Learning Reduces Hallucination in Conversations [76.55116206021346]
We propose a contrastive learning scheme, named MixCL.
A novel mixed contrastive objective is proposed to explicitly optimize the implicit knowledge elicitation process of LMs.
We show that MixCL achieves comparable performance to state-of-the-art KB-based approaches.
arXiv Detail & Related papers (2022-12-20T16:26:18Z) - Detecting Parkinsonian Tremor from IMU Data Collected In-The-Wild using
Deep Multiple-Instance Learning [59.74684475991192]
Parkinson's Disease (PD) is a slowly evolving neuro-logical disease that affects about 1% of the population above 60 years old.
PD symptoms include tremor, rigidity and braykinesia.
We present a method for automatically identifying tremorous episodes related to PD, based on IMU signals captured via a smartphone device.
arXiv Detail & Related papers (2020-05-06T09:02:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.