In Praise of Stubbornness: The Case for Cognitive-Dissonance-Aware Knowledge Updates in LLMs
- URL: http://arxiv.org/abs/2502.04390v1
- Date: Wed, 05 Feb 2025 23:49:33 GMT
- Title: In Praise of Stubbornness: The Case for Cognitive-Dissonance-Aware Knowledge Updates in LLMs
- Authors: Simone Clemente, Zied Ben Houidi, Alexis Huet, Dario Rossi, Giulio Franzese, Pietro Michiardi
- Abstract summary: Large language models (LLMs) struggle to continually update their knowledge without catastrophic forgetting. Humans effortlessly integrate new information, detect conflicts with existing beliefs, and selectively update their mental models. This paper introduces a cognitive-inspired investigation paradigm to study continual knowledge updating in LLMs.
- Score: 12.126745558519737
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite remarkable capabilities, large language models (LLMs) struggle to continually update their knowledge without catastrophic forgetting. In contrast, humans effortlessly integrate new information, detect conflicts with existing beliefs, and selectively update their mental models. This paper introduces a cognitive-inspired investigation paradigm to study continual knowledge updating in LLMs. We implement two key components inspired by human cognition: (1) Dissonance and Familiarity Awareness, analyzing model behavior to classify information as novel, familiar, or dissonant; and (2) Targeted Network Updates, which track neural activity to identify frequently used (stubborn) and rarely used (plastic) neurons. Through carefully designed experiments in controlled settings, we uncover a number of empirical findings demonstrating the potential of this approach. First, dissonance detection is feasible using simple activation and gradient features, suggesting potential for cognitive-inspired training. Second, we find that non-dissonant updates largely preserve prior knowledge regardless of targeting strategy, revealing inherent robustness in LLM knowledge integration. Most critically, we discover that dissonant updates prove catastrophically destructive to the model's knowledge base, indiscriminately affecting even information unrelated to the current updates. This suggests fundamental limitations in how neural networks handle contradictions and motivates the need for new approaches to knowledge updating that better mirror human cognitive mechanisms.
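The abstract describes two concrete components: classifying incoming information as novel, familiar, or dissonant from simple activation and gradient features, and tracking neural activity to separate frequently used ("stubborn") from rarely used ("plastic") neurons. The sketch below is one plausible illustration of these ideas, not the authors' implementation; the helper names (collect_features, classify_update, neuron_usage), the feature choices, and the firing threshold are assumptions, and a Hugging Face-style causal language model and tokenizer are presumed.

```python
# Minimal sketch (not the paper's code) of the two components described above.
# Assumes a Hugging Face-style causal LM and tokenizer; all helper names,
# feature choices, and thresholds are hypothetical.
import torch


def collect_features(model, tokenizer, sentence):
    """Simple activation and gradient features for one candidate fact."""
    inputs = tokenizer(sentence, return_tensors="pt")
    outputs = model(**inputs, labels=inputs["input_ids"], output_hidden_states=True)
    outputs.loss.backward()
    # Mean hidden activation per layer, concatenated into one feature vector.
    act = torch.cat([h.mean(dim=(0, 1)) for h in outputs.hidden_states])
    # Global gradient norm as a single scalar feature.
    grad_norm = torch.stack(
        [p.grad.norm() for p in model.parameters() if p.grad is not None]
    ).norm()
    model.zero_grad()
    return torch.cat([act.detach(), grad_norm.detach().unsqueeze(0)])


def classify_update(features, classifier):
    """Small probe (trained separately) mapping features to an update type."""
    labels = ["novel", "familiar", "dissonant"]
    return labels[classifier(features).argmax().item()]


def neuron_usage(model, tokenizer, corpus, fire_threshold=0.0):
    """Count how often each last-layer hidden unit is active on a reference
    corpus; frequently used units ~ 'stubborn', rarely used ~ 'plastic'."""
    counts = None
    for sentence in corpus:
        inputs = tokenizer(sentence, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**inputs, output_hidden_states=True).hidden_states[-1]
        fired = (hidden > fire_threshold).float().sum(dim=(0, 1))
        counts = fired if counts is None else counts + fired
    return counts  # e.g. freeze high-count ('stubborn') units during updates
```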
Related papers
- NeuronTune: Towards Self-Guided Spurious Bias Mitigation [26.544938760265136]
Deep neural networks often develop spurious bias, i.e., reliance on correlations between non-essential features and classes for predictions. Existing mitigation approaches typically depend on external annotations of spurious correlations. We propose NeuronTune, a post hoc method that directly intervenes in a model's internal decision process.
arXiv Detail & Related papers (2025-05-29T22:33:00Z) - Semi-parametric Memory Consolidation: Towards Brain-like Deep Continual Learning [59.35015431695172]
We propose a novel biomimetic continual learning framework that integrates semi-parametric memory and the wake-sleep consolidation mechanism.
For the first time, our method enables deep neural networks to retain high performance on novel tasks while maintaining prior knowledge in real-world challenging continual learning scenarios.
arXiv Detail & Related papers (2025-04-20T19:53:13Z) - Hybrid Learners Do Not Forget: A Brain-Inspired Neuro-Symbolic Approach to Continual Learning [20.206972068340843]
Continual learning is crucial for creating AI agents that can learn and improve themselves autonomously.
Inspired by the two distinct systems in the human brain, we propose a Neuro-Symbolic Brain-Inspired Continual Learning framework.
arXiv Detail & Related papers (2025-03-16T20:09:19Z) - FaithUn: Toward Faithful Forgetting in Language Models by Investigating the Interconnectedness of Knowledge [24.858928681280634]
We define a new concept called superficial unlearning, which refers to the phenomenon where an unlearning method fails to erase interconnected knowledge.
Based on the definition, we introduce a new benchmark, FaithUn, to analyze and evaluate the faithfulness of unlearning in real-world knowledge QA settings.
We propose a novel unlearning method, KLUE, which updates only knowledge-related neurons to achieve faithful unlearning.
arXiv Detail & Related papers (2025-02-26T15:11:03Z) - How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training [92.88889953768455]
Large Language Models (LLMs) face a critical gap in understanding how they internalize new knowledge.
We identify computational subgraphs that facilitate knowledge storage and processing.
arXiv Detail & Related papers (2025-02-16T16:55:43Z) - Composite Learning Units: Generalized Learning Beyond Parameter Updates to Transform LLMs into Adaptive Reasoners [0.0]
We introduce Composite Learning Units (CLUs) designed to transform reasoners into learners capable of continuous learning.
CLUs are built on an architecture that allows a reasoning model to maintain and evolve a dynamic knowledge repository.
We demonstrate CLUs' effectiveness through a cryptographic reasoning task, where they continuously evolve their understanding through feedback to uncover hidden transformation rules.
arXiv Detail & Related papers (2024-10-09T02:27:58Z) - R-Tuning: Instructing Large Language Models to Say `I Don't Know' [66.11375475253007]
Large language models (LLMs) have revolutionized numerous domains with their impressive performance but still face challenges.
Previous instruction tuning methods force the model to complete a sentence no matter whether the model knows the knowledge or not.
We present a new approach called Refusal-Aware Instruction Tuning (R-Tuning)
Experimental results demonstrate R-Tuning effectively improves a model's ability to answer known questions and refrain from answering unknown questions.
arXiv Detail & Related papers (2023-11-16T08:45:44Z) - Improving the Reliability of Large Language Models by Leveraging Uncertainty-Aware In-Context Learning [76.98542249776257]
Large-scale language models often face the challenge of "hallucination".
We introduce an uncertainty-aware in-context learning framework to empower the model to enhance or reject its output in response to uncertainty.
arXiv Detail & Related papers (2023-10-07T12:06:53Z) - Fixed Inter-Neuron Covariability Induces Adversarial Robustness [26.878913741674058]
The vulnerability to adversarial perturbations is a major flaw of Deep Neural Networks (DNNs).
We have developed the Self-Consistent Activation (SCA) layer, which comprises neurons whose activations are consistent with each other, as they conform to a fixed, but learned, covariability pattern.
Models with an SCA layer achieved high accuracy and exhibited significantly greater robustness than multi-layer perceptron models to state-of-the-art Auto-PGD adversarial attacks, without being trained on adversarially perturbed data.
arXiv Detail & Related papers (2023-08-07T23:46:14Z) - Preserving Commonsense Knowledge from Pre-trained Language Models via Causal Inference [20.5696436171006]
Most existing studies attribute the loss of pre-trained knowledge during fine-tuning to catastrophic forgetting, and they retain that knowledge indiscriminately.
We frame fine-tuning into a causal graph and discover that the crux of catastrophic forgetting lies in the missing causal effects from the pretrained data.
In the experiments, our method outperforms state-of-the-art fine-tuning methods on all six commonsense QA datasets.
arXiv Detail & Related papers (2023-06-19T09:06:44Z) - Meta-Learning in Spiking Neural Networks with Reward-Modulated STDP [2.179313476241343]
We propose a bio-plausible meta-learning model inspired by the hippocampus and the prefrontal cortex.
Our new model can easily be applied to spike-based neuromorphic devices and enables fast learning in neuromorphic hardware.
arXiv Detail & Related papers (2023-06-07T13:08:46Z) - Mitigating Temporal Misalignment by Discarding Outdated Facts [58.620269228776294]
Large language models are often used under temporal misalignment, tasked with answering questions about the present.
We propose fact duration prediction: the task of predicting how long a given fact will remain true.
Our data and code are released publicly at https://github.com/mikejqzhang/mitigating_misalignment.
arXiv Detail & Related papers (2023-05-24T07:30:08Z) - Discover and Cure: Concept-aware Mitigation of Spurious Correlation [14.579651844642616]
Deep neural networks often rely on spurious correlations to make predictions.
We propose an interpretable framework, Discover and Cure (DISC) to tackle the issue.
DISC provides superior generalization ability and interpretability than the existing approaches.
arXiv Detail & Related papers (2023-05-01T04:19:27Z) - Enhancing Multiple Reliability Measures via Nuisance-extended Information Bottleneck [77.37409441129995]
In practical scenarios where training data is limited, many predictive signals in the data may instead stem from biases in data acquisition.
We consider an adversarial threat model under a mutual information constraint to cover a wider class of perturbations in training.
We propose an autoencoder-based training to implement the objective, as well as practical encoder designs to facilitate the proposed hybrid discriminative-generative training.
arXiv Detail & Related papers (2023-03-24T16:03:21Z) - Adaptively Integrated Knowledge Distillation and Prediction Uncertainty for Continual Learning [71.43841235954453]
Current deep learning models often suffer from catastrophic forgetting of old knowledge when continually learning new knowledge.
Existing strategies to alleviate this issue often fix the trade-off between keeping old knowledge (stability) and learning new knowledge (plasticity).
arXiv Detail & Related papers (2023-01-18T05:36:06Z) - Critical Learning Periods for Multisensory Integration in Deep Networks [112.40005682521638]
We show that the ability of a neural network to integrate information from diverse sources hinges critically on being exposed to properly correlated signals during the early phases of training.
We show that critical periods arise from the complex and unstable early transient dynamics, which are decisive for the final performance of the trained system and its learned representations.
arXiv Detail & Related papers (2022-10-06T23:50:38Z) - Anti-Retroactive Interference for Lifelong Learning [65.50683752919089]
We design a paradigm for lifelong learning based on meta-learning and associative mechanism of the brain.
It tackles the problem from two aspects: extracting knowledge and memorizing knowledge.
Theoretical analysis shows that the proposed learning paradigm can make the models of different tasks converge to the same optimum.
arXiv Detail & Related papers (2022-08-27T09:27:36Z) - Reducing Catastrophic Forgetting in Self Organizing Maps with Internally-Induced Generative Replay [67.50637511633212]
A lifelong learning agent is able to continually learn from potentially infinite streams of pattern sensory data.
One major historic difficulty in building agents that adapt is that neural systems struggle to retain previously-acquired knowledge when learning from new samples.
This problem is known as catastrophic forgetting (interference) and remains an unsolved problem in the domain of machine learning to this day.
arXiv Detail & Related papers (2021-12-09T07:11:14Z) - Association: Remind Your GAN not to Forget [11.653696510515807]
We propose a brain-like approach that imitates the associative learning process to achieve continual learning.
Experiments demonstrate the effectiveness of our method in alleviating catastrophic forgetting on image-to-image translation tasks.
arXiv Detail & Related papers (2020-11-27T04:43:15Z) - Artificial Neural Variability for Deep Learning: On Overfitting, Noise Memorization, and Catastrophic Forgetting [135.0863818867184]
Artificial neural variability (ANV) helps artificial neural networks learn some advantages from "natural" neural networks.
ANV acts as an implicit regularizer of the mutual information between the training data and the learned model.
It can effectively relieve overfitting, label noise memorization, and catastrophic forgetting at negligible costs.
arXiv Detail & Related papers (2020-11-12T06:06:33Z) - Adversarial vs behavioural-based defensive AI with joint, continual and active learning: automated evaluation of robustness to deception, poisoning and concept drift [62.997667081978825]
Recent advancements in Artificial Intelligence (AI) have brought new capabilities to user and entity behavioural analytics (UEBA) for cyber-security.
In this paper, we present a solution to effectively mitigate this attack by improving the detection process and efficiently leveraging human expertise.
arXiv Detail & Related papers (2020-01-13T13:54:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.