How Do Large Language Models Learn Concepts During Continual Pre-Training?
- URL: http://arxiv.org/abs/2601.03570v1
- Date: Wed, 07 Jan 2026 04:29:15 GMT
- Title: How Do Large Language Models Learn Concepts During Continual Pre-Training?
- Authors: Barry Menglong Yao, Sha Li, Yunzhi Yao, Minqian Liu, Zaishuo Xia, Qifan Wang, Lifu Huang
- Abstract summary: We study how individual concepts are acquired and forgotten, as well as how multiple concepts interact through interference and synergy. Our findings offer a circuit-level view of concept learning dynamics and inform the design of more interpretable and robust concept-aware training strategies.
- Score: 69.99800338599
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Human beings primarily understand the world through concepts (e.g., dog), abstract mental representations that structure perception, reasoning, and learning. However, how large language models (LLMs) acquire, retain, and forget such concepts during continual pretraining remains poorly understood. In this work, we study how individual concepts are acquired and forgotten, as well as how multiple concepts interact through interference and synergy. We link these behavioral dynamics to LLMs' internal Concept Circuits, computational subgraphs associated with specific concepts, and incorporate Graph Metrics to characterize circuit structure. Our analysis reveals: (1) LLMs' concept circuits provide a non-trivial, statistically significant signal of concept learning and forgetting; (2) concept circuits exhibit a stage-wise temporal pattern during continual pretraining, with an early increase followed by a gradual decrease and stabilization; (3) concepts with larger learning gains tend to exhibit greater forgetting under subsequent training; (4) semantically similar concepts induce stronger interference than weakly related ones; (5) conceptual knowledge differs in its transferability, with some concepts significantly facilitating the learning of others. Together, our findings offer a circuit-level view of concept learning dynamics and inform the design of more interpretable and robust concept-aware training strategies for LLMs.
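The paper describes Concept Circuits as computational subgraphs characterized with graph metrics. As a rough illustration of what tracking such metrics across checkpoints could look like, here is a minimal sketch using networkx; the node names, edge weights, and choice of metrics are all illustrative assumptions, not the paper's actual extraction pipeline.

```python
# Minimal sketch: summarizing a hypothetical "concept circuit" with
# simple graph metrics (networkx). All names/weights are made up for
# illustration; the paper's circuit-extraction method is not shown here.
import networkx as nx

# Hypothetical circuit for the concept "dog": weighted edges between
# model components (attention heads, MLP neurons) tied to the concept.
circuit_edges = [
    ("attn_3.head_5", "mlp_4.n_1021", 0.62),
    ("attn_3.head_5", "attn_7.head_2", 0.41),
    ("mlp_4.n_1021", "mlp_9.n_88", 0.35),
    ("attn_7.head_2", "mlp_9.n_88", 0.27),
]

G = nx.DiGraph()
G.add_weighted_edges_from(circuit_edges)

# Structural descriptors one might log per pre-training checkpoint to
# observe the early-increase-then-stabilize pattern the abstract reports.
metrics = {
    "num_nodes": G.number_of_nodes(),
    "num_edges": G.number_of_edges(),
    "density": nx.density(G),
    "avg_clustering": nx.average_clustering(G.to_undirected()),
    "total_edge_weight": sum(w for _, _, w in G.edges.data("weight")),
}
print(metrics)
```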
Related papers
- Forget Less by Learning Together through Concept Consolidation [6.121904567143191]
Custom Diffusion Models (CDMs) have gained significant attention due to their remarkable ability to personalize generative processes. Existing CDMs suffer from catastrophic forgetting when continuously learning new concepts. We propose Forget Less by Learning Together (FL2T) that enables concurrent and order-agnostic concept learning.
arXiv Detail & Related papers (2026-01-05T10:14:16Z) - FaCT: Faithful Concept Traces for Explaining Neural Network Decisions [56.796533084868884]
Deep networks have shown remarkable performance across a wide range of tasks, yet a global, concept-level understanding of how they function remains a key challenge. We put emphasis on the faithfulness of concept-based explanations and propose a new model with model-inherent, mechanistic concept explanations. Our concepts are shared across classes and, from any layer, their contribution to the logit and their input visualizations can be faithfully traced.
arXiv Detail & Related papers (2025-10-29T13:35:46Z) - Neuro-Symbolic Concepts [72.94541757514396]
This article presents a concept-centric paradigm for building agents that can learn continually and reason flexibly. The concept-centric agent utilizes a vocabulary of neuro-symbolic concepts. This framework offers several advantages, including data efficiency, compositional generalization, continual learning, and zero-shot transfer.
arXiv Detail & Related papers (2025-05-09T17:02:51Z) - Walking the Web of Concept-Class Relationships in Incrementally Trained Interpretable Models [25.84386438333865]
We show that concepts and classes form a complex web of relationships, which is susceptible to degradation and needs to be preserved and augmented across experiences. We propose a novel method - MuCIL - that uses multimodal concepts to perform classification without increasing the number of trainable parameters across experiences.
arXiv Detail & Related papers (2025-02-27T18:59:29Z) - Sample-efficient Learning of Concepts with Theoretical Guarantees: from Data to Concepts without Interventions [13.877511370053794]
Concept Bottleneck Models (CBMs) address some of these challenges by learning interpretable concepts from high-dimensional data. We describe a framework that provides theoretical guarantees on the correctness of the learned concepts and on the number of required labels. We evaluate our framework on synthetic and image benchmarks, showing that the learned concepts have fewer impurities and are often more accurate than those of other CBMs.
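For readers unfamiliar with CBMs, the core idea is that every class prediction must pass through a layer of human-interpretable concept scores. Below is a minimal, generic PyTorch sketch; it is an illustrative assumption, not the cited paper's architecture or its label-efficiency machinery.

```python
# Generic concept-bottleneck sketch (illustrative; not the cited paper's model).
import torch
import torch.nn as nn

class ConceptBottleneckModel(nn.Module):
    def __init__(self, input_dim: int, num_concepts: int, num_classes: int):
        super().__init__()
        # Input -> concept scores: the interpretable bottleneck.
        self.concept_head = nn.Sequential(
            nn.Linear(input_dim, 128),
            nn.ReLU(),
            nn.Linear(128, num_concepts),
        )
        # Concept scores -> class logits: every decision is mediated
        # by the named concepts.
        self.label_head = nn.Linear(num_concepts, num_classes)

    def forward(self, x: torch.Tensor):
        concept_logits = self.concept_head(x)
        class_logits = self.label_head(torch.sigmoid(concept_logits))
        return concept_logits, class_logits

model = ConceptBottleneckModel(input_dim=512, num_concepts=10, num_classes=5)
concept_logits, class_logits = model(torch.randn(4, 512))
print(concept_logits.shape, class_logits.shape)  # torch.Size([4, 10]) torch.Size([4, 5])
```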
arXiv Detail & Related papers (2025-02-10T15:01:56Z) - Revealing emergent human-like conceptual representations from language prediction [90.73285317321312]
Large language models (LLMs) trained solely through next-token prediction on text exhibit strikingly human-like behaviors. Are these models developing concepts akin to those of humans? We found that LLMs can flexibly derive concepts from linguistic descriptions in relation to contextual cues about other concepts.
arXiv Detail & Related papers (2025-01-21T23:54:17Z) - Interpreting Pretrained Language Models via Concept Bottlenecks [55.47515772358389]
Pretrained language models (PLMs) have made significant strides in various natural language processing tasks.
The lack of interpretability due to their "black-box" nature poses challenges for responsible implementation.
We propose a novel approach to interpreting PLMs by employing high-level, meaningful concepts that are easily understandable for humans.
arXiv Detail & Related papers (2023-11-08T20:41:18Z) - Towards Concept-Aware Large Language Models [56.48016300758356]
Concepts play a pivotal role in various human cognitive functions, including learning, reasoning and communication.
There is very little work on endowing machines with the ability to form and reason with concepts.
In this work, we analyze how well contemporary large language models (LLMs) capture human concepts and their structure.
arXiv Detail & Related papers (2023-11-03T12:19:22Z) - Concept Representation Learning with Contrastive Self-Supervised Learning [0.6091702876917281]
Concept-oriented deep learning (CODL) is a general approach to meeting future challenges in deep learning.
We discuss major aspects of concept representation learning using Contrastive Self-Supervised Learning (CSSL); a generic contrastive-loss sketch follows this entry.
arXiv Detail & Related papers (2021-12-10T17:16:23Z)
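As a point of reference for the CSSL entry above, here is a minimal InfoNCE-style contrastive loss of the kind commonly used in contrastive self-supervised learning; this is a generic sketch, not the cited paper's exact objective.

```python
# Generic InfoNCE-style contrastive loss (illustrative sketch).
import torch
import torch.nn.functional as F

def info_nce_loss(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.1):
    """z1, z2: (batch, dim) embeddings of two augmented views of the same items."""
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature   # pairwise cosine similarities
    targets = torch.arange(z1.size(0))   # matching pairs sit on the diagonal
    return F.cross_entropy(logits, targets)

z1, z2 = torch.randn(8, 64), torch.randn(8, 64)
print(info_nce_loss(z1, z2))
```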