Unraveling the cognitive patterns of Large Language Models through module communities
- URL: http://arxiv.org/abs/2508.18192v1
- Date: Mon, 25 Aug 2025 16:49:38 GMT
- Title: Unraveling the cognitive patterns of Large Language Models through module communities
- Authors: Kushal Raj Bhandari, Pin-Yu Chen, Jianxi Gao
- Abstract summary: Large Language Models (LLMs) have reshaped our world with significant advancements in science, engineering, and society. Despite their ubiquity and utility, the underlying mechanisms of LLMs remain concealed within billions of parameters and complex structures. We address this gap by adopting approaches used to understand emerging cognition in biology.
- Score: 45.399985422756224
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) have reshaped our world with significant advancements in science, engineering, and society through applications ranging from scientific discovery and medical diagnostics to chatbots. Despite their ubiquity and utility, the underlying mechanisms of LLMs remain concealed within billions of parameters and complex structures, making their inner architecture and cognitive processes challenging to comprehend. We address this gap by adopting approaches used to understand emerging cognition in biology and by developing a network-based framework that links cognitive skills, LLM architectures, and datasets, ushering in a paradigm shift in foundation model analysis. The skill distribution in the module communities demonstrates that while LLMs do not strictly parallel the focalized specialization observed in specific biological systems, they exhibit unique communities of modules whose emergent skill patterns partially mirror the distributed yet interconnected cognitive organization seen in avian and small mammalian brains. Our numerical results highlight a key divergence between biological systems and LLMs: in LLMs, skill acquisition benefits substantially from dynamic, cross-regional interactions and neural plasticity. By integrating cognitive science principles with machine learning, our framework provides new insights into LLM interpretability and suggests that effective fine-tuning strategies should leverage distributed learning dynamics rather than rigid modular interventions.
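The framework is only summarized above, so the following is a minimal, hypothetical sketch of the general idea: represent LLM modules as nodes in an interaction graph, detect module communities, and examine how a skill-relevance score distributes across them. The graph, the module names, and the skill scores below are all illustrative assumptions, not the authors' data or pipeline.

```python
# Minimal sketch of module-community analysis (illustrative only; not the
# authors' pipeline). Assumes a hypothetical module-interaction graph and
# hypothetical per-module "skill relevance" scores.
import networkx as nx
from networkx.algorithms.community import louvain_communities

# Nodes are LLM modules (e.g., attention heads or MLP blocks); edge weights
# reflect how strongly two modules co-activate on the same inputs.
G = nx.Graph()
G.add_weighted_edges_from([
    ("attn.0", "mlp.0", 0.9),
    ("attn.0", "attn.1", 0.7),
    ("mlp.0", "mlp.1", 0.8),
    ("attn.2", "mlp.2", 0.6),
    ("attn.1", "attn.2", 0.2),  # weak cross-community link
])

# Detect communities of modules with the Louvain method.
communities = louvain_communities(G, weight="weight", seed=0)

# Hypothetical skill scores: how much each module contributes to one skill
# (e.g., as measured by ablation or probing).
skill_score = {"attn.0": 0.8, "attn.1": 0.6, "mlp.0": 0.7,
               "mlp.1": 0.5, "attn.2": 0.1, "mlp.2": 0.2}

# Summarize how the skill distributes across the detected communities.
for i, comm in enumerate(communities):
    avg = sum(skill_score[m] for m in comm) / len(comm)
    print(f"community {i}: modules={sorted(comm)}, mean skill score={avg:.2f}")
```

A distributed skill, in this toy reading, would show comparable mean scores across several communities rather than concentrating in one.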
Related papers
- Large language models for spreading dynamics in complex systems [15.581915022853337]
Spreading dynamics is a central topic in the physics of complex systems and network science. Large language models (LLMs) have exhibited strong capabilities in natural language understanding, reasoning, and generation. LLMs can act as interactive agents embedded in propagation systems, potentially influencing spreading pathways and feedback structures.
arXiv Detail & Related papers (2026-02-08T18:58:43Z) - Large Language Models as Model Organisms for Human Associative Learning [9.196745903193609]
We adapt a cognitive neuroscience associative learning paradigm and investigate how representations evolve across six models. Our initial findings reveal a non-monotonic pattern consistent with the Non-Monotonic Plasticity Hypothesis. We find that higher vocabulary interference amplifies differentiation, suggesting that representational change is influenced by both item similarity and global competition.
arXiv Detail & Related papers (2025-10-24T12:52:11Z) - Fundamentals of Building Autonomous LLM Agents [64.39018305018904]
This paper reviews the architecture and implementation methods of agents powered by large language models (LLMs). The research aims to explore patterns to develop "agentic" LLMs that can automate complex tasks and bridge the performance gap with human capabilities.
arXiv Detail & Related papers (2025-10-10T10:32:39Z) - Lilith: Developmental Modular LLMs with Chemical Signaling [49.1574468325115]
Current paradigms in Artificial Intelligence rely on layers of feedforward networks which model brain activity at the neuronal level. We propose LILITH, a novel architecture that combines developmental training of modular language models with brain-inspired token-based communication protocols.
arXiv Detail & Related papers (2025-07-06T23:18:51Z) - Unveiling Knowledge Utilization Mechanisms in LLM-based Retrieval-Augmented Generation [77.10390725623125]
Retrieval-augmented generation (RAG) is widely employed to expand the knowledge scope of LLMs. Since RAG has shown promise in knowledge-intensive tasks like open-domain question answering, its broader application to complex tasks and intelligent assistants has further advanced its utility. We present a systematic investigation of the intrinsic mechanisms by which RAG systems integrate internal (parametric) and external (retrieved) knowledge.
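For readers unfamiliar with the RAG pattern this entry investigates, here is a minimal, generic sketch of how external (retrieved) knowledge is combined with a model's internal (parametric) knowledge. The `retrieve` and `generate` functions are toy stand-ins, not this paper's instrumented setup.

```python
# Generic RAG pattern (background sketch, not the paper's setup).
def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    # Toy lexical retriever: rank passages by word overlap with the query.
    q = set(query.lower().split())
    scored = sorted(corpus, key=lambda p: -len(q & set(p.lower().split())))
    return scored[:k]

def generate(prompt: str) -> str:
    # Stand-in for an LLM call; a real system would query a model here.
    return f"<answer conditioned on: {prompt[:60]}...>"

def rag_answer(query: str, corpus: list[str]) -> str:
    # Retrieved passages are prepended to the query, so the model must
    # integrate them with whatever it already encodes in its parameters.
    passages = retrieve(query, corpus)
    prompt = "\n".join(passages) + f"\n\nQuestion: {query}"
    return generate(prompt)

corpus = ["Paris is the capital of France.", "The Nile flows through Egypt."]
print(rag_answer("What is the capital of France?", corpus))
```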
arXiv Detail & Related papers (2025-05-17T13:13:13Z) - Modular Machine Learning: An Indispensable Path towards New-Generation Large Language Models [45.05285463251872]
We introduce a novel learning paradigm -- Modular Machine Learning (MML) -- as an essential approach toward new-generation large language models (LLMs). MML decomposes the complex structure of LLMs into three interdependent components: modular representation, modular model, and modular reasoning. We present a feasible implementation of MML-based LLMs by leveraging advanced techniques such as disentangled representation learning, neural architecture search, and neuro-symbolic learning.
arXiv Detail & Related papers (2025-04-28T17:42:02Z) - Cognitive LLMs: Towards Integrating Cognitive Architectures and Large Language Models for Manufacturing Decision-making [51.737762570776006]
LLM-ACTR is a novel neuro-symbolic architecture that provides human-aligned and versatile decision-making.
Our framework extracts and embeds knowledge of ACT-R's internal decision-making process as latent neural representations.
Our experiments on novel Design for Manufacturing tasks show both improved task performance and improved grounded decision-making capability.
arXiv Detail & Related papers (2024-08-17T11:49:53Z) - Interactive Continual Learning: Fast and Slow Thinking [19.253164551254734]
This paper presents a novel Interactive Continual Learning framework, enabled by collaborative interactions among models of various sizes.
To improve memory retrieval in System1, we introduce the CL-vMF mechanism, based on the von Mises-Fisher (vMF) distribution.
Comprehensive evaluation of our proposed ICL demonstrates significant resistance to forgetting and superior performance relative to existing methods.
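As one hedged reading of the CL-vMF idea mentioned above (the paper's exact mechanism may differ), memory retrieval can be scored with the von Mises-Fisher log-likelihood, which for unit vectors reduces to a concentration parameter kappa times the cosine similarity. Everything below is an illustrative sketch under that assumption.

```python
# Sketch of vMF-style memory scoring (an illustrative reading of "CL-vMF",
# not necessarily the paper's exact mechanism). Keys and queries live on
# the unit sphere; the unnormalized vMF log-likelihood is kappa * <mu, x>.
import numpy as np

def vmf_scores(query: np.ndarray, keys: np.ndarray, kappa: float = 10.0):
    """Return softmax retrieval weights from vMF log-likelihoods."""
    q = query / np.linalg.norm(query)
    K = keys / np.linalg.norm(keys, axis=1, keepdims=True)
    logits = kappa * (K @ q)   # kappa * cosine similarity
    logits -= logits.max()     # numerical stability
    w = np.exp(logits)
    return w / w.sum()

rng = np.random.default_rng(0)
memory_keys = rng.normal(size=(5, 8))  # 5 stored memory directions
query = rng.normal(size=8)
print(vmf_scores(query, memory_keys))  # retrieval weights over memories
```

Larger kappa concentrates the weights on the closest stored memory; smaller kappa retrieves more diffusely.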
arXiv Detail & Related papers (2024-03-05T03:37:28Z) - Synergistic Integration of Large Language Models and Cognitive Architectures for Robust AI: An Exploratory Analysis [12.9222727028798]
This paper explores the integration of two AI subdisciplines employed in the development of artificial agents that exhibit intelligent behavior: Large Language Models (LLMs) and Cognitive Architectures (CAs).
We present three integration approaches, each grounded in theoretical models and supported by preliminary empirical evidence.
These approaches aim to harness the strengths of both LLMs and CAs, while mitigating their weaknesses, thereby advancing the development of more robust AI systems.
arXiv Detail & Related papers (2023-08-18T21:42:47Z) - Self-organizing Democratized Learning: Towards Large-scale Distributed Learning Systems [71.14339738190202]
Democratized learning (Dem-AI) lays out a holistic philosophy with underlying principles for building large-scale distributed and democratized machine learning systems.
Inspired by the Dem-AI philosophy, a novel distributed learning approach is proposed in this paper.
The proposed algorithms achieve better generalization performance for the agents' learning models than conventional federated learning (FL) algorithms (a minimal sketch of that FedAvg baseline follows this list).
arXiv Detail & Related papers (2020-07-07T08:34:48Z)
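For context on the conventional federated learning baseline that the last entry compares against, here is a minimal FedAvg sketch. It is a generic illustration of that baseline, not the Dem-AI algorithm itself; the local least-squares objective and all parameters are assumptions.

```python
# Minimal FedAvg sketch for the "conventional FL" baseline mentioned above
# (illustrative; the Dem-AI paper proposes a hierarchical, self-organizing
# alternative). Models here are plain parameter vectors.
import numpy as np

def local_update(w, data, lr=0.1, steps=5):
    # Toy local objective per agent: least squares toward local targets.
    X, y = data
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def fedavg_round(w_global, agent_data):
    # Each agent trains locally; the server averages, weighted by data size.
    local_models = [local_update(w_global.copy(), d) for d in agent_data]
    sizes = np.array([len(d[1]) for d in agent_data], dtype=float)
    return np.average(local_models, axis=0, weights=sizes)

rng = np.random.default_rng(0)
agent_data = [(rng.normal(size=(20, 3)), rng.normal(size=20)) for _ in range(4)]
w = np.zeros(3)
for _ in range(10):
    w = fedavg_round(w, agent_data)
print("global model after 10 rounds:", w)
```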
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it presents (including all information) and is not responsible for any consequences of its use.