PaceLLM: Brain-Inspired Large Language Models for Long-Context Understanding
- URL: http://arxiv.org/abs/2506.17310v1
- Date: Wed, 18 Jun 2025 09:17:06 GMT
- Title: PaceLLM: Brain-Inspired Large Language Models for Long-Context Understanding
- Authors: Kangcong Li, Peng Ye, Chongjun Tu, Lin Zhang, Chunfeng Song, Jiamin Wu, Tao Yang, Qihao Zheng, Tao Chen,
- Abstract summary: We propose PaceLLM, featuring two innovations: (1) a Persistent Activity (PA) Mechanism that mimics prefrontal cortex (PFC) neurons' persistent firing by introducing an activation-level memory bank to dynamically retrieve, reuse, and update critical FFN states, addressing contextual decay; and (2) Cortical Expert (CE) Clustering that emulates task-adaptive neural specialization to reorganize FFN weights into semantic modules, establishing cross-token dependencies and mitigating fragmentation.
- Score: 20.849307413516183
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While Large Language Models (LLMs) demonstrate strong performance across domains, their long-context capabilities are limited by transient neural activations causing information decay and unstructured feed-forward network (FFN) weights leading to semantic fragmentation. Inspired by the brain's working memory and cortical modularity, we propose PaceLLM, featuring two innovations: (1) a Persistent Activity (PA) Mechanism that mimics prefrontal cortex (PFC) neurons' persistent firing by introducing an activation-level memory bank to dynamically retrieve, reuse, and update critical FFN states, addressing contextual decay; and (2) Cortical Expert (CE) Clustering that emulates task-adaptive neural specialization to reorganize FFN weights into semantic modules, establishing cross-token dependencies and mitigating fragmentation. Extensive evaluations show that PaceLLM achieves 6% improvement on LongBench's Multi-document QA and 12.5-17.5% performance gains on Infinite-Bench tasks, while extending measurable context length to 200K tokens in Needle-In-A-Haystack (NIAH) tests. This work pioneers brain-inspired LLM optimization and is complementary to other works. Besides, it can be generalized to any model and enhance their long-context performance and interpretability without structural overhauls.
Related papers
- NeuroMambaLLM: Dynamic Graph Learning of fMRI Functional Connectivity in Autistic Brains Using Mamba and Language Model Reasoning [0.0]
We propose NeuroMambaLLM, an end-to-end framework that integrates dynamic latent graph learning and selective state-space temporal modelling with Large Language Models (LLMs)<n>The proposed method learns the functional connectivity dynamically from raw Blood-Oxygen-Level-Dependent (BOLD) time series, replacing fixed correlation graphs with adaptive latent connectivity while suppressing motion-related artifacts and capturing long-range temporal dependencies.<n>This design enables the LLM to perform both diagnostic classification and language-based reasoning, allowing it to analyze dynamic fMRI patterns and generate clinically meaningful textual reports.
arXiv Detail & Related papers (2026-02-14T13:32:59Z) - CAM: A Constructivist View of Agentic Memory for LLM-Based Reading Comprehension [55.29309306566238]
Current Large Language Models (LLMs) are confronted with overwhelming information volume when comprehending long-form documents.<n>This challenge raises the imperative of a cohesive memory module, which can elevate vanilla LLMs into autonomous reading agents.<n>We draw inspiration from Jean Piaget's Constructivist Theory, illuminating three traits of the agentic memory -- structured schemata, flexible assimilation, and dynamic accommodation.
arXiv Detail & Related papers (2025-10-07T02:16:30Z) - CCF: A Context Compression Framework for Efficient Long-Sequence Language Modeling [52.05149789178508]
CCF is a novel context compression framework designed to enable efficient long-context modeling.<n>CCF integrates segment-wise semantic aggregation with key-value memory encoding, forming compact representations.<n> Empirical results on multiple long-context language modeling benchmarks demonstrate that CCF achieves competitive perplexity under high compression ratios.
arXiv Detail & Related papers (2025-09-11T07:13:49Z) - Beyond Fixed: Variable-Length Denoising for Diffusion Large Language Models [74.15250326312179]
Diffusion Large Language Models offer efficient parallel generation and capable global modeling.<n>The dominant application ofDLLMs is hindered by the need for a statically predefined generation length.<n>We introduce DAEDAL, a novel training-free denoising strategy that enables Dynamic Adaptive Length Expansion.
arXiv Detail & Related papers (2025-08-01T17:56:07Z) - MEM1: Learning to Synergize Memory and Reasoning for Efficient Long-Horizon Agents [84.62985963113245]
We introduce MEM1, an end-to-end reinforcement learning framework that enables agents to operate with constant memory across long multi-turn tasks.<n>At each turn, MEM1 updates a compact shared internal state that jointly supports memory consolidation and reasoning.<n>We show that MEM1-7B improves performance by 3.5x while reducing memory usage by 3.7x compared to Qwen2.5-14B-Instruct on a 16-objective multi-hop QA task.
arXiv Detail & Related papers (2025-06-18T19:44:46Z) - Activation-Guided Consensus Merging for Large Language Models [25.68958388022476]
We present textbfActivation-Guided textbfConsensus textbfMerging (textbfACM), a plug-and-play merging framework that determines layer-specific merging coefficients.<n>Experiments on Long-to-Short (L2S) and general merging tasks demonstrate that ACM consistently outperforms all baseline methods.
arXiv Detail & Related papers (2025-05-20T07:04:01Z) - Core Context Aware Transformers for Long Context Language Modeling [50.774702091154204]
We propose a plug-and-play Core Context Aware (CCA) Attention for efficient long-context modeling.<n>Our method automatically focuses and strengthens core context while diminishing redundancy during the learning process.<n>Our method is able to replace the self-attention module in existing Large Language Models with minimal fine-tuning cost.
arXiv Detail & Related papers (2024-12-17T01:54:08Z) - InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions [104.90258030688256]
This project introduces disentangled streaming perception, reasoning, and memory mechanisms, enabling real-time interaction with streaming video and audio input.<n>This project simulates human-like cognition, enabling multimodal large language models to provide continuous and adaptive service over time.
arXiv Detail & Related papers (2024-12-12T18:58:30Z) - Birdie: Advancing State Space Models with Reward-Driven Objectives and Curricula [23.071384759427072]
State space models (SSMs) offer advantages over Transformers but struggle with tasks requiring long-range in-context retrieval-like text copying, associative recall, and question answering over long contexts.<n>We propose a novel training procedure, Birdie, that significantly enhances the in-context retrieval capabilities of SSMs without altering their architecture.
arXiv Detail & Related papers (2024-11-01T21:01:13Z) - FactorLLM: Factorizing Knowledge via Mixture of Experts for Large Language Models [50.331708897857574]
We introduce FactorLLM, a novel approach that decomposes well-trained dense FFNs into sparse sub-networks without requiring any further modifications.
FactorLLM achieves comparable performance to the source model securing up to 85% model performance while obtaining over a 30% increase in inference speed.
arXiv Detail & Related papers (2024-08-15T16:45:16Z) - Human-like Episodic Memory for Infinite Context LLMs [13.211261438927798]
Large language models (LLMs) have shown remarkable capabilities, but still struggle with processing extensive contexts.
In this work, we introduce EM-LLM, a novel approach that integrates key aspects of human episodic memory and event cognition into LLMs.
EM-LLM organises sequences of tokens into coherent episodic events using a combination of Bayesian surprise and graph-theoretic boundary refinement.
arXiv Detail & Related papers (2024-07-12T17:34:03Z) - Entropy-Regularized Token-Level Policy Optimization for Language Agent Reinforcement [67.1393112206885]
Large Language Models (LLMs) have shown promise as intelligent agents in interactive decision-making tasks.
We introduce Entropy-Regularized Token-level Policy Optimization (ETPO), an entropy-augmented RL method tailored for optimizing LLMs at the token level.
We assess the effectiveness of ETPO within a simulated environment that models data science code generation as a series of multi-step interactive tasks.
arXiv Detail & Related papers (2024-02-09T07:45:26Z) - Incremental Training of a Recurrent Neural Network Exploiting a
Multi-Scale Dynamic Memory [79.42778415729475]
We propose a novel incrementally trained recurrent architecture targeting explicitly multi-scale learning.
We show how to extend the architecture of a simple RNN by separating its hidden state into different modules.
We discuss a training algorithm where new modules are iteratively added to the model to learn progressively longer dependencies.
arXiv Detail & Related papers (2020-06-29T08:35:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.