Strategic resource allocation in memory encoding: An efficiency principle shaping language processing
- URL: http://arxiv.org/abs/2503.14728v2
- Date: Fri, 29 Aug 2025 17:15:35 GMT
- Title: Strategic resource allocation in memory encoding: An efficiency principle shaping language processing
- Authors: Weijie Xu, Richard Futrell
- Abstract summary: We propose Strategic Resource Allocation as an efficiency principle for memory encoding in sentence processing. From a resource-rational perspective, we argue that SRA is the principled solution to a computational problem posed by two functional assumptions about working memory. One of the critical consequences of SRA is that surprising inputs are encoded with enhanced representations, and therefore are less susceptible to memory decay and interference.
- Score: 6.307485015636125
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: How is the limited capacity of working memory efficiently used to support human linguistic behaviors? In this paper, we propose Strategic Resource Allocation (SRA) as an efficiency principle for memory encoding in sentence processing. The idea is that working memory resources are dynamically and strategically allocated to prioritize novel and unexpected information. From a resource-rational perspective, we argue that SRA is the principled solution to a computational problem posed by two functional assumptions about working memory, namely its limited capacity and its noisy representation. Specifically, working memory needs to minimize the retrieval error of past inputs under the constraint of limited memory resources, an optimization problem whose solution is to allocate more resources to encode more surprising inputs with higher precision. One of the critical consequences of SRA is that surprising inputs are encoded with enhanced representations, and therefore are less susceptible to memory decay and interference. Empirically, through naturalistic corpus data, we find converging evidence for SRA in the context of dependency locality from both production and comprehension, where non-local dependencies with less predictable antecedents are associated with a reduced locality effect. However, our results also reveal considerable cross-linguistic variability, suggesting the need for a closer examination of how SRA, as a domain-general memory efficiency principle, interacts with language-specific phrase structures. SRA highlights the critical role of representational uncertainty in understanding memory encoding. It also reimagines the effects of surprisal and entropy on processing difficulty from the perspective of efficient memory encoding.
Related papers
- Understanding LoRA as Knowledge Memory: An Empirical Analysis [20.53732426953178]
This work investigates a parametric approach using Low-Rank Adaptation (LoRA) as a modular knowledge memory. We bridge this gap through the first systematic empirical study mapping the design space of LoRA-based memory. Our findings position LoRA as the complementary axis of memory alongside RAG and ICL, offering distinct advantages.
arXiv Detail & Related papers (2026-03-01T13:28:57Z) - HyMem: Hybrid Memory Architecture with Dynamic Retrieval Scheduling [7.24393498822329]
HyMem is a hybrid memory architecture that enables dynamic on-demand scheduling through multi-granular memory representations. We show that HyMem achieves strong performance on both the LOCOMO and LongMemEval benchmarks, outperforming full-context while reducing computational cost by 92.6%.
arXiv Detail & Related papers (2026-02-15T00:06:19Z) - Beyond Heuristics: A Decision-Theoretic Framework for Agent Memory Management [49.71055327567513]
We argue that memory management should be viewed as a sequential decision-making problem under uncertainty. Our contribution is not a new algorithm, but a principled reframing that clarifies the limitations of approaches.
arXiv Detail & Related papers (2025-12-25T08:23:03Z) - MemOS: A Memory OS for AI System [116.87568350346537]
Large Language Models (LLMs) have become an essential infrastructure for Artificial General Intelligence (AGI). Existing models mainly rely on static parameters and short-lived contextual states, limiting their ability to track user preferences or update knowledge over extended periods. MemOS is a memory operating system that treats memory as a manageable system resource.
arXiv Detail & Related papers (2025-07-04T17:21:46Z) - Quantifying Memory Utilization with Effective State-Size [73.52115209375343]
We develop a measure of memory utilization.
This metric is tailored to the fundamental class of systems with input-invariant and input-varying linear operators.
arXiv Detail & Related papers (2025-04-28T08:12:30Z) - Cognitive Memory in Large Language Models [8.059261857307881]
This paper examines memory mechanisms in Large Language Models (LLMs), emphasizing their importance for context-rich responses, reduced hallucinations, and improved efficiency. It categorizes memory into sensory, short-term, and long-term, with sensory memory corresponding to input prompts, short-term memory processing immediate context, and long-term memory implemented via external databases or structures.
arXiv Detail & Related papers (2025-04-03T09:58:19Z) - COSMOS: A Hybrid Adaptive Optimizer for Memory-Efficient Training of LLMs [77.79640601822341]
Large Language Models (LLMs) have demonstrated remarkable success across various domains. Their optimization remains a significant challenge due to the complex and high-dimensional loss landscapes they inhabit.
arXiv Detail & Related papers (2025-02-24T18:42:19Z) - Structured Token Retention and Computational Memory Paths in Large Language Models [0.0]
This paper introduces a probabilistic selection framework that dynamically adjusts token persistence based on contextual significance. It is extended through hierarchical memory allocation, refining retention efficiency through structured reallocation of token embeddings. The integration of STR and CMP into an open-source model illustrates the adaptability of structured memory retention methodologies.
arXiv Detail & Related papers (2025-02-05T11:59:22Z) - Autonomous Structural Memory Manipulation for Large Language Models Using Hierarchical Embedding Augmentation [0.0]
This study introduces hierarchical embedding augmentation as a means to redefine the representation of tokens through multi-level semantic structures. Results reveal substantial improvements in computational efficiency, with marked reductions in processing overhead for longer input sequences. The ability to dynamically adjust token representations and memory configurations contributed to the model's robustness under varied and unpredictable input conditions.
arXiv Detail & Related papers (2025-01-23T22:20:36Z) - Architectural Fusion Through Contextual Partitioning in Large Language Models: A Novel Approach to Parameterized Knowledge Integration [0.0]
This paper introduces an innovative approach to enhancing the architectural design of large-scale computational models through the dynamic segmentation of parameters into context-aware regions. Experimental evaluations demonstrate substantial improvements in accuracy, perplexity, and contextual coherence across a variety of linguistic tasks. The findings collectively demonstrate the potential for Contextual Partitioning to redefine the scalability and adaptability of computational language architectures in diverse and complex domains.
arXiv Detail & Related papers (2025-01-22T14:21:04Z) - Goal-oriented Communications based on Recursive Early Exit Neural Networks [14.538977446476684]
We introduce an innovative early exit strategy that dynamically partitions computations. We develop a Reinforcement Learning-based online optimization framework that jointly determines early exit points, computation splitting, and offloading strategies. Numerical evaluations in an edge inference scenario demonstrate the method's adaptability and effectiveness in striking an excellent trade-off between performance, latency, and resource efficiency.
arXiv Detail & Related papers (2024-12-27T11:14:11Z) - CSR:Achieving 1 Bit Key-Value Cache via Sparse Representation [63.65323577445951]
We propose a novel approach called Cache Sparse Representation (CSR). CSR transforms the dense Key-Value cache tensor into sparse indexes and weights, offering a more memory-efficient representation during LLM inference. Our experiments demonstrate CSR achieves performance comparable to state-of-the-art KV cache quantization algorithms.
arXiv Detail & Related papers (2024-12-16T13:01:53Z) - Memory-Driven Metaheuristics: Improving Optimization Performance [0.0]
This chapter explores the significance of memory in metaheuristic algorithms. The key factors influencing the effectiveness of memory mechanisms are discussed. A comprehensive analysis of how memory mechanisms are incorporated into popular metaheuristic algorithms is presented.
arXiv Detail & Related papers (2024-11-07T13:27:03Z) - SMILE: Speech Meta In-Context Learning for Low-Resource Language Automatic Speech Recognition [55.2480439325792]
Speech Meta In-Context LEarning (SMILE) is an innovative framework that combines meta-learning with speech in-context learning (SICL). We show that SMILE consistently outperforms baseline methods in training-free few-shot multilingual ASR tasks.
arXiv Detail & Related papers (2024-09-16T16:04:16Z) - SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning [63.93193829913252]
We propose an innovative METL strategy called SHERL for resource-limited scenarios.
In the early route, intermediate outputs are consolidated via an anti-redundancy operation.
In the late route, utilizing minimal late pre-trained layers could alleviate the peak demand on memory overhead.
arXiv Detail & Related papers (2024-07-10T10:22:35Z) - Boosting Inference Efficiency: Unleashing the Power of Parameter-Shared Pre-trained Language Models [109.06052781040916]
We introduce a technique to enhance the inference efficiency of parameter-shared language models.
We also propose a simple pre-training technique that leads to fully or partially shared models.
Results demonstrate the effectiveness of our methods on both autoregressive and autoencoding PLMs.
arXiv Detail & Related papers (2023-10-19T15:13:58Z) - In-context Autoencoder for Context Compression in a Large Language Model [70.7621953091318]
We propose the In-context Autoencoder (ICAE) to compress a long context into short compact memory slots.
ICAE is first pretrained using both autoencoding and language modeling objectives on massive text data.
arXiv Detail & Related papers (2023-07-13T17:59:21Z) - Compressed Regression over Adaptive Networks [58.79251288443156]
We derive the performance achievable by a network of distributed agents that solve, adaptively and in the presence of communication constraints, a regression problem.
We devise an optimized allocation strategy where the parameters necessary for the optimization can be learned online by the agents.
arXiv Detail & Related papers (2023-04-07T13:41:08Z) - Performance Optimization for Semantic Communications: An Attention-based Reinforcement Learning Approach [187.4094332217186]
A semantic communication framework is proposed for textual data transmission.
A metric of semantic similarity (MSS) that jointly captures the semantic accuracy and completeness of the recovered text is proposed.
arXiv Detail & Related papers (2022-08-17T11:39:16Z) - Representation Learning for Resource-Constrained Keyphrase Generation [78.02577815973764]
We introduce salient span recovery and salient span prediction as guided denoising language modeling objectives.
We show the effectiveness of the proposed approach for low-resource and zero-shot keyphrase generation.
arXiv Detail & Related papers (2022-03-15T17:48:04Z) - Deep Learning-based Resource Allocation For Device-to-Device Communication [66.74874646973593]
We propose a framework for the optimization of the resource allocation in multi-channel cellular systems with device-to-device (D2D) communication.
A deep learning (DL) framework is proposed, where the optimal resource allocation strategy for arbitrary channel conditions is approximated by deep neural network (DNN) models.
Our simulation results confirm that near-optimal performance can be attained with low computation time, which underlines the real-time capability of the proposed scheme.
arXiv Detail & Related papers (2020-11-25T14:19:23Z) - Resource Allocation via Model-Free Deep Learning in Free Space Optical Communications [119.81868223344173]
The paper investigates the general problem of resource allocation for mitigating channel fading effects in Free Space Optical (FSO) communications.
Under this framework, we propose two algorithms that solve FSO resource allocation problems.
arXiv Detail & Related papers (2020-07-27T17:38:51Z) - Deep Learning-based Resource Allocation for Infrastructure Resilience [0.5249805590164901]
Decision-makers can use our trained models to allocate resources more efficiently after contingencies.
We showcase our methodology using the real-world interdependent infrastructure of Shelby County, TN.
arXiv Detail & Related papers (2020-07-12T00:48:15Z)