Related papers: Localist LLMs with Recruitment Learning

Localist LLMs with Recruitment Learning

URL: http://arxiv.org/abs/2510.17358v1
Date: Mon, 20 Oct 2025 09:58:34 GMT
Title: Localist LLMs with Recruitment Learning
Authors: Joachim Diederich,
Abstract summary: We present a novel framework for training large language models with continuously adjustable internal representations.<n>Key innovations are (1) a locality dial, that dynamically controls the degree of localization during both training and inference without requiring model retraining, and (2) an information-theoretic recruitment mechanism that adaptively allocates semantic blocks as needed.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We present a novel framework for training large language models with continuously adjustable internal representations that span the full spectrum from localist (interpretable, rule-based) to distributed (generalizable, efficient) encodings. The key innovations are (1) a locality dial, a tunable parameter that dynamically controls the degree of localization during both training and inference without requiring model retraining, (2) an information-theoretic recruitment mechanism that adaptively allocates semantic blocks as needed, eliminating the requirement for complete domain knowledge at initialization, and (3) a hierarchical recruitment framework that extends capacity allocation to entire specialized LLMs, enabling multi-granularity architectural adaptation. This is achieved through group sparsity penalties on attention mechanisms, information-theoretic anchor design, dynamic rule injection, and principled recruitment criteria based on penalized likelihood with explicit units. We provide rigorous mathematical results establishing explicit threshold conditions under which attention provably concentrates on semantically relevant blocks at stationary points, with exact bounds on attention entropy and pointer fidelity. The hierarchical recruitment mechanism provides convergence guarantees at both the block level (fine-grained, within-LLM) and the LLM level (coarse-grained, cross-domain), ensuring the system discovers semantic partitions that balance model complexity against data encoding efficiency. This framework enables practitioners to continuously interpolate between interpretable and high-performance modes while adapting architectural capacity at multiple granularities, supporting applications in regulated domains requiring both transparency and capability.

Related papers

Connecting the Dots: Training-Free Visual Grounding via Agentic Reasoning [63.109585527799005]
GroundingAgent is a visual grounding framework that operates without task-specific fine-tuning.<n>It achieves an average zero-shot grounding accuracy of 65.1 % on widely-used benchmarks.<n>It also offers strong interpretability, transparently illustrating each reasoning step.
arXiv Detail & Related papers (2025-11-24T03:11:08Z)
Progressive Localisation in Localist LLMs [0.0]
This paper demonstrates that progressive localization represents the optimal architecture for creating interpretable large language models (LLMs)<n>We investigate whether interpretability constraints can be aligned with natural semantic structure while being applied strategically across network depth.<n>We show that progressive semantic localization, combining semantic block with steep adaptive locality schedules, achieves near-baseline language modeling performance while providing interpretable attention patterns.
arXiv Detail & Related papers (2025-11-23T09:49:13Z)
Federated Attention: A Distributed Paradigm for Collaborative LLM Inference over Edge Networks [63.541114376141735]
Large language models (LLMs) are proliferating rapidly at the edge, delivering intelligent capabilities across diverse application scenarios.<n>However, their practical deployment in collaborative scenarios confronts fundamental challenges: privacy vulnerabilities, communication overhead, and computational bottlenecks.<n>We propose Federated Attention (FedAttn), which integrates the federated paradigm into the self-attention mechanism.
arXiv Detail & Related papers (2025-11-04T15:14:58Z)
Localist LLMs -- A Mathematical Framework for Dynamic Locality Control [0.0]
Key innovation is a locality dial, a tunable parameter that dynamically controls the degree of localization during both training and inference without requiring model retraining.<n>We prove that when group sparsity penalties exceed certain threshold values, the model's attention mechanisms concentrate on semantically relevant blocks, achieving low entropy and high fidelity with negligible error.
arXiv Detail & Related papers (2025-10-10T12:44:59Z)
ReaLM: Residual Quantization Bridging Knowledge Graph Embeddings and Large Language Models [18.720486146234077]
Large Language Models (LLMs) have emerged as a powerful paradigm for Knowledge Graph Completion (KGC)<n>We propose ReaLM, a novel and effective framework that bridges the gap between KG embeddings and LLM tokenization.<n>We show that ReaLM achieves state-of-the-art performance, confirming its effectiveness in aligning structured knowledge with large-scale language models.
arXiv Detail & Related papers (2025-10-10T04:36:13Z)
Diagnose, Localize, Align: A Full-Stack Framework for Reliable LLM Multi-Agent Systems under Instruction Conflicts [75.20929587906228]
Large Language Model (LLM)-powered multi-agent systems (MAS) have rapidly advanced collaborative reasoning, tool use, and role-specialized coordination in complex tasks.<n>However, reliability-critical deployment remains hindered by a systemic failure mode: hierarchical compliance under instruction conflicts.
arXiv Detail & Related papers (2025-09-27T08:43:34Z)
Semantic-Inductive Attribute Selection for Zero-Shot Learning [4.083977531653519]
We study two complementary feature-selection strategies for Zero-Shot Learning (ZSL)<n>The first adapts embedded feature selection to the demands of ZSL, turning model-driven rankings into meaningful semantic pruning.<n>The second leverages evolutionary computation to directly explore the space of attribute subsets more broadly.
arXiv Detail & Related papers (2025-09-26T15:14:36Z)
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey [104.31926740841128]
The emergence of agentic reinforcement learning (Agentic RL) marks a paradigm shift from conventional reinforcement learning applied to large language models (LLM RL)<n>This survey formalizes this conceptual shift by contrasting the degenerate single-step Markov Decision Processes (MDPs) of LLM-RL with the temporally extended, partially observable Markov decision processes (POMDPs) that define Agentic RL.
arXiv Detail & Related papers (2025-09-02T17:46:26Z)
LLM-as-classifier: Semi-Supervised, Iterative Framework for Hierarchical Text Classification using Large Language Models [0.0]
Large Language Models (LLMs) have provided unprecedented capabilities for analyzing unstructured text data.<n>Standard fine-tuning approaches can be resource-intensive and often struggle with the dynamic nature of real-world data distributions.
arXiv Detail & Related papers (2025-08-22T15:47:17Z)
DART: Dual Adaptive Refinement Transfer for Open-Vocabulary Multi-Label Recognition [59.203152078315235]
Open-Vocabulary Multi-Label Recognition (OV-MLR) aims to identify multiple seen and unseen object categories within an image.<n> Vision-Language Pre-training models offer a strong open-vocabulary foundation, but struggle with fine-grained localization under weak supervision.<n>We propose the Dual Adaptive Refinement Transfer (DART) framework to overcome these limitations.
arXiv Detail & Related papers (2025-08-07T17:22:33Z)
Deformable Cluster Manipulation via Whole-Arm Policy Learning [27.54191389134963]
We propose a novel framework for learning model-free policies integrating two modalities: 3D point clouds and proprioceptive touch indicators.<n>Our reinforcement learning framework leverages a distributional state representation, aided by kernel mean embeddings, to achieve improved training efficiency and real-time inference.<n>We deploy the framework in a power line clearance scenario and observe that the agent generates creative strategies leveraging multiple arm links for de-occlusion.
arXiv Detail & Related papers (2025-07-22T23:58:30Z)
Benchmarking LLMs' Swarm intelligence [51.648605206159125]
Large Language Models (LLMs) show potential for complex reasoning, yet their capacity for emergent coordination in Multi-Agent Systems (MAS) remains largely unexplored.<n>We introduce SwarmBench, a novel benchmark designed to systematically evaluate tasks of LLMs acting as decentralized agents.<n>We propose metrics for coordination effectiveness and analyze emergent group dynamics.
arXiv Detail & Related papers (2025-05-07T12:32:01Z)
Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning [114.36124979578896]
We design a dynamic mechanism using offline reinforcement learning algorithms. Our algorithm is based on the pessimism principle and only requires a mild assumption on the coverage of the offline data set.
arXiv Detail & Related papers (2022-05-05T05:44:26Z)
Edge-assisted Democratized Learning Towards Federated Analytics [67.44078999945722]
We show the hierarchical learning structure of the proposed edge-assisted democratized learning mechanism, namely Edge-DemLearn. We also validate Edge-DemLearn as a flexible model training mechanism to build a distributed control and aggregation methodology in regions.
arXiv Detail & Related papers (2020-12-01T11:46:03Z)

This list is automatically generated from the titles and abstracts of the papers in this site.