Related papers: GenKnowSub: Improving Modularity and Reusability of LLMs through General Knowledge Subtraction

GenKnowSub: Improving Modularity and Reusability of LLMs through General Knowledge Subtraction

URL: http://arxiv.org/abs/2505.10939v2
Date: Mon, 04 Aug 2025 10:29:40 GMT
Title: GenKnowSub: Improving Modularity and Reusability of LLMs through General Knowledge Subtraction
Authors: Mohammadtaha Bagherifard, Sahar Rajabi, Ali Edalat, Yadollah Yaghoobzadeh,
Abstract summary: We propose a modular framework that disentangles the entanglement of general knowledge and task-specific adaptations.<n>By subtracting this general knowledge component from each task-specific module, we obtain residual modules that focus more exclusively on task-relevant information.<n>Our studies on the Phi-3 model and standard Arrow as baselines reveal that using general knowledge LoRAs yields consistent performance gains in both monolingual and cross-lingual settings.
Score: 5.2078428584067815
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language models often struggle with zero-shot generalization, and several modular approaches have been proposed to address this challenge. Yet, we hypothesize that a key limitation remains: the entanglement of general knowledge and task-specific adaptations. To overcome this, we propose a modular framework that disentangles these components by constructing a library of task-specific LoRA modules alongside a general-domain LoRA. By subtracting this general knowledge component from each task-specific module, we obtain residual modules that focus more exclusively on task-relevant information, a method we call general knowledge subtraction (GenKnowSub). Leveraging the refined task-specific modules and the Arrow routing algorithm \citep{ostapenko2024towards}, we dynamically select and combine modules for new inputs without additional training. Our studies on the Phi-3 model and standard Arrow as baselines reveal that using general knowledge LoRAs derived from diverse languages, including English, French, and German, yields consistent performance gains in both monolingual and cross-lingual settings across a wide set of benchmarks. Further experiments on Phi-2 demonstrate how GenKnowSub generalizes to weaker LLMs. The complete code and data are available at https://github.com/saharsamr/Modular-LLM.

Related papers

GenKI: Enhancing Open-Domain Question Answering with Knowledge Integration and Controllable Generation in Large Language Models [75.25348392263676]
Open-domain question answering (OpenQA) represents a cornerstone in natural language processing (NLP)<n>We propose a novel framework named GenKI, which aims to improve the OpenQA performance by exploring Knowledge Integration and controllable Generation.
arXiv Detail & Related papers (2025-05-26T08:18:33Z)
The Unreasonable Effectiveness of Model Merging for Cross-Lingual Transfer in LLMs [54.59207567677249]
Large language models (LLMs) still struggle across tasks outside of high-resource languages.<n>In this work, we investigate cross-lingual transfer to lower-resource languages where task-specific post-training data is scarce.
arXiv Detail & Related papers (2025-05-23T20:28:31Z)
Supervised Knowledge Makes Large Language Models Better In-context Learners [94.89301696512776]
Large Language Models (LLMs) exhibit emerging in-context learning abilities through prompt engineering. The challenge of improving the generalizability and factuality of LLMs in natural language understanding and question answering remains under-explored. We propose a framework that enhances the reliability of LLMs as it: 1) generalizes out-of-distribution data, 2) elucidates how LLMs benefit from discriminative models, and 3) minimizes hallucinations in generative tasks.
arXiv Detail & Related papers (2023-12-26T07:24:46Z)
Knowledge Plugins: Enhancing Large Language Models for Domain-Specific Recommendations [50.81844184210381]
We propose a general paradigm that augments large language models with DOmain-specific KnowledgE to enhance their performance on practical applications, namely DOKE. This paradigm relies on a domain knowledge extractor, working in three steps: 1) preparing effective knowledge for the task; 2) selecting the knowledge for each specific sample; and 3) expressing the knowledge in an LLM-understandable way.
arXiv Detail & Related papers (2023-11-16T07:09:38Z)
Agent Lumos: Unified and Modular Training for Open-Source Language Agents [89.78556964988852]
We introduce LUMOS, one of the first frameworks for training open-source LLM-based agents. LUMOS features a learnable, unified, and modular architecture with a planning module that learns high-level subgoal generation. We collect large-scale, unified, and high-quality training annotations derived from diverse ground-truth reasoning rationales.
arXiv Detail & Related papers (2023-11-09T00:30:13Z)
CodeChain: Towards Modular Code Generation Through Chain of Self-revisions with Representative Sub-modules [51.82044734879657]
We propose CodeChain, a novel framework for inference that elicits modularized code generation through a chain of self-revisions. We find that CodeChain can significantly boost both modularity as well as correctness of the generated solutions, achieving relative pass@1 improvements of 35% on APPS and 76% on CodeContests.
arXiv Detail & Related papers (2023-10-13T10:17:48Z)
RET-LLM: Towards a General Read-Write Memory for Large Language Models [53.288356721954514]
RET-LLM is a novel framework that equips large language models with a general write-read memory unit. Inspired by Davidsonian semantics theory, we extract and save knowledge in the form of triplets. Our framework exhibits robust performance in handling temporal-based question answering tasks.
arXiv Detail & Related papers (2023-05-23T17:53:38Z)
Knowledge Card: Filling LLMs' Knowledge Gaps with Plug-in Specialized Language Models [46.079902719883414]
We propose Knowledge Card, a modular framework to plug in new factual and relevant knowledge into general-purpose language models. We first introduce knowledge cards -- specialized language models trained on corpora from specific domains and sources. We then propose three content selectors to dynamically select and retain information in documents generated by knowledge cards.
arXiv Detail & Related papers (2023-05-17T05:25:27Z)
Continual Learning via Local Module Composition [11.380264053565082]
Local module composition (LMC) is an approach to modular continual learning. LMC provides each module a local structural component that estimates a module's relevance to the input.
arXiv Detail & Related papers (2021-11-15T13:34:15Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.