Efficient Knowledge Feeding to Language Models: A Novel Integrated Encoder-Decoder Architecture
- URL: http://arxiv.org/abs/2502.05233v1
- Date: Fri, 07 Feb 2025 04:24:07 GMT
- Title: Efficient Knowledge Feeding to Language Models: A Novel Integrated Encoder-Decoder Architecture
- Authors: S Santosh Kumar, Rishi Gottimukkala, Supriya Devidutta, Karthikeyan S
- Abstract summary: ICV recasts in-context learning by using latent embeddings of language models.
ICV directly integrates information into the model, enabling it to process this information more effectively.
- Score: 0.0
- Abstract: This paper introduces a novel approach to efficiently feeding knowledge to language models (LLMs) during prediction by integrating retrieval and generation processes within a unified framework. While the Retrieval-Augmented Generation (RAG) model addresses gaps in LLMs' training data and knowledge limits, it is hindered by token limit restrictions and dependency on the retrieval system's accuracy. Our proposed architecture incorporates in-context vectors (ICV) to overcome these challenges. ICV recasts in-context learning by using latent embeddings of LLMs to create a vector that captures essential task information. This vector is then used to shift the latent states of the LLM, enhancing the generation process without adding demonstration examples to the prompt. ICV directly integrates information into the model, enabling it to process this information more effectively. Our extensive experimental evaluation demonstrates that ICV outperforms standard in-context learning and fine-tuning across question-answering, information retrieval, and other tasks. This approach mitigates the limitations of current RAG models and offers a more robust solution for handling extensive and diverse datasets. Despite leveraging a fraction of the parameters, our ICV-enhanced model achieves competitive performance against models like LLaMA-3, Gemma, and Phi-3, significantly reducing computational costs and memory requirements. ICV reduces prompt length, is easy to control, surpasses token limitations, and is computationally efficient compared to fine-tuning.
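The abstract describes two steps: building a vector from latent embeddings that captures task information, then using it to shift the model's latent states. The sketch below illustrates that idea on toy data only. It is not the authors' implementation: the mean-difference construction, the norm-preserving rescale, the scaling factor `alpha`, and the function names `in_context_vector` and `shift_state` are all assumptions introduced for illustration, and the hidden states are hand-written lists rather than real model activations.

```python
import math


def in_context_vector(input_states, target_states):
    """Average the per-demonstration difference (target - input) of latent states.

    Assumption: a simple mean difference stands in for whatever construction
    the paper actually uses to summarize the task.
    """
    n = len(input_states)
    dim = len(input_states[0])
    return [
        sum(t[i] - x[i] for x, t in zip(input_states, target_states)) / n
        for i in range(dim)
    ]


def shift_state(hidden, icv, alpha=0.1):
    """Add alpha * icv to a hidden state, then rescale to the original norm.

    Assumption: rescaling keeps the shifted state at the same magnitude as
    the original so that only its direction changes.
    """
    shifted = [h + alpha * v for h, v in zip(hidden, icv)]
    old_norm = math.sqrt(sum(h * h for h in hidden))
    new_norm = math.sqrt(sum(s * s for s in shifted))
    if new_norm == 0.0:
        return shifted
    return [s * old_norm / new_norm for s in shifted]


# Toy latent states for two demonstrations (hypothetical values).
demo_inputs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
demo_targets = [[1.0, 0.0, 2.0], [0.0, 1.0, 2.0]]

icv = in_context_vector(demo_inputs, demo_targets)

# Steer a query's latent state without adding any demonstrations to the prompt.
query_state = [3.0, 4.0, 0.0]
steered = shift_state(query_state, icv, alpha=0.5)
print(icv, steered)
```

In this toy setup the demonstrations never enter the prompt: their effect is carried entirely by `icv`, which is consistent with the abstract's claim that the approach reduces prompt length.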
Related papers
- DSMoE: Matrix-Partitioned Experts with Dynamic Routing for Computation-Efficient Dense LLMs [70.91804882618243]
This paper proposes DSMoE, a novel approach that achieves sparsification by partitioning pre-trained FFN layers into computational blocks.
We implement adaptive expert routing using sigmoid activation and straight-through estimators, enabling tokens to flexibly access different aspects of model knowledge.
Experiments on LLaMA models demonstrate that under equivalent computational constraints, DSMoE achieves superior performance compared to existing pruning and MoE approaches.
arXiv Detail & Related papers (2025-02-18T02:37:26Z)
- Learning Task Representations from In-Context Learning [73.72066284711462]
Large language models (LLMs) have demonstrated remarkable proficiency in in-context learning.
We introduce an automated formulation for encoding task information in ICL prompts as a function of attention heads.
We show that our method's effectiveness stems from aligning the distribution of the last hidden state with that of an optimally performing in-context-learned model.
arXiv Detail & Related papers (2025-02-08T00:16:44Z)
- Visual RAG: Expanding MLLM visual knowledge without fine-tuning [5.341192792319891]
This paper introduces Visual RAG, which synergistically combines the MLLM's capability to learn from context with a retrieval mechanism.
In this way, the resulting system is not limited to the knowledge extracted from the training data, but can be updated rapidly and easily without fine-tuning.
It greatly reduces the computational cost of improving the model's image classification performance and extends the model's knowledge to new visual domains and tasks it was not trained for.
arXiv Detail & Related papers (2025-01-18T17:43:05Z)
- Learning with Less: Knowledge Distillation from Large Language Models via Unlabeled Data [54.934578742209716]
In real-world NLP applications, Large Language Models (LLMs) offer promising solutions due to their extensive training on vast datasets.
LLKD is an adaptive sample selection method that incorporates signals from both the teacher and student.
Our comprehensive experiments show that LLKD achieves superior performance across various datasets with higher data efficiency.
arXiv Detail & Related papers (2024-11-12T18:57:59Z)
- CodeUnlearn: Amortized Zero-Shot Machine Unlearning in Language Models Using Discrete Concept [5.345828824625758]
We propose a novel amortized unlearning approach using codebook features and Sparse Autoencoders (SAEs).
By leveraging a bottleneck to decompose the activation space and regulate information flow, our method efficiently unlearns targeted information while preserving the model's performance on unrelated data.
arXiv Detail & Related papers (2024-10-08T10:26:22Z)
- Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models [79.41139393080736]
Large language models (LLMs) have rapidly advanced and demonstrated impressive capabilities.
In-Context Learning (ICL) and Parameter-Efficient Fine-Tuning (PEFT) are currently two mainstream methods for adapting LLMs to downstream tasks.
We propose Reference Trustable Decoding (RTD), a paradigm that allows models to quickly adapt to new tasks without fine-tuning.
arXiv Detail & Related papers (2024-09-30T10:48:20Z)
- Leveraging Large Language Models for Web Scraping [0.0]
This research investigates a general-purpose accurate data scraping recipe for RAG models designed for language generation.
To capture knowledge in a more modular and interpretable way, we use pre-trained language models with a latent knowledge retriever.
arXiv Detail & Related papers (2024-06-12T14:15:15Z)
- Unsupervised Pre-training with Language-Vision Prompts for Low-Data Instance Segmentation [105.23631749213729]
We propose a novel method for unsupervised pre-training in low-data regimes.
Inspired by the recently successful prompting technique, we introduce a new method, Unsupervised Pre-training with Language-Vision Prompts.
We show that our method can converge faster and perform better than CNN-based models in low-data regimes.
arXiv Detail & Related papers (2024-05-22T06:48:43Z)
- CEM: A Data-Efficient Method for Large Language Models to Continue Evolving From Mistakes [36.14056870453356]
Continual learning is essential for keeping Large Language Models current and addressing their shortcomings.
We propose the Continue Evolving from Mistakes (CEM) method, a data-efficient approach for collecting continual pre-training (CPT) data.
Experiments show that CEM substantially enhances multiple models' performance on both in-domain and out-of-domain QA tasks, achieving gains of up to 29.63%.
arXiv Detail & Related papers (2024-04-11T17:44:56Z)
- Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters [65.15700861265432]
We present a parameter-efficient continual learning framework to alleviate long-term forgetting in incremental learning with vision-language models.
Our approach involves the dynamic expansion of a pre-trained CLIP model, through the integration of Mixture-of-Experts (MoE) adapters.
To preserve the zero-shot recognition capability of vision-language models, we introduce a Distribution Discriminative Auto-Selector.
arXiv Detail & Related papers (2024-03-18T08:00:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.