Chain-of-Rank: Enhancing Large Language Models for Domain-Specific RAG in Edge Device
- URL: http://arxiv.org/abs/2502.15134v1
- Date: Fri, 21 Feb 2025 01:28:12 GMT
- Title: Chain-of-Rank: Enhancing Large Language Models for Domain-Specific RAG in Edge Device
- Authors: Juntae Lee, Jihwan Bang, Seunghan Yang, Kyuhong Shim, Simyung Chang
- Abstract summary: Chain of Rank (CoR) shifts the focus from intricate lengthy reasoning to simple ranking of the reliability of input external documents. We attain the state-of-the-art (SOTA) results in benchmarks, and analyze its efficacy.
- Score: 20.666893617591136
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Retrieval-augmented generation (RAG) with large language models (LLMs) is especially valuable in specialized domains, where precision is critical. To further specialize LLMs for a target domain, domain-specific RAG has recently been developed, which exposes the LLM to the target domain in advance via finetuning. Domain-specific RAG is a natural fit for resource-constrained environments like edge devices, which must perform a specific task (e.g., personalization) reliably using only small-scale LLMs. While domain-specific RAG is well aligned with edge devices in this respect, it often relies on widely used reasoning techniques like chain-of-thought (CoT). The reasoning step helps the model understand the given external knowledge, yet it is computationally expensive and difficult for small-scale LLMs to learn. To tackle this, we propose Chain of Rank (CoR), which shifts the focus from intricate, lengthy reasoning to simple ranking of the reliability of input external documents. CoR thereby reduces computational complexity while maintaining high accuracy, making it particularly suited for resource-constrained environments. We attain state-of-the-art (SOTA) results on benchmarks and analyze its efficacy.
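The abstract describes replacing a long reasoning trace with a short ranking of the input documents. The sketch below illustrates what such a ranking-style output could look like. It is an illustration only, not the authors' implementation: the `Doc` class, the `reliability` field, the `Rank: ... / Answer: ...` text format, and both function names are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    doc_id: int
    text: str
    reliability: float  # hypothetical score; the source does not say how it is obtained


def build_cor_target(docs, answer, top_k=2):
    """Build a short 'rank, then answer' string instead of a long reasoning trace.

    The output format is an assumption for illustration, not the paper's format.
    """
    ranked = sorted(docs, key=lambda d: d.reliability, reverse=True)[:top_k]
    ranking = " > ".join(f"[{d.doc_id}]" for d in ranked)
    return f"Rank: {ranking}\nAnswer: {answer}"


def parse_cor_output(output):
    """Split a 'Rank: ... / Answer: ...' string back into document ids and answer."""
    rank_line, answer_line = output.split("\n", 1)
    ids = [int(tok.strip(" []")) for tok in rank_line.removeprefix("Rank:").split(">")]
    return ids, answer_line.removeprefix("Answer:").strip()


docs = [Doc(1, "a", 0.2), Doc(2, "b", 0.9), Doc(3, "c", 0.5)]
target = build_cor_target(docs, "Paris")
print(target)
print(parse_cor_output(target))
```

The target here is only a few tokens longer than the answer itself, which is the kind of saving over a multi-sentence chain-of-thought that the abstract points to.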
Related papers
- FineScope : Precision Pruning for Domain-Specialized Large Language Models Using SAE-Guided Self-Data Cultivation [1.8816124486165122]
FineScope is a framework for deriving domain-optimized language models from larger pretrained models.
We apply structured pruning with domain-specific constraints, ensuring that the resulting models retain essential knowledge for the target domain.
Experiments and ablation studies demonstrate that FineScope achieves highly competitive performance.
arXiv Detail & Related papers (2025-05-01T16:05:08Z)
- RoseRAG: Robust Retrieval-augmented Generation with Small-scale LLMs via Margin-aware Preference Optimization [53.63439735067081]
Large language models (LLMs) have achieved impressive performance but face high computational costs and latency.
Retrieval-augmented generation (RAG) helps by integrating external knowledge, but imperfect retrieval can introduce distracting noise that misleads SLMs.
We propose RoseRAG, a robust RAG framework for SLMs via Margin-aware Preference Optimization.
arXiv Detail & Related papers (2025-02-16T04:56:53Z)
- LaRA: Benchmarking Retrieval-Augmented Generation and Long-Context LLMs - No Silver Bullet for LC or RAG Routing [70.35888047551643]
We present LaRA, a novel benchmark specifically designed to rigorously compare RAG and LC LLMs.
LaRA encompasses 2,326 test cases across four practical QA task categories and three types of naturally occurring long texts.
We find that the optimal choice between RAG and LC depends on a complex interplay of factors, including the model's parameter size, long-text capabilities, context length, task type, and the characteristics of the retrieved chunks.
arXiv Detail & Related papers (2025-02-14T08:04:22Z)
- BANER: Boundary-Aware LLMs for Few-Shot Named Entity Recognition [12.57768435856206]
We propose an approach called Boundary-Aware LLMs for Few-Shot Named Entity Recognition.
We introduce a boundary-aware contrastive learning strategy to enhance the LLM's ability to perceive entity boundaries for generalized entity spans.
We utilize LoRAHub to align information from the target domain to the source domain, thereby enhancing adaptive cross-domain classification capabilities.
arXiv Detail & Related papers (2024-12-03T07:51:14Z)
- Domain-Specific Retrieval-Augmented Generation Using Vector Stores, Knowledge Graphs, and Tensor Factorization [7.522493227357079]
Large Language Models (LLMs) are pre-trained on large-scale corpora.
LLMs suffer from hallucinations, knowledge cut-offs, and lack of knowledge attributions.
We introduce SMART-SLIC, a highly domain-specific LLM framework.
arXiv Detail & Related papers (2024-10-03T17:40:55Z)
- Exploring Language Model Generalization in Low-Resource Extractive QA [57.14068405860034]
We investigate Extractive Question Answering (EQA) with Large Language Models (LLMs) under domain drift.
We devise a series of experiments to explain the performance gap empirically.
arXiv Detail & Related papers (2024-09-27T05:06:43Z)
- Learning to Discover Knowledge: A Weakly-Supervised Partial Domain Adaptation Approach [20.899013563493202]
Domain adaptation has shown appealing performance by leveraging knowledge from a source domain with rich annotations.
For a specific target task, it is cumbersome to collect related and high-quality source domains.
In this paper, we propose a simple yet effective domain adaptation approach, termed self-paced transfer classifier learning (SP-TCL).
arXiv Detail & Related papers (2024-06-20T12:54:07Z)
- R-Eval: A Unified Toolkit for Evaluating Domain Knowledge of Retrieval Augmented Large Language Models [51.468732121824125]
Large language models have achieved remarkable success on general NLP tasks, but they may fall short for domain-specific problems.
Existing evaluation tools only provide a few baselines and evaluate them on various domains without mining the depth of domain knowledge.
In this paper, we address the challenges of evaluating RALLMs by introducing the R-Eval toolkit, a Python toolkit designed to streamline the evaluation of different RAGs.
arXiv Detail & Related papers (2024-06-17T15:59:49Z)
- BLADE: Enhancing Black-box Large Language Models with Small Domain-Specific Models [56.89958793648104]
Large Language Models (LLMs) are versatile and capable of addressing a diverse range of tasks.
Previous approaches either conduct continuous pre-training with domain-specific data or employ retrieval augmentation to support general LLMs.
We present a novel framework named BLADE, which enhances Black-box LArge language models with small Domain-spEcific models.
arXiv Detail & Related papers (2024-03-27T08:57:21Z)
- REAR: A Relevance-Aware Retrieval-Augmented Framework for Open-Domain Question Answering [115.72130322143275]
REAR is a RElevance-Aware Retrieval-augmented approach for open-domain question answering (QA).
We develop a novel architecture for LLM-based RAG systems, by incorporating a specially designed assessment module.
Experiments on four open-domain QA tasks show that REAR significantly outperforms a number of previous competitive RAG approaches.
arXiv Detail & Related papers (2024-02-27T13:22:51Z)