SymRAG: Efficient Neuro-Symbolic Retrieval Through Adaptive Query Routing
- URL: http://arxiv.org/abs/2506.12981v2
- Date: Sat, 12 Jul 2025 05:28:30 GMT
- Title: SymRAG: Efficient Neuro-Symbolic Retrieval Through Adaptive Query Routing
- Authors: Safayat Bin Hakim, Muhammad Adil, Alvaro Velasquez, Houbing Herbert Song,
- Abstract summary: Current Retrieval-Augmented Generation systems use uniform processing, causing inefficiency as simple queries consume resources similar to complex multi-hop tasks.<n>We present SymRAG, a framework that introduces adaptive query routing via real-time complexity and load assessment to select symbolic, neural, or hybrid pathways.
- Score: 8.775121469887033
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Current Retrieval-Augmented Generation systems use uniform processing, causing inefficiency as simple queries consume resources similar to complex multi-hop tasks. We present SymRAG, a framework that introduces adaptive query routing via real-time complexity and load assessment to select symbolic, neural, or hybrid pathways. SymRAG's neuro-symbolic approach adjusts computational pathways based on both query characteristics and system load, enabling efficient resource allocation across diverse query types. By combining linguistic and structural query properties with system load metrics, SymRAG allocates resources proportional to reasoning requirements. Evaluated on 2,000 queries across HotpotQA (multi-hop reasoning) and DROP (discrete reasoning) using Llama-3.2-3B and Mistral-7B models, SymRAG achieves competitive accuracy (97.6--100.0% exact match) with efficient resource utilization (3.6--6.2% CPU utilization, 0.985--3.165s processing). Disabling adaptive routing increases processing time by 169--1151%, showing its significance for complex models. These results suggest adaptive computation strategies are more sustainable and scalable for hybrid AI systems that use dynamic routing and neuro-symbolic frameworks.
Related papers
- Relatron: Automating Relational Machine Learning over Relational Databases [50.94254514286021]
We present a study that unifies RDL and DFS in a shared design space and conducts architecture-centric searches across diverse RDB tasks.<n>Our analysis yields three key findings: (1) RDL does not consistently outperform DFS, with performance being highly task-dependent; (2) no single architecture dominates across tasks, underscoring the need for task-aware model selection; and accuracy is an unreliable guide for choice architecture.
arXiv Detail & Related papers (2026-02-26T02:45:22Z) - RAGRouter-Bench: A Dataset and Benchmark for Adaptive RAG Routing [37.7721677767453]
We introduce RAG-Bench, the first dataset and benchmark designed for adaptive RAG routing.<n>RAG-Bench revisits retrieval from a query-corpus compatibility perspective and standardizes five representative RAG paradigms for systematic evaluation.<n> Experiments with DeepSeek-V3 and LLaMA-3.1-8B demonstrate that no single RAG paradigm is universally optimal.
arXiv Detail & Related papers (2026-01-30T20:38:11Z) - Learning to Route: A Rule-Driven Agent Framework for Hybrid-Source Retrieval-Augmented Generation [55.47971671635531]
Large Language Models (LLMs) have shown remarkable performance on general Question Answering (QA)<n>Retrieval-Augmented Generation (RAG) addresses this limitation by enriching LLMs with external knowledge.<n>Existing systems primarily rely on unstructured documents, while largely overlooking relational databases.
arXiv Detail & Related papers (2025-09-30T22:19:44Z) - RCR-Router: Efficient Role-Aware Context Routing for Multi-Agent LLM Systems with Structured Memory [57.449129198822476]
RCR is a role-aware context routing framework for multi-agent large language model (LLM) systems.<n>It dynamically selects semantically relevant memory subsets for each agent based on its role and task stage.<n>A lightweight scoring policy guides memory selection, and agent outputs are integrated into a shared memory store.
arXiv Detail & Related papers (2025-08-06T21:59:34Z) - Dynamically Adaptive Reasoning via LLM-Guided MCTS for Efficient and Context-Aware KGQA [6.765017336265049]
This paper proposes Dynamically Adaptive MCTS-based Reasoning (DAMR) for Knowledge Graph Question Answering (KGQA)<n>DAMR integrates symbolic search with adaptive path evaluation for efficient and context-aware KGQA.<n>Experiments on multiple KGQA benchmarks show DAMR significantly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2025-08-01T15:38:21Z) - LTRR: Learning To Rank Retrievers for LLMs [53.285436927963865]
We show that routing-based RAG systems can outperform the best single-retriever-based systems.<n>Performance gains are especially pronounced in models trained with the Answer Correctness (AC) metric.<n>As part of the SIGIR 2025 LiveRAG challenge, our submitted system demonstrated the practical viability of our approach.
arXiv Detail & Related papers (2025-06-16T17:53:18Z) - Learning to Route Queries Across Knowledge Bases for Step-wise Retrieval-Augmented Reasoning [60.84901522792042]
Multimodal Retrieval-Augmented Generation (MRAG) has shown promise in mitigating hallucinations in Multimodal Large Language Models (MLLMs)<n>We propose R1, a novel MRAG framework that learns to decide when and where to retrieve knowledge based on the evolving reasoning state.<n>R1- can adaptively and effectively leverage diverse KBs, reducing unnecessary retrievals and improving both efficiency and accuracy.
arXiv Detail & Related papers (2025-05-28T08:17:57Z) - Automatic Dataset Generation for Knowledge Intensive Question Answering Tasks [10.562940259841623]
This paper presents a novel approach for enhancing Large Language Models (LLMs) in knowledge-intensive QA tasks.<n>The proposed system includes an automated QA generator and a model fine-tuner, evaluated using perplexity, ROUGE, BLEU, and BERTScore.<n>Experiments demonstrate improvements in logical coherence and factual accuracy, with implications for developing adaptable Artificial Intelligence (AI) systems.
arXiv Detail & Related papers (2025-05-20T11:16:29Z) - Efficient and Scalable Neural Symbolic Search for Knowledge Graph Complex Query Answering [50.1887329564127]
We propose an efficient and scalable symbolic search framework for complex queries.<n>Our framework reduces the computational load of symbolic methods by 90% while maintaining nearly the same performance.
arXiv Detail & Related papers (2025-05-13T01:24:09Z) - ZeroLM: Data-Free Transformer Architecture Search for Language Models [54.83882149157548]
Current automated proxy discovery approaches suffer from extended search times, susceptibility to data overfitting, and structural complexity.<n>This paper introduces a novel zero-cost proxy methodology that quantifies model capacity through efficient weight statistics.<n>Our evaluation demonstrates the superiority of this approach, achieving a Spearman's rho of 0.76 and Kendall's tau of 0.53 on the FlexiBERT benchmark.
arXiv Detail & Related papers (2025-03-24T13:11:22Z) - Fast or Better? Balancing Accuracy and Cost in Retrieval-Augmented Generation with Flexible User Control [52.405085773954596]
Retrieval-Augmented Generation (RAG) has emerged as a powerful approach to mitigate large language model hallucinations.<n>Existing RAG frameworks often apply retrieval indiscriminately,leading to inefficiencies-over-retrieving.<n>We introduce a novel user-controllable RAG framework that enables dynamic adjustment of the accuracy-cost trade-off.
arXiv Detail & Related papers (2025-02-17T18:56:20Z) - Advancing Spatio-Temporal Processing in Spiking Neural Networks through Adaptation [6.233189707488025]
neural networks on neuromorphic hardware promise orders of less power consumption than their non-spiking counterparts.<n>Standard neuron model for spike-based computation on such systems has long been the integrate-and-fire (LIF) neuron.<n>The root of these so-called adaptive LIF neurons is not well understood.
arXiv Detail & Related papers (2024-08-14T12:49:58Z) - Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity [59.57065228857247]
Retrieval-augmented Large Language Models (LLMs) have emerged as a promising approach to enhancing response accuracy in several tasks, such as Question-Answering (QA)
We propose a novel adaptive QA framework, that can dynamically select the most suitable strategy for (retrieval-augmented) LLMs based on the query complexity.
We validate our model on a set of open-domain QA datasets, covering multiple query complexities, and show that ours enhances the overall efficiency and accuracy of QA systems.
arXiv Detail & Related papers (2024-03-21T13:52:30Z) - Spatial-temporal-demand clustering for solving large-scale vehicle
routing problems with time windows [0.0]
We propose a decompose-route-improve (DRI) framework that groups customers using clustering.
Its similarity metric incorporates customers' spatial, temporal, and demand data.
We apply pruned local search (LS) between solved subproblems to improve the overall solution.
arXiv Detail & Related papers (2024-01-20T06:06:01Z) - Evolving Connectivity for Recurrent Spiking Neural Networks [8.80300633999542]
Recurrent neural networks (RSNNs) hold great potential for advancing artificial general intelligence.
We propose the evolving connectivity (EC) framework, an inference-only method for training RSNNs.
arXiv Detail & Related papers (2023-05-28T07:08:25Z) - MISNN: Multiple Imputation via Semi-parametric Neural Networks [9.594714330925703]
Multiple imputation (MI) has been widely applied to missing value problems in biomedical, social and econometric research.
We propose MISNN, a novel and efficient algorithm that incorporates feature selection for MI.
arXiv Detail & Related papers (2023-05-02T21:45:36Z) - Learning to Schedule Heuristics for the Simultaneous Stochastic
Optimization of Mining Complexes [2.538209532048867]
The proposed learn-to-perturb (L2P) hyper-heuristic is a multi-neighborhood simulated annealing algorithm.
L2P is tested on several real-world mining complexes, with an emphasis on efficiency, robustness, and generalization capacity.
Results show a reduction in the number of iterations by 30-50% and in the computational time by 30-45%.
arXiv Detail & Related papers (2022-02-25T18:20:14Z) - DistIR: An Intermediate Representation and Simulator for Efficient
Neural Network Distribution [15.086401550425125]
DistIR is a representation for distributed computation that is tailored for efficient analyses.
We show how DistIR and its simulator enable fast grid searches over complex distribution spaces spanning up to 1000+ configurations.
arXiv Detail & Related papers (2021-11-09T21:32:51Z) - Adaptive Anomaly Detection for Internet of Things in Hierarchical Edge
Computing: A Contextual-Bandit Approach [81.5261621619557]
We propose an adaptive anomaly detection scheme with hierarchical edge computing (HEC)
We first construct multiple anomaly detection DNN models with increasing complexity, and associate each of them to a corresponding HEC layer.
Then, we design an adaptive model selection scheme that is formulated as a contextual-bandit problem and solved by using a reinforcement learning policy network.
arXiv Detail & Related papers (2021-08-09T08:45:47Z) - Dynamic Scheduling for Stochastic Edge-Cloud Computing Environments
using A3C learning and Residual Recurrent Neural Networks [30.61220416710614]
A-Advantage-Actor-Critic (A3C) learning is known to quickly adapt to dynamic scenarios with less data and Residual Recurrent Neural Network (R2N2) to quickly update model parameters.
We use the R2N2 architecture to capture a large number of host and task parameters together with temporal patterns to provide efficient scheduling decisions.
Experiments conducted on real-world data set show a significant improvement in terms of energy consumption, response time, ServiceLevelAgreement and running cost by 14.4%, 7.74%, 31.9%, and 4.64%, respectively.
arXiv Detail & Related papers (2020-09-01T13:36:34Z) - Neural Architecture Search For LF-MMI Trained Time Delay Neural Networks [61.76338096980383]
A range of neural architecture search (NAS) techniques are used to automatically learn two types of hyper- parameters of state-of-the-art factored time delay neural networks (TDNNs)
These include the DARTS method integrating architecture selection with lattice-free MMI (LF-MMI) TDNN training.
Experiments conducted on a 300-hour Switchboard corpus suggest the auto-configured systems consistently outperform the baseline LF-MMI TDNN systems.
arXiv Detail & Related papers (2020-07-17T08:32:11Z) - RIS Enhanced Massive Non-orthogonal Multiple Access Networks: Deployment
and Passive Beamforming Design [116.88396201197533]
A novel framework is proposed for the deployment and passive beamforming design of a reconfigurable intelligent surface (RIS)
The problem of joint deployment, phase shift design, as well as power allocation is formulated for maximizing the energy efficiency.
A novel long short-term memory (LSTM) based echo state network (ESN) algorithm is proposed to predict users' tele-traffic demand by leveraging a real dataset.
A decaying double deep Q-network (D3QN) based position-acquisition and phase-control algorithm is proposed to solve the joint problem of deployment and design of the RIS.
arXiv Detail & Related papers (2020-01-28T14:37:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.