Related papers: DFAMS: Dynamic-flow guided Federated Alignment based Multi-prototype Search

URL: http://arxiv.org/abs/2508.20353v2
Date: Tue, 14 Oct 2025 04:49:28 GMT
Title: DFAMS: Dynamic-flow guided Federated Alignment based Multi-prototype Search
Authors: Zhibang Yang, Xinke Jiang, Rihong Qiu, Ruiqing Li, Yihang Zhang, Yue Fang, Yongxin Xu, Hongxin Ding, Xu Chu, Junfeng Zhao, Yasha Wang
Abstract summary: Federated Retrieval (FR) routes queries across multiple external knowledge sources when the necessary external knowledge is distributed. We propose DFAMS, a novel framework that leverages Dynamic Information Flow (DIF) to identify latent query intents and construct semantically aligned knowledge partitions. Experimental results show that DFAMS outperforms advanced FR methods by up to 14.37% in knowledge classification accuracy, 5.38% in retrieval recall, and 6.45% in downstream QA accuracy.
Score: 30.780731199184384
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Federated Retrieval (FR) routes queries across multiple external knowledge sources to mitigate LLM hallucinations when the necessary external knowledge is distributed. However, existing methods struggle to retrieve high-quality, relevant documents for ambiguous queries, especially in cross-domain scenarios, which significantly limits their effectiveness in supporting downstream generation tasks. Inspired by Dynamic Information Flow (DIF), we propose DFAMS, a novel framework that leverages DIF to identify latent query intents and construct semantically aligned knowledge partitions for accurate retrieval across heterogeneous sources. Specifically, DFAMS probes the DIF in LLMs by leveraging gradient signals from a few annotated queries and employing Shapley value-based attribution to trace neuron activation paths associated with intent recognition and subdomain boundary detection. Then, DFAMS leverages DIF to train an alignment module via multi-prototype contrastive learning, enabling fine-grained intra-source modeling and inter-source semantic alignment across knowledge bases. Experimental results across five benchmarks show that DFAMS outperforms advanced FR methods by up to 14.37% in knowledge classification accuracy, 5.38% in retrieval recall, and 6.45% in downstream QA accuracy, demonstrating its effectiveness in complex FR scenarios. Our code is anonymously available at https://anonymous.4open.science/r/DFAMS/
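
To make the multi-prototype idea in the abstract concrete, here is a minimal sketch of routing a query to knowledge sources that are each represented by several prototype vectors. It is an illustration under stated assumptions, not the DFAMS implementation: the DIF probing, Shapley attribution, and contrastive training steps are omitted, and the function names, source names, and toy vectors are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def route_query(query_vec, prototypes_by_source, top_k=1):
    """Rank sources by the best-matching prototype of each source.

    Assumption: a source is scored by the maximum cosine similarity between
    the query and any of its prototypes. This is one simple way to use
    multiple prototypes per source; the paper may score sources differently.
    """
    scores = {
        source: max(cosine(query_vec, proto) for proto in protos)
        for source, protos in prototypes_by_source.items()
    }
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


# Hypothetical toy data: three sources, each with one or two prototypes.
prototypes = {
    "medical_kb": [[1.0, 0.0, 0.0], [0.8, 0.6, 0.0]],
    "legal_kb": [[0.0, 1.0, 0.0]],
    "finance_kb": [[0.0, 0.0, 1.0], [0.0, 0.6, 0.8]],
}
query = [0.9, 0.4, 0.0]

best = route_query(query, prototypes, top_k=2)
for source, score in best:
    print(f"{source}: {score:.3f}")
```

Keeping several prototypes per source lets one knowledge base cover multiple subdomains; in this toy data the second `medical_kb` prototype matches the query more closely than the first, which a single averaged vector could not capture.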

Related papers

DeepSieve: Information Sieving via LLM-as-a-Knowledge-Router [57.28685457991806]
DeepSieve is an agentic RAG framework that incorporates information sieving via LLM-as-a-knowledge-router. Our design emphasizes modularity, transparency, and adaptability, leveraging recent advances in agentic system design.
arXiv Detail & Related papers (2025-07-29T17:55:23Z)
Uncovering Bias Paths with LLM-guided Causal Discovery: An Active Learning and Dynamic Scoring Approach [1.5498930424110338]
Large Language Models (LLMs) offer a promising complement to statistical Causal Discovery (CD) approaches. Ensuring fairness in machine learning requires understanding how sensitive attributes causally influence outcomes. We propose a hybrid LLM-based framework for CD that extends a breadth-first search (BFS) strategy with active learning and dynamic scoring.
arXiv Detail & Related papers (2025-06-13T21:04:03Z)
Learning to Route Queries Across Knowledge Bases for Step-wise Retrieval-Augmented Reasoning [60.84901522792042]
Multimodal Retrieval-Augmented Generation (MRAG) has shown promise in mitigating hallucinations in Multimodal Large Language Models (MLLMs). We propose R1, a novel MRAG framework that learns to decide when and where to retrieve knowledge based on the evolving reasoning state. R1- can adaptively and effectively leverage diverse KBs, reducing unnecessary retrievals and improving both efficiency and accuracy.
arXiv Detail & Related papers (2025-05-28T08:17:57Z)
Divide-Then-Align: Honest Alignment based on the Knowledge Boundary of RAG [51.120170062795566]
We propose Divide-Then-Align (DTA) to endow RAG systems with the ability to respond with "I don't know" when the query is out of the knowledge boundary. DTA balances accuracy with appropriate abstention, enhancing the reliability and trustworthiness of retrieval-augmented systems.
arXiv Detail & Related papers (2025-05-27T08:21:21Z)
Unlocking the Capabilities of Large Vision-Language Models for Generalizable and Explainable Deepfake Detection [18.125287697902813]
Current Large Vision-Language Models (LVLMs) have demonstrated remarkable capabilities in understanding multimodal data. We present a novel framework that unlocks LVLMs' potential capabilities for deepfake detection.
arXiv Detail & Related papers (2025-03-19T03:20:03Z)
Latent Factor Models Meets Instructions: Goal-conditioned Latent Factor Discovery without Task Supervision [50.45597801390757]
Instruct-LF is a goal-oriented latent factor discovery system. It integrates instruction-following ability with statistical models to handle noisy datasets.
arXiv Detail & Related papers (2025-02-21T02:03:08Z)
Parametric Retrieval Augmented Generation [32.29608109539912]
Parametric RAG is a new RAG paradigm that integrates external knowledge directly into the parameters of feed-forward networks. It substantially enhances both the effectiveness and efficiency of knowledge augmentation in large language models.
arXiv Detail & Related papers (2025-01-27T10:04:49Z)
BANER: Boundary-Aware LLMs for Few-Shot Named Entity Recognition [12.57768435856206]
We propose an approach called Boundary-Aware LLMs for Few-Shot Named Entity Recognition. We introduce a boundary-aware contrastive learning strategy to enhance the LLM's ability to perceive entity boundaries for generalized entity spans. We utilize LoRAHub to align information from the target domain to the source domain, thereby enhancing adaptive cross-domain classification capabilities.
arXiv Detail & Related papers (2024-12-03T07:51:14Z)
GIVE: Structured Reasoning of Large Language Models with Knowledge Graph Inspired Veracity Extrapolation [108.2008975785364]
Graph Inspired Veracity Extrapolation (GIVE) is a novel reasoning method that merges parametric and non-parametric memories to improve accurate reasoning with minimal external input. GIVE guides the LLM agent to select the most pertinent expert data (observe), engage in query-specific divergent thinking (reflect), and then synthesize this information to produce the final output (speak).
arXiv Detail & Related papers (2024-10-11T03:05:06Z)
Harnessing the Power of Semi-Structured Knowledge and LLMs with Triplet-Based Prefiltering for Question Answering [2.6524539020042663]
Large Language Models (LLMs) frequently lack domain-specific knowledge and even fine-tuned models tend to hallucinate. We present a pipeline, 4StepFocus, and specifically a preprocessing step, that can substantially improve the answers of LLMs. The method narrows down potentially correct answers by triplets-based searches in a semi-structured knowledge base in a direct, traceable fashion.
arXiv Detail & Related papers (2024-09-01T22:43:27Z)
W-RAG: Weakly Supervised Dense Retrieval in RAG for Open-domain Question Answering [28.79851078451609]
We propose W-RAG, a method that draws weak training signals from the downstream task and fine-tunes the retriever to prioritize passages that most benefit the task. We conduct comprehensive experiments across four publicly available OpenQA datasets to demonstrate that our approach enhances both retrieval and OpenQA performance.
arXiv Detail & Related papers (2024-08-15T22:34:44Z)
FactorLLM: Factorizing Knowledge via Mixture of Experts for Large Language Models [50.331708897857574]
We introduce FactorLLM, a novel approach that decomposes well-trained dense FFNs into sparse sub-networks without requiring any further modifications. FactorLLM achieves performance comparable to the source model, securing up to 85% model performance while obtaining over a 30% increase in inference speed.
arXiv Detail & Related papers (2024-08-15T16:45:16Z)
FKA-Owl: Advancing Multimodal Fake News Detection through Knowledge-Augmented LVLMs [48.32113486904612]
We propose FKA-Owl, a framework that leverages forgery-specific knowledge to augment Large Vision-Language Models (LVLMs). Experiments on the public benchmark demonstrate that FKA-Owl achieves superior cross-domain performance compared to previous methods.
arXiv Detail & Related papers (2024-03-04T12:35:09Z)
DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain Question Answering over Knowledge Base and Text [73.68051228972024]
Large Language Models (LLMs) have exhibited impressive generation capabilities, but they suffer from hallucinations when relying on their internal knowledge. Retrieval-augmented LLMs have emerged as a potential solution to ground LLMs in external knowledge.
arXiv Detail & Related papers (2023-10-31T04:37:57Z)
Merging Generated and Retrieved Knowledge for Open-Domain QA [72.42262579925911]
COMBO is a Compatibility-Oriented knowledge Merging for Better Open-domain QA framework. We show that COMBO outperforms competitive baselines on three out of four tested open-domain QA benchmarks.
arXiv Detail & Related papers (2023-10-22T19:37:06Z)

This list is automatically generated from the titles and abstracts of the papers on this site.