Related papers: BioMol-MQA: A Multi-Modal Question Answering Dataset For LLM Reasoning Over Bio-Molecular Interactions

BioMol-MQA: A Multi-Modal Question Answering Dataset For LLM Reasoning Over Bio-Molecular Interactions

URL: http://arxiv.org/abs/2506.05766v1
Date: Fri, 06 Jun 2025 05:48:22 GMT
Title: BioMol-MQA: A Multi-Modal Question Answering Dataset For LLM Reasoning Over Bio-Molecular Interactions
Authors: Saptarshi Sengupta, Shuhua Yang, Paul Kwong Yu, Fali Wang, Suhang Wang,
Abstract summary: BioMol-MQA dataset is composed of two parts (i) a multimodal knowledge graph (KG) with text and molecular structure for information retrieval; and (ii) challenging questions designed to test LLM capabilities in retrieving and reasoning over multimodal KG to answer questions.<n>Our benchmarks indicate that existing LLMs struggle to answer these questions and do well only when given the necessary background data, signaling the necessity for strong RAG frameworks.
Score: 22.805931447412668
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Retrieval augmented generation (RAG) has shown great power in improving Large Language Models (LLMs). However, most existing RAG-based LLMs are dedicated to retrieving single modality information, mainly text; while for many real-world problems, such as healthcare, information relevant to queries can manifest in various modalities such as knowledge graph, text (clinical notes), and complex molecular structure. Thus, being able to retrieve relevant multi-modality domain-specific information, and reason and synthesize diverse knowledge to generate an accurate response is important. To address the gap, we present BioMol-MQA, a new question-answering (QA) dataset on polypharmacy, which is composed of two parts (i) a multimodal knowledge graph (KG) with text and molecular structure for information retrieval; and (ii) challenging questions that designed to test LLM capabilities in retrieving and reasoning over multimodal KG to answer questions. Our benchmarks indicate that existing LLMs struggle to answer these questions and do well only when given the necessary background data, signaling the necessity for strong RAG frameworks.

Related papers

LLM Ensemble for RAG: Role of Context Length in Zero-Shot Question Answering for BioASQ Challenge [0.03437656066916039]
Large language models (LLMs) can be used for information retrieval.<n> ensembles of zero-shot models can accomplish state-of-the-art performance on a domain-specific Yes/No QA task.
arXiv Detail & Related papers (2025-09-10T13:50:49Z)
MSRS: Evaluating Multi-Source Retrieval-Augmented Generation [51.717139132190574]
Many real-world applications demand the ability to integrate and summarize information scattered across multiple sources.<n>We present a scalable framework for constructing evaluation benchmarks that challenge RAG systems to integrate information across distinct sources.
arXiv Detail & Related papers (2025-08-28T14:59:55Z)
mKG-RAG: Multimodal Knowledge Graph-Enhanced RAG for Visual Question Answering [29.5761347590239]
Retrieval-Augmented Generation (RAG) has been proposed to expand internal knowledge of Multimodal Large Language Models (MLLMs)<n>In this paper, we propose a novel multimodal knowledge-augmented generation framework (mKG-RAG) based on multimodal KGs for knowledge-intensive VQA tasks.
arXiv Detail & Related papers (2025-08-07T12:22:50Z)
VAT-KG: Knowledge-Intensive Multimodal Knowledge Graph Dataset for Retrieval-Augmented Generation [3.1033038923749774]
We propose the first concept-centric and knowledge-intensive multimodal knowledge graph that covers visual, audio, and text information.<n>Our construction pipeline ensures cross-modal knowledge alignment between multimodal data and fine-grained semantics.<n>We introduce a novel multimodal RAG framework that retrieves detailed concept-level knowledge in response to queries from arbitrary modalities.
arXiv Detail & Related papers (2025-06-11T07:22:57Z)
RAG-Enhanced Collaborative LLM Agents for Drug Discovery [28.025359322895905]
CLADD is a retrieval-augmented generation (RAG)-empowered agentic system tailored to drug discovery tasks.<n>We show that it outperforms general-purpose and domain-specific LLMs as well as traditional deep learning approaches.
arXiv Detail & Related papers (2025-02-22T00:12:52Z)
Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent [92.5712549836791]
Multimodal Retrieval Augmented Generation (mRAG) plays an important role in mitigating the "hallucination" issue inherent in multimodal large language models (MLLMs)<n>We propose the first self-adaptive planning agent for multimodal retrieval, OmniSearch.
arXiv Detail & Related papers (2024-11-05T09:27:21Z)
ELOQ: Resources for Enhancing LLM Detection of Out-of-Scope Questions [52.33835101586687]
We study out-of-scope questions, where the retrieved document appears semantically similar to the question but lacks the necessary information to answer it.<n>We propose a guided hallucination-based approach ELOQ to automatically generate a diverse set of out-of-scope questions from post-cutoff documents.
arXiv Detail & Related papers (2024-10-18T16:11:29Z)
RA-BLIP: Multimodal Adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training [55.54020926284334]
Multimodal Large Language Models (MLLMs) have recently received substantial interest, which shows their emerging potential as general-purpose models for various vision-language tasks. Retrieval augmentation techniques have proven to be effective plugins for both LLMs and MLLMs. In this study, we propose multimodal adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training (RA-BLIP), a novel retrieval-augmented framework for various MLLMs.
arXiv Detail & Related papers (2024-10-18T03:45:19Z)
BioRAG: A RAG-LLM Framework for Biological Question Reasoning [14.05505988436551]
We introduce BioRAG, a novel Retrieval-Augmented Generation (RAG) with the Large Language Models (LLMs) framework. Our approach starts with parsing, indexing, and segmenting an extensive collection of 22 million scientific papers as the basic knowledge, followed by training a specialized embedding model tailored to this domain. For queries requiring the most current information, BioRAGs deconstruct the question and employs an iterative retrieval process incorporated with the search engine for step-by-step reasoning.
arXiv Detail & Related papers (2024-08-02T08:37:03Z)
Improving Retrieval-Augmented Generation in Medicine with Iterative Follow-up Questions [42.73799041840482]
i-MedRAG is a system that iteratively asks follow-up queries based on previous information-seeking attempts. Our zero-shot i-MedRAG outperforms all existing prompt engineering and fine-tuning methods on GPT-3.5. i-MedRAG can flexibly ask follow-up queries to form reasoning chains, providing an in-depth analysis of medical questions.
arXiv Detail & Related papers (2024-08-01T17:18:17Z)
Crafting Interpretable Embeddings by Asking LLMs Questions [89.49960984640363]
Large language models (LLMs) have rapidly improved text embeddings for a growing array of natural-language processing tasks. We introduce question-answering embeddings (QA-Emb), embeddings where each feature represents an answer to a yes/no question asked to an LLM. We use QA-Emb to flexibly generate interpretable models for predicting fMRI voxel responses to language stimuli.
arXiv Detail & Related papers (2024-05-26T22:30:29Z)
Chain-of-Discussion: A Multi-Model Framework for Complex Evidence-Based Question Answering [55.295699268654545]
We propose a novel Chain-ofDiscussion framework to leverage the synergy among open-source Large Language Models.<n>Our experiments show that discussions among multiple LLMs play a vital role in enhancing the quality of answers.
arXiv Detail & Related papers (2024-02-26T05:31:34Z)
Context Matters: Pushing the Boundaries of Open-Ended Answer Generation with Graph-Structured Knowledge Context [4.1229332722825]
This paper introduces a novel framework that combines graph-driven context retrieval in conjunction to knowledge graphs based enhancement. We conduct experiments on various Large Language Models (LLMs) with different parameter sizes to evaluate their ability to ground knowledge and determine factual accuracy in answers to open-ended questions. Our methodology GraphContextGen consistently outperforms dominant text-based retrieval systems, demonstrating its robustness and adaptability to a larger number of use cases.
arXiv Detail & Related papers (2024-01-23T11:25:34Z)
DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain Question Answering over Knowledge Base and Text [73.68051228972024]
Large Language Models (LLMs) have exhibited impressive generation capabilities, but they suffer from hallucinations when relying on their internal knowledge. Retrieval-augmented LLMs have emerged as a potential solution to ground LLMs in external knowledge.
arXiv Detail & Related papers (2023-10-31T04:37:57Z)

This list is automatically generated from the titles and abstracts of the papers in this site.