A Generative AI System for Biomedical Data Discovery with Grammar-Based Visualizations
- URL: http://arxiv.org/abs/2509.16454v1
- Date: Fri, 19 Sep 2025 22:20:24 GMT
- Title: A Generative AI System for Biomedical Data Discovery with Grammar-Based Visualizations
- Authors: Devin Lange, Shanghua Gao, Pengwei Sui, Austen Money, Priya Misner, Marinka Zitnik, Nils Gehlenborg
- Abstract summary: We explore the potential for combining generative AI with grammar-based visualizations for biomedical data discovery. In our prototype, we use a multi-agent system to generate visualization specifications and apply filters. These visualizations are linked together, resulting in an interactive dashboard that is progressively constructed.
- Score: 27.577426841656788
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We explore the potential for combining generative AI with grammar-based visualizations for biomedical data discovery. In our prototype, we use a multi-agent system to generate visualization specifications and apply filters. These visualizations are linked together, resulting in an interactive dashboard that is progressively constructed. Our system leverages the strengths of natural language while maintaining the utility of traditional user interfaces. Furthermore, we utilize generated interactive widgets enabling user adjustment. Finally, we demonstrate the potential utility of this system for biomedical data discovery with a case study.
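The agent-to-dashboard flow described in the abstract can be sketched as follows. This is a minimal illustration assuming a Vega-Lite-style grammar; the field names, the `cutoff` parameter, and the `make_spec` helper are hypothetical and not taken from the paper.

```python
# Hypothetical sketch: a declarative, grammar-based visualization spec that an
# agent might emit, with its filter bound to a generated slider widget.
def make_spec(field, threshold):
    """Build a spec whose filter predicate is driven by a widget parameter."""
    return {
        "params": [{
            "name": "cutoff",                  # widget-adjustable parameter
            "value": threshold,
            "bind": {"input": "range", "min": 0, "max": 100},
        }],
        "transform": [{"filter": f"datum.{field} >= cutoff"}],
        "mark": "point",
        "encoding": {
            "x": {"field": field, "type": "quantitative"},
            "y": {"field": "expression", "type": "quantitative"},
        },
    }

spec = make_spec("age", 40)   # one panel of the progressively built dashboard
```

Because the output is structured data rather than rendered pixels, downstream steps such as linking panels and attaching widgets can manipulate it programmatically.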
Related papers
- YAC: Bridging Natural Language and Interactive Visual Exploration with Generative AI for Biomedical Data Discovery [27.577426841656788]
We bridge the gap between natural language and interactive visualizations by generating structured declarative output with a multi-agent system. We include widgets, which allow users to adjust the values of that structured output through user interface elements.
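The widget mechanism, user interface elements writing values back into structured declarative output, might look roughly like this. The path-based update helper is an assumption for illustration, not YAC's actual API.

```python
def apply_widget_update(spec, path, value):
    """Write a widget's new value back into the declarative spec it controls.

    `path` is a sequence of dict keys and list indices locating the value.
    """
    node = spec
    for key in path[:-1]:
        node = node[key]
    node[path[-1]] = value
    return spec

# A slider adjusting a filter threshold inside a generated spec:
spec = {"transform": [{"filter": {"field": "age", "gte": 10}}]}
apply_widget_update(spec, ["transform", 0, "filter", "gte"], 25)
```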
arXiv Detail & Related papers (2025-09-23T15:57:42Z) - A Large-Scale Vision-Language Dataset Derived from Open Scientific Literature to Advance Biomedical Generalist AI [70.06771291117965]
We introduce Biomedica, an open-source dataset derived from the PubMed Central Open Access subset. Biomedica contains over 6 million scientific articles and 24 million image-text pairs. We provide scalable streaming and search APIs through a web server, facilitating seamless integration with AI systems.
arXiv Detail & Related papers (2025-03-26T05:56:46Z) - Multimodal Contrastive Representation Learning in Augmented Biomedical Knowledge Graphs [2.006175707670159]
PrimeKG++ is an enriched knowledge graph incorporating multimodal data. Our approach demonstrates strong generalizability, enabling accurate link predictions even for unseen nodes.
arXiv Detail & Related papers (2025-01-03T05:29:12Z) - EndToEndML: An Open-Source End-to-End Pipeline for Machine Learning Applications [0.2826977330147589]
We propose a web-based end-to-end pipeline that is capable of preprocessing, training, evaluating, and visualizing machine learning models.
Our library assists in recognizing, classifying, clustering, and predicting a wide range of multi-modal, multi-sensor datasets.
arXiv Detail & Related papers (2024-03-27T02:24:38Z) - Hierarchical Text-to-Vision Self Supervised Alignment for Improved Histopathology Representation Learning [64.1316997189396]
We present a novel language-tied self-supervised learning framework, Hierarchical Language-tied Self-Supervision (HLSS) for histopathology images.
Our resulting model achieves state-of-the-art performance on two medical imaging benchmarks, OpenSRH and TCGA datasets.
arXiv Detail & Related papers (2024-03-21T17:58:56Z) - Diversifying Knowledge Enhancement of Biomedical Language Models using Adapter Modules and Knowledge Graphs [54.223394825528665]
We develop an approach that uses lightweight adapter modules to inject structured biomedical knowledge into pre-trained language models.
We use two large KGs, the biomedical knowledge system UMLS and the novel biochemical OntoChem, with two prominent biomedical PLMs, PubMedBERT and BioLinkBERT.
We show that our methodology leads to performance improvements in several instances while keeping requirements in computing power low.
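A bottleneck adapter of the kind this entry describes can be sketched in a few lines: only the small down- and up-projection matrices are trained while the pre-trained model stays frozen. The dimensions and random weights here are illustrative, not from the paper.

```python
import numpy as np

def adapter(h, w_down, w_up):
    """Bottleneck adapter: down-project, ReLU, up-project, residual connection."""
    return h + np.maximum(h @ w_down, 0.0) @ w_up

rng = np.random.default_rng(0)
h = rng.normal(size=(4, 768))                # hidden states from a frozen PLM
w_down = rng.normal(size=(768, 64)) * 0.02   # trainable: 768 -> 64 bottleneck
w_up = rng.normal(size=(64, 768)) * 0.02     # trainable: 64 -> 768
out = adapter(h, w_down, w_up)               # same shape as the input states
```

The residual connection means an adapter initialized near zero barely perturbs the frozen model, which is what keeps the computing requirements low.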
arXiv Detail & Related papers (2023-12-21T14:26:57Z) - LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day [85.19963303642427]
We propose a cost-efficient approach for training a vision-language conversational assistant that can answer open-ended research questions of biomedical images.
The model first learns to align biomedical vocabulary using the figure-caption pairs as is, then learns to master open-ended conversational semantics.
This enables us to train a Large Language and Vision Assistant for BioMedicine in less than 15 hours (with eight A100s).
arXiv Detail & Related papers (2023-06-01T16:50:07Z) - GraphPrompt: Graph-Based Prompt Templates for Biomedical Synonym Prediction [12.604871572399722]
We introduce an expert-curated dataset OBO-syn encompassing 70 different types of concepts and 2 million curated concept-term pairs for evaluating synonym prediction methods.
We propose GraphPrompt, a prompt-based learning approach that creates prompt templates according to the graphs.
We envision that our method GraphPrompt and OBO-syn dataset can be applied broadly to graph-based NLP tasks, and serve as the basis for analyzing diverse and accumulating biomedical data.
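A graph-derived prompt template of the kind GraphPrompt describes might be filled in roughly like this; the template wording and the toy ontology fragment are illustrative assumptions, not details from the paper.

```python
def graph_prompt(graph, concept):
    """Fill a synonym-prediction prompt with a concept's graph neighbors."""
    neighbors = ", ".join(sorted(graph.get(concept, [])))
    return f"{concept} is related to {neighbors}. A synonym of {concept} is [MASK]."

# Toy ontology fragment (adjacency list), not from OBO-syn:
onto = {"T cell": ["lymphocyte", "immune cell"]}
prompt = graph_prompt(onto, "T cell")
```

A masked language model would then score candidate terms at the `[MASK]` position.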
arXiv Detail & Related papers (2021-11-13T06:59:27Z) - GenNI: Human-AI Collaboration for Data-Backed Text Generation [102.08127062293111]
Table2Text systems use machine learning to generate textual output from structured data.
GenNI (Generation Negotiation Interface) is an interactive visual system for high-level human-AI collaboration in producing descriptive text.
arXiv Detail & Related papers (2021-10-19T18:07:07Z) - Relational Graph Learning on Visual and Kinematics Embeddings for Accurate Gesture Recognition in Robotic Surgery [84.73764603474413]
We propose a novel online approach of multi-modal graph network (i.e., MRG-Net) to dynamically integrate visual and kinematics information.
The effectiveness of our method is demonstrated with state-of-the-art results on the public JIGSAWS dataset.
arXiv Detail & Related papers (2020-11-03T11:00:10Z) - Unsupervised Multi-Modal Representation Learning for Affective Computing with Multi-Corpus Wearable Data [16.457778420360537]
We propose an unsupervised framework to reduce the reliance on human supervision.
The proposed framework utilizes two stacked convolutional autoencoders to learn latent representations from wearable electrocardiogram (ECG) and electrodermal activity (EDA) signals.
Our method outperforms current state-of-the-art results that have performed arousal detection on the same datasets.
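The fusion step, one encoder per modality with concatenated latent vectors, can be sketched with random weights just to show the shapes; a real implementation would use trained stacked convolutional encoders rather than this single linear projection.

```python
import numpy as np

def encode(window, w):
    """One encoder's bottleneck: linear projection + tanh (shape sketch only)."""
    return np.tanh(window @ w)

rng = np.random.default_rng(0)
ecg = rng.normal(size=(1, 256))              # one ECG window
eda = rng.normal(size=(1, 256))              # one EDA window, same length
w_ecg = rng.normal(size=(256, 32)) * 0.1     # per-modality encoder weights
w_eda = rng.normal(size=(256, 32)) * 0.1
# Concatenated latent: the fused multi-modal representation a downstream
# arousal classifier would consume.
latent = np.concatenate([encode(ecg, w_ecg), encode(eda, w_eda)], axis=1)
```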
arXiv Detail & Related papers (2020-08-24T22:01:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.