Related papers: A Review on Large Language Models for Visual Analytics

A Review on Large Language Models for Visual Analytics

URL: http://arxiv.org/abs/2503.15176v1
Date: Wed, 19 Mar 2025 13:02:01 GMT
Title: A Review on Large Language Models for Visual Analytics
Authors: Navya Sonal Agarwal, Sanjay Kumar Sonbhadra,
Abstract summary: The paper outlines the theoretical underpinnings of visual analytics and the transformative potential of Large Language Models (LLMs)<n>The review further investigates how the synergy between LLMs and visual analytics enhances data interpretation, visualization techniques, and interactive exploration capabilities.<n>The paper discusses their functionalities, strengths, and limitations in supporting data exploration, visualization enhancement, automated reporting, and insight extraction.
Score: 0.2209921757303168
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This paper provides a comprehensive review of the integration of Large Language Models (LLMs) with visual analytics, addressing their foundational concepts, capabilities, and wide-ranging applications. It begins by outlining the theoretical underpinnings of visual analytics and the transformative potential of LLMs, specifically focusing on their roles in natural language understanding, natural language generation, dialogue systems, and text-to-media transformations. The review further investigates how the synergy between LLMs and visual analytics enhances data interpretation, visualization techniques, and interactive exploration capabilities. Key tools and platforms including LIDA, Chat2VIS, Julius AI, and Zoho Analytics, along with specialized multimodal models such as ChartLlama and CharXIV, are critically evaluated. The paper discusses their functionalities, strengths, and limitations in supporting data exploration, visualization enhancement, automated reporting, and insight extraction. The taxonomy of LLM tasks, ranging from natural language understanding (NLU), natural language generation (NLG), to dialogue systems and text-to-media transformations, is systematically explored. This review provides a SWOT analysis of integrating Large Language Models (LLMs) with visual analytics, highlighting strengths like accessibility and flexibility, weaknesses such as computational demands and biases, opportunities in multimodal integration and user collaboration, and threats including privacy concerns and skill degradation. It emphasizes addressing ethical considerations and methodological improvements for effective integration.

Related papers

InterChat: Enhancing Generative Visual Analytics using Multimodal Interactions [22.007942964950217]
We develop InterChat, a generative visual analytics system that combines direct manipulation of visual elements with natural language inputs.<n>This integration enables precise intent communication and supports progressive, visually driven exploratory data analyses.
arXiv Detail & Related papers (2025-03-06T05:35:19Z)
A Survey on Large Language Models with some Insights on their Capabilities and Limitations [0.3222802562733786]
Large Language Models (LLMs) exhibit remarkable performance across various language-related tasks. LLMs have demonstrated emergent abilities extending beyond their core functions. This paper explores the foundational components, scaling mechanisms, and architectural strategies that drive these capabilities.
arXiv Detail & Related papers (2025-01-03T21:04:49Z)
VilBias: A Study of Bias Detection through Linguistic and Visual Cues , presenting Annotation Strategies, Evaluation, and Key Challenges [2.2751168722976587]
VLBias is a framework that leverages state-of-the-art Large Language Models (LLMs) and Vision-Language Models (VLMs) to detect linguistic and visual biases in news content.<n>We present a multimodal dataset comprising textual content and corresponding images from diverse news sources.
arXiv Detail & Related papers (2024-12-22T15:05:30Z)
A Comprehensive Survey and Guide to Multimodal Large Language Models in Vision-Language Tasks [5.0453036768975075]
Large language models (MLLMs) integrate text, images, video and audio to enable AI systems for cross-modal understanding and generation.<n>Book examines prominent MLLM implementations while addressing key challenges in scalability, robustness, and cross-modal learning.<n>Concluding with a discussion of ethical considerations, responsible AI development, and future directions, this authoritative resource provides both theoretical frameworks and practical insights.
arXiv Detail & Related papers (2024-11-09T20:56:23Z)
A Comprehensive Review of Multimodal Large Language Models: Performance and Challenges Across Different Tasks [74.52259252807191]
Multimodal Large Language Models (MLLMs) address the complexities of real-world applications far beyond the capabilities of single-modality systems. This paper systematically sorts out the applications of MLLM in multimodal tasks such as natural language, vision, and audio.
arXiv Detail & Related papers (2024-08-02T15:14:53Z)
A Survey of Large Language Models for Financial Applications: Progress, Prospects and Challenges [60.546677053091685]
Large language models (LLMs) have unlocked novel opportunities for machine learning applications in the financial domain. We explore the application of LLMs on various financial tasks, focusing on their potential to transform traditional practices and drive innovation. We highlight this survey for categorizing the existing literature into key application areas, including linguistic tasks, sentiment analysis, financial time series, financial reasoning, agent-based modeling, and other applications.
arXiv Detail & Related papers (2024-06-15T16:11:35Z)
RelationVLM: Making Large Vision-Language Models Understand Visual Relations [66.70252936043688]
We present RelationVLM, a large vision-language model capable of comprehending various levels and types of relations whether across multiple images or within a video. Specifically, we devise a multi-stage relation-aware training scheme and a series of corresponding data configuration strategies to bestow RelationVLM with the capabilities of understanding semantic relations.
arXiv Detail & Related papers (2024-03-19T15:01:19Z)
The Revolution of Multimodal Large Language Models: A Survey [46.84953515670248]
Multimodal Large Language Models (MLLMs) can seamlessly integrate visual and textual modalities. This paper provides a review of recent visual-based MLLMs, analyzing their architectural choices, multimodal alignment strategies, and training techniques.
arXiv Detail & Related papers (2024-02-19T19:01:01Z)
Building Trust in Conversational AI: A Comprehensive Review and Solution Architecture for Explainable, Privacy-Aware Systems using LLMs and Knowledge Graph [0.33554367023486936]
We introduce a comprehensive tool that provides an in-depth review of over 150 Large Language Models (LLMs) Building on this foundation, we propose a novel functional architecture that seamlessly integrates the structured dynamics of Knowledge Graphs with the linguistic capabilities of LLMs. Our architecture adeptly blends linguistic sophistication with factual rigour and further strengthens data security through Role-Based Access Control.
arXiv Detail & Related papers (2023-08-13T22:47:51Z)
Foundational Models Defining a New Era in Vision: A Survey and Outlook [151.49434496615427]
Vision systems to see and reason about the compositional nature of visual scenes are fundamental to understanding our world. The models learned to bridge the gap between such modalities coupled with large-scale training data facilitate contextual reasoning, generalization, and prompt capabilities at test time. The output of such models can be modified through human-provided prompts without retraining, e.g., segmenting a particular object by providing a bounding box, having interactive dialogues by asking questions about an image or video scene or manipulating the robot's behavior through language instructions.
arXiv Detail & Related papers (2023-07-25T17:59:18Z)
SINC: Self-Supervised In-Context Learning for Vision-Language Tasks [64.44336003123102]
We propose a framework to enable in-context learning in large language models. A meta-model can learn on self-supervised prompts consisting of tailored demonstrations. Experiments show that SINC outperforms gradient-based methods in various vision-language tasks.
arXiv Detail & Related papers (2023-07-15T08:33:08Z)
Exploring Large Language Model for Graph Data Understanding in Online Job Recommendations [63.19448893196642]
We present a novel framework that harnesses the rich contextual information and semantic representations provided by large language models to analyze behavior graphs. By leveraging this capability, our framework enables personalized and accurate job recommendations for individual users.
arXiv Detail & Related papers (2023-07-10T11:29:41Z)

This list is automatically generated from the titles and abstracts of the papers in this site.