Data Therapist: Eliciting Domain Knowledge from Subject Matter Experts Using Large Language Models
- URL: http://arxiv.org/abs/2505.00455v1
- Date: Thu, 01 May 2025 11:10:17 GMT
- Title: Data Therapist: Eliciting Domain Knowledge from Subject Matter Experts Using Large Language Models
- Authors: Sungbok Shin, Hyeon Jeon, Sanghyun Hong, Niklas Elmqvist,
- Abstract summary: We present the Data Therapist, a web-based tool that helps domain experts externalize implicit knowledge through a mixed-initiative process. The resulting structured knowledge base can inform both human and automated visualization design.
- Score: 17.006423792670414
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Effective data visualization requires not only technical proficiency but also a deep understanding of the domain-specific context in which data exists. This context often includes tacit knowledge about data provenance, quality, and intended use, which is rarely explicit in the dataset itself. We present the Data Therapist, a web-based tool that helps domain experts externalize this implicit knowledge through a mixed-initiative process combining iterative Q&A with interactive annotation. Powered by a large language model, the system analyzes user-supplied datasets, prompts users with targeted questions, and allows annotation at varying levels of granularity. The resulting structured knowledge base can inform both human and automated visualization design. We evaluated the tool in a qualitative study involving expert pairs from Molecular Biology, Accounting, Political Science, and Usable Security. The study revealed recurring patterns in how experts reason about their data and highlights areas where AI support can improve visualization design.
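The abstract describes the system's behavior only at a high level. As a rough, non-authoritative sketch of such a mixed-initiative elicitation loop (an LLM inspects the dataset, asks targeted questions, and the expert's answers are stored as structured knowledge), the snippet below can serve as a starting point; the `ask_llm` helper, prompt wording, and data structures are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a mixed-initiative elicitation loop in the spirit of the
# Data Therapist abstract. `ask_llm` is a placeholder for any LLM call; the
# prompt wording and data structures are illustrative assumptions, not the
# authors' implementation.
import csv
import json


def ask_llm(prompt: str) -> str:
    """Placeholder for a call to a large language model."""
    raise NotImplementedError("Wire this to your LLM provider of choice.")


def summarize_columns(path: str, sample_rows: int = 5) -> dict:
    """Read a CSV and collect column names plus a few example values."""
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        rows = [row for _, row in zip(range(sample_rows), reader)]
    return {col: [r[col] for r in rows] for col in (rows[0] if rows else {})}


def elicit_knowledge(path: str, n_questions: int = 3) -> list[dict]:
    """One round of the Q&A loop: the LLM asks targeted questions about
    provenance, quality, and intended use; the expert answers; each pair
    becomes a structured knowledge-base entry."""
    summary = summarize_columns(path)
    prompt = (
        "You are helping a domain expert document tacit knowledge about a "
        f"dataset with these columns and sample values:\n{json.dumps(summary)}\n"
        f"Ask {n_questions} short questions about data provenance, quality, "
        "and intended use, one per line."
    )
    questions = [q for q in ask_llm(prompt).splitlines() if q.strip()]
    knowledge_base = []
    for q in questions[:n_questions]:
        answer = input(f"{q}\n> ")  # the expert answers in the console
        knowledge_base.append({"question": q, "answer": answer, "scope": "dataset"})
    return knowledge_base
```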
Related papers
- Can AI Extract Antecedent Factors of Human Trust in AI? An Application of Information Extraction for Scientific Literature in Behavioural and Computer Sciences [9.563656421424728]
Trust in AI is the research area that studies the factors contributing to human trust in AI applications.
With the input of domain experts, we create the first annotated English dataset in this domain.
We benchmark it with state-of-the-art methods using large language models in named entity and relation extraction.
Our results indicate that this problem requires supervised learning which may not be currently feasible with prompt-based LLMs.
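To make the comparison concrete, here is a minimal sketch of the kind of prompt-based (zero-shot) LLM extraction that the benchmark finds weaker than supervised learning; the label set and the `ask_llm` helper are assumptions, not the paper's setup.

```python
# Sketch of prompt-based entity/relation extraction, the kind of zero-shot LLM
# baseline the abstract reports as weaker than supervised learning. The label
# set and helper are illustrative assumptions.
import json


def ask_llm(prompt: str) -> str:
    """Placeholder for a call to a large language model."""
    raise NotImplementedError


def extract_trust_factors(sentence: str) -> dict:
    prompt = (
        "Extract entities (types: TRUST_FACTOR, AI_APPLICATION) and relations "
        "(type: INFLUENCES) from the sentence below. Respond as JSON with keys "
        f"'entities' and 'relations'.\nSentence: {sentence}"
    )
    return json.loads(ask_llm(prompt))
```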
arXiv Detail & Related papers (2024-12-16T00:02:38Z) - DSAI: Unbiased and Interpretable Latent Feature Extraction for Data-Centric AI [24.349800949355465]
Large language models (LLMs) often struggle to objectively identify latent characteristics in large datasets. We propose Data Scientist AI (DSAI), a framework that enables unbiased and interpretable feature extraction.
arXiv Detail & Related papers (2024-12-09T08:47:05Z) - User-centric evaluation of explainability of AI with and for humans: a comprehensive empirical study [5.775094401949666]
This study is situated in the field of Human-Centered Artificial Intelligence (HCAI).
It focuses on the results of a user-centered assessment of commonly used eXplainable Artificial Intelligence (XAI) algorithms.
arXiv Detail & Related papers (2024-10-21T12:32:39Z) - Knowledge Graphs and Pre-trained Language Models enhanced Representation Learning for Conversational Recommender Systems [58.561904356651276]
We introduce the Knowledge-Enhanced Entity Representation Learning (KERL) framework to improve the semantic understanding of entities for conversational recommender systems.
KERL uses a knowledge graph and a pre-trained language model to improve the semantic understanding of entities.
KERL achieves state-of-the-art results in both recommendation and response generation tasks.
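The summary does not describe KERL's architecture; the sketch below only illustrates the general idea of fusing a knowledge-graph entity embedding with a pre-trained language-model text embedding, with all dimensions and the fusion scheme chosen as assumptions rather than the KERL design.

```python
# Sketch of fusing knowledge-graph and language-model entity representations.
# Dimensions and the fusion scheme are illustrative assumptions.
import torch
import torch.nn as nn


class EntityEncoder(nn.Module):
    def __init__(self, num_entities: int, kg_dim: int = 128, text_dim: int = 768):
        super().__init__()
        self.kg_embedding = nn.Embedding(num_entities, kg_dim)  # from any KG embedding method
        self.project = nn.Linear(kg_dim + text_dim, text_dim)   # fuse the two views

    def forward(self, entity_ids: torch.Tensor, text_vecs: torch.Tensor) -> torch.Tensor:
        """entity_ids: (batch,); text_vecs: (batch, text_dim) from a pre-trained LM."""
        fused = torch.cat([self.kg_embedding(entity_ids), text_vecs], dim=-1)
        return torch.relu(self.project(fused))
```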
arXiv Detail & Related papers (2023-12-18T06:41:23Z) - DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain Question Answering over Knowledge Base and Text [73.68051228972024]
Large Language Models (LLMs) have exhibited impressive generation capabilities, but they suffer from hallucinations when relying on their internal knowledge.
Retrieval-augmented LLMs have emerged as a potential solution to ground LLMs in external knowledge.
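Retrieval augmentation itself follows a standard recipe: embed the documents, retrieve the nearest neighbors to the question, and prepend them to the prompt. A minimal sketch is below; the placeholder `embed` and `ask_llm` helpers are assumptions, not the DIVKNOWQA pipeline.

```python
# Minimal retrieval-augmented generation sketch: ground the LLM in retrieved
# text instead of relying only on its internal knowledge.
import numpy as np


def embed(text: str) -> np.ndarray:
    """Placeholder for any sentence-embedding model."""
    raise NotImplementedError


def ask_llm(prompt: str) -> str:
    """Placeholder for a call to a large language model."""
    raise NotImplementedError


def answer_with_retrieval(question: str, documents: list[str], k: int = 3) -> str:
    doc_vecs = np.stack([embed(d) for d in documents])
    q_vec = embed(question)
    # cosine similarity between the question and each document
    sims = doc_vecs @ q_vec / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec))
    context = "\n".join(documents[i] for i in np.argsort(sims)[::-1][:k])
    return ask_llm(f"Answer using only this context:\n{context}\n\nQuestion: {question}")
```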
arXiv Detail & Related papers (2023-10-31T04:37:57Z) - Understanding metric-related pitfalls in image analysis validation [59.15220116166561]
This work provides the first comprehensive common point of access to information on pitfalls related to validation metrics in image analysis.
Focusing on biomedical image analysis but with the potential of transfer to other fields, the addressed pitfalls generalize across application domains and are categorized according to a newly created, domain-agnostic taxonomy.
arXiv Detail & Related papers (2023-02-03T14:57:40Z) - Knowledge Graph Augmented Network Towards Multiview Representation Learning for Aspect-based Sentiment Analysis [96.53859361560505]
We propose a knowledge graph augmented network (KGAN) to incorporate external knowledge with explicitly syntactic and contextual information.
KGAN captures the sentiment feature representations from multiple perspectives, i.e., context-, syntax- and knowledge-based.
Experiments on three popular ABSA benchmarks demonstrate the effectiveness and robustness of our KGAN.
arXiv Detail & Related papers (2022-01-13T08:25:53Z) - Cognitive Computing to Optimize IT Services [0.0]
A cognitive solution goes beyond traditional structured data analysis by deeply analyzing both structured and unstructured text.
In experiments, up to 18-25% of the yearly ticket volume was reduced using the proposed approach.
arXiv Detail & Related papers (2021-12-28T09:56:44Z) - Knowledge Graph Anchored Information-Extraction for Domain-Specific Insights [1.6308268213252761]
We use a task-based approach for fulfilling specific information needs within a new domain.
A pipeline built from state-of-the-art NLP technologies automatically extracts an instance-level semantic structure.
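The entry does not name the tools involved; as one plausible instantiation of such a pipeline, the sketch below uses spaCy (an assumed choice, not the paper's stated stack) to pull named entities and simple subject-verb-object triples as an instance-level structure.

```python
# One plausible instantiation of an NLP pipeline that extracts an instance-level
# semantic structure (entities plus simple subject-verb-object triples).
# The choice of spaCy and the triple heuristic are assumptions.
import spacy

nlp = spacy.load("en_core_web_sm")  # requires: python -m spacy download en_core_web_sm


def extract_structure(text: str) -> dict:
    doc = nlp(text)
    entities = [(ent.text, ent.label_) for ent in doc.ents]
    triples = []
    for token in doc:
        if token.pos_ == "VERB":
            subjects = [c.text for c in token.children if c.dep_ in ("nsubj", "nsubjpass")]
            objects = [c.text for c in token.children if c.dep_ in ("dobj", "attr")]
            triples.extend((s, token.lemma_, o) for s in subjects for o in objects)
    return {"entities": entities, "triples": triples}
```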
arXiv Detail & Related papers (2021-04-18T19:28:10Z) - Reasoning over Vision and Language: Exploring the Benefits of Supplemental Knowledge [59.87823082513752]
This paper investigates the injection of knowledge from general-purpose knowledge bases (KBs) into vision-and-language transformers.
We empirically study the relevance of various KBs to multiple tasks and benchmarks.
The technique is model-agnostic and can expand the applicability of any vision-and-language transformer with minimal computational overhead.
arXiv Detail & Related papers (2021-01-15T08:37:55Z) - Exploiting Structured Knowledge in Text via Graph-Guided Representation Learning [73.0598186896953]
We present two self-supervised tasks learning over raw text with the guidance from knowledge graphs.
Building upon entity-level masked language models, our first contribution is an entity masking scheme.
In contrast to existing paradigms, our approach uses knowledge graphs implicitly, only during pre-training.
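As a rough illustration of an entity masking scheme, the sketch below masks whole entity mentions known from a knowledge graph rather than random tokens; string-level masking is a simplification of what a tokenizer-based pre-training pipeline would actually do, and it does not reflect the paper's exact method.

```python
# Sketch of an entity masking scheme: whole entity mentions (known from a
# knowledge graph) are masked rather than random tokens, so pre-training must
# recover entity-level knowledge. String-level masking is a simplification.
import random
import re


def mask_entities(text: str, kg_entities: set[str], mask_token: str = "[MASK]",
                  mask_prob: float = 0.5) -> str:
    for entity in sorted(kg_entities, key=len, reverse=True):
        if entity in text and random.random() < mask_prob:
            # replace the whole entity span with a single mask token
            text = re.sub(re.escape(entity), mask_token, text)
    return text


print(mask_entities("Marie Curie studied radioactivity in Paris.",
                    {"Marie Curie", "Paris"}))
```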
arXiv Detail & Related papers (2020-04-29T14:22:42Z)