Vital Insight: Assisting Experts' Sensemaking Process of Multi-modal Personal Tracking Data Using Visualization and LLM
- URL: http://arxiv.org/abs/2410.14879v1
- Date: Fri, 18 Oct 2024 21:56:35 GMT
- Title: Vital Insight: Assisting Experts' Sensemaking Process of Multi-modal Personal Tracking Data Using Visualization and LLM
- Authors: Jiachen Li, Justin Steinberg, Xiwen Li, Akshat Choube, Bingsheng Yao, Dakuo Wang, Elizabeth Mynatt, Varun Mishra
- Abstract summary: Vital Insight is an evidence-based 'sensemaking' system that combines direct representation and indirect inference through visualization and Large Language Models.
We evaluate Vital Insight in user testing sessions with 14 experts in multi-modal tracking, synthesize design implications, and develop an expert sensemaking model where they iteratively move between direct data representations and AI-supported inferences to explore, retrieve, question, and validate insights.
- Score: 25.264865296828116
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Researchers have long recognized the socio-technical gaps in personal tracking research, where machines can never fully model the complexity of human behavior and can therefore produce only basic rule-based outputs or "black-box" results that lack clear explanations. Real-world deployments rely on experts for this complex translation from sparse data to meaningful insights. In this study, we consider this translation process from data to insights by experts as "sensemaking" and explore how HCI researchers can support it through Vital Insight, an evidence-based 'sensemaking' system that combines direct representation and indirect inference through visualization and Large Language Models. We evaluate Vital Insight in user testing sessions with 14 experts in multi-modal tracking, synthesize design implications, and develop an expert sensemaking model where they iteratively move between direct data representations and AI-supported inferences to explore, retrieve, question, and validate insights.
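To make the approach concrete, here is a minimal, hypothetical sketch of the "direct representation plus indirect inference" pattern the abstract describes: raw multi-modal signals are plotted for direct inspection, while the same data window is condensed into an LLM prompt that yields tentative interpretations for the expert to question or validate. The column names (heart_rate, step_count), plotting choices, and prompt wording are illustrative assumptions, not Vital Insight's actual implementation.

```python
# Hypothetical sketch only -- not Vital Insight's code. Column names and prompt
# wording are assumptions chosen for illustration.
import pandas as pd
import matplotlib.pyplot as plt

def plot_direct_representation(df: pd.DataFrame, ax=None):
    """Direct representation: plot raw multi-modal signals so the expert can
    inspect the evidence itself."""
    if ax is None:
        ax = plt.gca()
    for column in ["heart_rate", "step_count"]:  # assumed sensor streams
        ax.plot(df["timestamp"], df[column], label=column)
    ax.set_xlabel("time")
    ax.legend()
    return ax

def build_inference_prompt(df: pd.DataFrame, window_label: str) -> str:
    """Indirect inference: condense the same data window into an LLM prompt that
    asks for tentative interpretations the expert can later verify."""
    stats = df[["heart_rate", "step_count"]].describe().to_string()
    return (
        f"Summary statistics of wearable data for {window_label}:\n{stats}\n"
        "Suggest plausible behavioral interpretations, note your uncertainty, "
        "and list which raw signals an expert should check to validate each one."
    )
```

In the sensemaking loop the abstract describes, an expert would move back and forth between the plot and the model's answer rather than accepting either in isolation.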
Related papers
- InterChat: Enhancing Generative Visual Analytics using Multimodal Interactions [22.007942964950217]
We develop InterChat, a generative visual analytics system that combines direct manipulation of visual elements with natural language inputs.
This integration enables precise intent communication and supports progressive, visually driven exploratory data analyses.
arXiv Detail & Related papers (2025-03-06T05:35:19Z) - User-centric evaluation of explainability of AI with and for humans: a comprehensive empirical study [5.775094401949666]
This study is situated in the field of Human-Centered Artificial Intelligence (HCAI).
It focuses on the results of a user-centered assessment of commonly used eXplainable Artificial Intelligence (XAI) algorithms.
arXiv Detail & Related papers (2024-10-21T12:32:39Z) - Data Analysis in the Era of Generative AI [56.44807642944589]
This paper explores the potential of AI-powered tools to reshape data analysis, focusing on design considerations and challenges.
We explore how the emergence of large language and multimodal models offers new opportunities to enhance various stages of data analysis workflow.
We then examine human-centered design principles that facilitate intuitive interactions, build user trust, and streamline the AI-assisted analysis workflow across multiple apps.
arXiv Detail & Related papers (2024-09-27T06:31:03Z) - DISCOVER: A Data-driven Interactive System for Comprehensive Observation, Visualization, and ExploRation of Human Behaviour [6.716560115378451]
We introduce a modular, flexible, yet user-friendly software framework specifically developed to streamline computational-driven data exploration for human behavior analysis.
Our primary objective is to democratize access to advanced computational methodologies, thereby enabling researchers across disciplines to engage in detailed behavioral analysis without the need for extensive technical proficiency.
arXiv Detail & Related papers (2024-07-18T11:28:52Z) - Supporting Experts with a Multimodal Machine-Learning-Based Tool for Human Behavior Analysis of Conversational Videos [40.30407535831779]
We developed Providence, a visual-programming-based tool based on design considerations derived from a formative study with experts.
It enables experts to combine various machine learning algorithms to capture human behavioral cues without writing code.
Our study showed favorable usability and satisfactory output quality, with lower cognitive load when accomplishing scene-search tasks over conversations.
arXiv Detail & Related papers (2024-02-17T00:27:04Z) - Enhancing HOI Detection with Contextual Cues from Large Vision-Language Models [56.257840490146]
ConCue is a novel approach for improving visual feature extraction in HOI detection.
We develop a transformer-based feature extraction module with a multi-tower architecture that integrates contextual cues into both instance and interaction detectors.
arXiv Detail & Related papers (2023-11-26T09:11:32Z) - Human-oriented Representation Learning for Robotic Manipulation [64.59499047836637]
Humans inherently possess generalizable visual representations that empower them to efficiently explore and interact with the environments in manipulation tasks.
We formalize this idea through the lens of human-oriented multi-task fine-tuning on top of pre-trained visual encoders.
Our Task Fusion Decoder consistently improves the representation of three state-of-the-art visual encoders for downstream manipulation policy-learning.
arXiv Detail & Related papers (2023-10-04T17:59:38Z) - Lessons Learned from EXMOS User Studies: A Technical Report Summarizing Key Takeaways from User Studies Conducted to Evaluate The EXMOS Platform [5.132827811038276]
Two user studies aimed to illuminate the influence of different explanation types on three key dimensions: trust, understandability, and model improvement.
Results show that global model-centric explanations alone are insufficient for effectively guiding users during the intricate process of data configuration.
We present essential implications for developing interactive machine-learning systems driven by explanations.
arXiv Detail & Related papers (2023-10-03T14:04:45Z) - Investigating Deep Neural Network Architecture and Feature Extraction Designs for Sensor-based Human Activity Recognition [0.0]
In light of deep learning's proven effectiveness across various domains, numerous deep methods have been explored to tackle the challenges in activity recognition.
We investigate the performance of common deep learning and machine learning approaches as well as different training mechanisms.
We extract various feature representations from the sensor time-series data and measure their effectiveness for the human activity recognition task (a minimal example of such window-level feature extraction appears after this list).
arXiv Detail & Related papers (2023-09-26T14:55:32Z) - Towards A Unified Agent with Foundation Models [18.558328028366816]
We investigate how to embed and leverage such abilities in Reinforcement Learning (RL) agents.
We design a framework that uses language as the core reasoning tool, exploring how this enables an agent to tackle a series of fundamental RL challenges.
We demonstrate substantial performance improvements over baselines in exploration efficiency and ability to reuse data from offline datasets.
arXiv Detail & Related papers (2023-07-18T22:37:30Z) - LAMM: Language-Assisted Multi-Modal Instruction-Tuning Dataset, Framework, and Benchmark [81.42376626294812]
We present the Language-Assisted Multi-Modal (LAMM) instruction-tuning dataset, framework, and benchmark.
Our aim is to establish LAMM as a growing ecosystem for training and evaluating MLLMs.
We present a comprehensive dataset and benchmark, which cover a wide range of vision tasks for 2D and 3D vision.
arXiv Detail & Related papers (2023-06-11T14:01:17Z) - Accelerating exploration and representation learning with offline pre-training [52.6912479800592]
We show that exploration and representation learning can be improved by separately learning two different models from a single offline dataset.
We show that learning a state representation using noise-contrastive estimation and a model of auxiliary reward can significantly improve the sample efficiency on the challenging NetHack benchmark.
arXiv Detail & Related papers (2023-03-31T18:03:30Z) - Machine Psychology [54.287802134327485]
We argue that a fruitful direction for research is engaging large language models in behavioral experiments inspired by psychology.
We highlight theoretical perspectives, experimental paradigms, and computational analysis techniques that this approach brings to the table.
It paves the way for a "machine psychology" for generative artificial intelligence (AI) that goes beyond performance benchmarks.
arXiv Detail & Related papers (2023-03-24T13:24:41Z) - The State of the Art in Enhancing Trust in Machine Learning Models with the Use of Visualizations [0.0]
Machine learning (ML) models are nowadays used in complex applications in various domains, such as medicine, bioinformatics, and other sciences.
Due to their black box nature, however, it may sometimes be hard to understand and trust the results they provide.
This has increased the demand for reliable visualization tools related to enhancing trust in ML models.
We present a State-of-the-Art Report (STAR) on enhancing trust in ML models with the use of interactive visualization.
arXiv Detail & Related papers (2022-12-22T14:29:43Z) - Visual Auditor: Interactive Visualization for Detection and Summarization of Model Biases [18.434430375939755]
As machine learning (ML) systems become increasingly widespread, it is necessary to audit these systems for biases prior to their deployment.
Recent research has developed algorithms for effectively identifying intersectional bias in the form of interpretable, underperforming subsets (or slices) of the data.
We propose Visual Auditor, an interactive visualization tool for auditing and summarizing model biases.
arXiv Detail & Related papers (2022-06-25T02:48:27Z) - DIME: Fine-grained Interpretations of Multimodal Models via Disentangled Local Explanations [119.1953397679783]
We focus on advancing the state-of-the-art in interpreting multimodal models.
Our proposed approach, DIME, enables accurate and fine-grained analysis of multimodal models.
arXiv Detail & Related papers (2022-03-03T20:52:47Z) - Multilingual Multi-Aspect Explainability Analyses on Machine Reading Comprehension Models [76.48370548802464]
This paper focuses on conducting a series of analytical experiments to examine the relations between the multi-head self-attention and the final MRC system performance.
We discover that passage-to-question and passage understanding attentions are the most important ones in the question answering process.
Through comprehensive visualizations and case studies, we also observe several general findings on the attention maps, which can be helpful to understand how these models solve the questions.
arXiv Detail & Related papers (2021-08-26T04:23:57Z) - Deep Learning for Sensor-based Human Activity Recognition: Overview, Challenges and Opportunities [52.59080024266596]
We present a survey of the state-of-the-art deep learning methods for sensor-based human activity recognition.
We first introduce the multi-modality of the sensory data and provide information for public datasets.
We then propose a new taxonomy to structure the deep methods by challenges.
arXiv Detail & Related papers (2020-01-21T09:55:59Z)
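The two sensor-based human activity recognition entries above (the feature-extraction study and the deep learning survey) both build on feature representations computed from raw sensor time series. As a rough illustration only, the sketch below shows one common baseline: sliding windows over an accelerometer stream reduced to simple per-axis statistics. The window length, step size, axis count, and feature choices are assumptions for illustration, not details taken from either paper.

```python
# Hypothetical baseline for sensor-based HAR feature extraction; parameters are
# illustrative assumptions, not values from the papers above.
import numpy as np

def window_features(signal: np.ndarray, window: int = 128, step: int = 64) -> np.ndarray:
    """signal: (n_samples, n_axes) sensor data; returns (n_windows, n_axes * 4)."""
    features = []
    for start in range(0, len(signal) - window + 1, step):
        chunk = signal[start:start + window]
        # Per-axis mean, standard deviation, min, and max: typical hand-crafted features.
        features.append(np.concatenate([chunk.mean(0), chunk.std(0),
                                        chunk.min(0), chunk.max(0)]))
    return np.stack(features)

# Example: 10 seconds of 3-axis accelerometer data sampled at 50 Hz.
accel = np.random.randn(500, 3)
X = window_features(accel)  # feed X to any classifier that predicts activity labels
```

Deep models such as those surveyed above typically learn such representations end to end instead of relying on hand-crafted statistics.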
This list is automatically generated from the titles and abstracts of the papers on this site.