Supporting Experts with a Multimodal Machine-Learning-Based Tool for
Human Behavior Analysis of Conversational Videos
- URL: http://arxiv.org/abs/2402.11145v1
- Date: Sat, 17 Feb 2024 00:27:04 GMT
- Authors: Riku Arakawa and Kiyosu Maeda and Hiromu Yakura
- Score: 40.30407535831779
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multimodal scene search of conversations is essential for unlocking valuable
insights into social dynamics and enhancing our communication. While experts in
conversational analysis have the knowledge and skills to find key scenes, the
lack of comprehensive, user-friendly tools that streamline the processing of
diverse multimodal queries impedes efficiency and objectivity. To address this,
we developed Providence, a visual-programming-based tool built on design
considerations derived from a formative study with experts. It enables experts
to combine various machine learning algorithms to capture human behavioral cues
without writing code. Our study showed favorable usability and satisfactory
output, with reduced cognitive load in accomplishing conversational scene-search
tasks, verifying the importance of the tool's customizability and transparency.
Furthermore, an in-the-wild trial confirmed that the tool's objectivity and
reusability transform experts' workflows, suggesting the advantage of expert-AI
teaming in a highly human-contextual domain.
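The abstract describes combining machine learning detectors into multimodal scene-search queries without code. Providence's implementation is not described here, but the general idea of wiring cue detectors into a query over per-second features can be sketched in a few lines (all function and feature names below are hypothetical, for illustration only):

```python
from typing import Callable, Dict, List

# Each "node" mimics one block in a visual-programming pipeline: it maps a
# per-second feature dict (produced by upstream ML models) to a boolean cue.
Cue = Callable[[Dict[str, float]], bool]

def smile(frame: Dict[str, float]) -> bool:
    return frame.get("smile_intensity", 0.0) > 0.5

def speaking(frame: Dict[str, float]) -> bool:
    return frame.get("voice_activity", 0.0) > 0.5

def all_of(*cues: Cue) -> Cue:
    # Combine cues the way wired nodes would be combined on a canvas.
    return lambda frame: all(c(frame) for c in cues)

def search_scenes(frames: List[Dict[str, float]], query: Cue) -> List[int]:
    """Return the indices (seconds) where the combined query fires."""
    return [i for i, f in enumerate(frames) if query(f)]

frames = [
    {"smile_intensity": 0.9, "voice_activity": 0.8},  # t=0: smiling + speaking
    {"smile_intensity": 0.2, "voice_activity": 0.9},  # t=1: speaking only
    {"smile_intensity": 0.7, "voice_activity": 0.1},  # t=2: smiling only
]
hits = search_scenes(frames, all_of(smile, speaking))
print(hits)  # [0]
```

The point of the node abstraction is that new cues (gaze, posture, sentiment) compose with existing queries without changing the search logic.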
Related papers
- PromptHive: Bringing Subject Matter Experts Back to the Forefront with Collaborative Prompt Engineering for Educational Content Creation [8.313693615194309]
In this work, we introduce PromptHive, a collaborative interface for prompt authoring, designed to better connect domain knowledge with prompt engineering.
We conducted an evaluation study with ten subject matter experts in math and validated our design through two collaborative prompt-writing sessions and a learning gain study with 358 learners.
Our results elucidate the prompt iteration process and validate the tool's usability, enabling non-AI experts to craft prompts that generate content comparable to human-authored materials.
arXiv Detail & Related papers (2024-10-21T22:18:24Z)
- Data Analysis in the Era of Generative AI [56.44807642944589]
This paper explores the potential of AI-powered tools to reshape data analysis, focusing on design considerations and challenges.
We explore how the emergence of large language and multimodal models offers new opportunities to enhance various stages of data analysis workflow.
We then examine human-centered design principles that facilitate intuitive interactions, build user trust, and streamline the AI-assisted analysis workflow across multiple apps.
arXiv Detail & Related papers (2024-09-27T06:31:03Z)
- Enhancing HOI Detection with Contextual Cues from Large Vision-Language Models [56.257840490146]
ConCue is a novel approach for improving visual feature extraction in HOI detection.
We develop a transformer-based feature extraction module with a multi-tower architecture that integrates contextual cues into both instance and interaction detectors.
arXiv Detail & Related papers (2023-11-26T09:11:32Z)
- Re-mine, Learn and Reason: Exploring the Cross-modal Semantic Correlations for Language-guided HOI detection [57.13665112065285]
Human-Object Interaction (HOI) detection is a challenging computer vision task.
We present a framework that enhances HOI detection by incorporating structured text knowledge.
arXiv Detail & Related papers (2023-07-25T14:20:52Z)
- An Interdisciplinary Perspective on Evaluation and Experimental Design for Visual Text Analytics: Position Paper [24.586485898038312]
In this paper, we focus on the issues of evaluating visual text analytics approaches.
We identify four key groups of challenges for evaluating visual text analytics approaches.
arXiv Detail & Related papers (2022-09-23T11:47:37Z)
- Co-Located Human-Human Interaction Analysis using Nonverbal Cues: A Survey [71.43956423427397]
We aim to identify the nonverbal cues and computational methodologies resulting in effective performance.
This survey differs from its counterparts by involving the widest spectrum of social phenomena and interaction settings.
Some major observations: the most frequently used nonverbal cue is speaking activity, the most common computational method is the support vector machine, the typical interaction setting is a meeting of three to four persons, and the prevailing sensing approach is microphones and cameras.
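Speaking activity, the cue this survey finds most common, is typically turned into per-person features (speaking time, turn counts) before being fed to a classifier such as an SVM. A minimal, self-contained sketch of that feature-extraction step (toy data, hypothetical feature names):

```python
from itertools import groupby

def speaking_features(vad: list) -> dict:
    """Per-person speaking-activity features from a binary voice-activity stream
    (one 0/1 value per second)."""
    turns = sum(1 for is_on, _ in groupby(vad) if is_on)  # contiguous speech runs
    return {
        "speaking_fraction": sum(vad) / len(vad),
        "turn_count": turns,
    }

# Toy 10-second meeting with two participants (1 = speaking in that second).
alice = [1, 1, 1, 0, 0, 1, 1, 0, 0, 0]
bob   = [0, 0, 0, 1, 1, 0, 0, 1, 1, 1]

feats = {name: speaking_features(v) for name, v in [("alice", alice), ("bob", bob)]}
print(feats["alice"])  # {'speaking_fraction': 0.5, 'turn_count': 2}
```

Vectors like these, stacked per participant, are the kind of input the surveyed SVM-based methods classify for phenomena such as dominance or rapport.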
arXiv Detail & Related papers (2022-07-20T13:37:57Z)
- Estimating Presentation Competence using Multimodal Nonverbal Behavioral Cues [7.340483819263093]
Public speaking and presentation competence play an essential role in many areas of social interaction.
One approach that can promote efficient development of presentation competence is the automated analysis of human behavior during a speech.
In this work, we investigate the contribution of different nonverbal behavioral cues, namely, facial, body pose-based, and audio-related features, to estimate presentation competence.
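Estimating a single competence score from facial, pose-based, and audio features is commonly done by fusing per-modality estimates. The paper's exact model is not reproduced here; the sketch below shows one standard option, weighted late fusion, with made-up scores and weights:

```python
def fuse_scores(modality_scores: dict, weights: dict) -> float:
    """Late fusion: weighted average of per-modality competence estimates."""
    total_w = sum(weights[m] for m in modality_scores)
    return sum(modality_scores[m] * weights[m] for m in modality_scores) / total_w

# Hypothetical per-modality estimates in [0, 1], with audio weighted higher.
scores = {"facial": 0.8, "pose": 0.6, "audio": 0.7}
weights = {"facial": 1.0, "pose": 1.0, "audio": 2.0}
print(round(fuse_scores(scores, weights), 2))  # 0.7
```

Late fusion keeps each modality's model independent, which makes it easy to measure how much each cue contributes, the kind of ablation the abstract describes.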
arXiv Detail & Related papers (2021-05-06T13:09:41Z)
- On Interactive Machine Learning and the Potential of Cognitive Feedback [2.320417845168326]
We introduce interactive machine learning and explain its advantages and limitations within the context of defense applications.
We define three techniques by which cognitive feedback may be employed: self-reporting, implicit cognitive feedback, and modeled cognitive feedback.
arXiv Detail & Related papers (2020-03-23T16:28:14Z)
- A Review on Intelligent Object Perception Methods Combining Knowledge-based Reasoning and Machine Learning [60.335974351919816]
Object perception is a fundamental sub-field of Computer Vision.
Recent works integrate knowledge engineering to make the visual interpretation of objects more intelligent.
arXiv Detail & Related papers (2019-12-26T13:26:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.