Characterizing Datasets for Social Visual Question Answering, and the
New TinySocial Dataset
- URL: http://arxiv.org/abs/2010.11997v1
- Date: Thu, 8 Oct 2020 03:20:23 GMT
- Title: Characterizing Datasets for Social Visual Question Answering, and the
New TinySocial Dataset
- Authors: Zhanwen Chen, Shiyao Li, Roxanne Rashedi, Xiaoman Zi, Morgan
Elrod-Erickson, Bryan Hollis, Angela Maliakal, Xinyu Shen, Simeng Zhao,
Maithilee Kunda
- Abstract summary: Social intelligence includes the ability to watch videos and answer questions about social and theory-of-mind-related content.
Social visual question answering (social VQA) is emerging as a valuable methodology for studying social reasoning in both humans and AI agents.
We discuss methods for creating and characterizing social VQA datasets.
- Score: 0.7313653675718068
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern social intelligence includes the ability to watch videos and answer
questions about social and theory-of-mind-related content, e.g., for a scene in
Harry Potter, "Is the father really upset about the boys flying the car?"
Social visual question answering (social VQA) is emerging as a valuable
methodology for studying social reasoning in both humans (e.g., children with
autism) and AI agents. However, this problem space spans enormous variations in
both videos and questions. We discuss methods for creating and characterizing
social VQA datasets, including 1) crowdsourcing versus in-house authoring,
including sample comparisons of two new datasets that we created
(TinySocial-Crowd and TinySocial-InHouse) and the previously existing Social-IQ
dataset; 2) a new rubric for characterizing the difficulty and content of a
given video; and 3) a new rubric for characterizing question types. We close by
describing how having well-characterized social VQA datasets will enhance the
explainability of AI agents and can also inform assessments and educational
interventions for people.
Related papers
- SS-GEN: A Social Story Generation Framework with Large Language Models [87.11067593512716]
Children with Autism Spectrum Disorder (ASD) often misunderstand social situations and struggle to participate in daily routines.
Social Stories are traditionally crafted by psychology experts under strict constraints to address these challenges.
We propose SS-GEN, a framework to generate Social Stories in real-time with broad coverage.
arXiv Detail & Related papers (2024-06-22T00:14:48Z)
- From a Social Cognitive Perspective: Context-aware Visual Social Relationship Recognition [59.57095498284501]
We propose a novel approach that recognizes Contextual Social Relationships (ConSoR) from a social cognitive perspective.
We construct social-aware descriptive language prompts with social relationships for each image.
Impressively, ConSoR outperforms previous methods with a 12.2% gain on the People-in-Social-Context (PISC) dataset and a 9.8% increase on the People-in-Photo-Album (PIPA) benchmark.
arXiv Detail & Related papers (2024-06-12T16:02:28Z)
- Social Intelligence Data Infrastructure: Structuring the Present and Navigating the Future [59.78608958395464]
We build a Social AI Data Infrastructure, which consists of a comprehensive social AI taxonomy and a data library of 480 NLP datasets.
Our infrastructure allows us to analyze existing dataset efforts, and also evaluate language models' performance in different social intelligence aspects.
We show there is a need for multifaceted datasets, increased diversity in language and culture, more long-tailed social situations, and more interactive data in future social intelligence data efforts.
arXiv Detail & Related papers (2024-02-28T00:22:42Z)
- DeSIQ: Towards an Unbiased, Challenging Benchmark for Social Intelligence Understanding [60.84356161106069]
We study the soundness of Social-IQ, a dataset of multiple-choice questions on videos of complex social interactions.
Our analysis reveals that Social-IQ contains substantial biases, which can be exploited by a moderately strong language model.
We introduce DeSIQ, a new challenging dataset, constructed by applying simple perturbations to Social-IQ.
arXiv Detail & Related papers (2023-10-24T06:21:34Z)
- Video Question Answering: Datasets, Algorithms and Challenges [99.9179674610955]
Video Question Answering (VideoQA) aims to answer natural language questions according to the given videos.
This paper provides a clear taxonomy and comprehensive analysis of VideoQA, focusing on the datasets, algorithms, and unique challenges.
arXiv Detail & Related papers (2022-03-02T16:34:09Z)
- Semantic Categorization of Social Knowledge for Commonsense Question Answering [13.343786884695323]
We propose to categorize the semantics needed for commonsense question answering tasks using the SocialIQA as an example.
Unlike previous work, we observe our models with semantic categorizations of social knowledge can achieve comparable performance with a relatively simple model.
arXiv Detail & Related papers (2021-09-11T02:56:14Z)
- Detecting socially interacting groups using f-formation: A survey of taxonomy, methods, datasets, applications, challenges, and future research directions [3.995408039775796]
Social behavior is one of the most sought-after qualities that a robot can possess.
To possess such a quality, a robot needs to determine the formation of the group and then determine a position for itself.
We put forward a novel holistic survey framework combining all the possible concerns and modules relevant to this problem.
arXiv Detail & Related papers (2021-08-13T11:51:17Z)
- SocialAI: Benchmarking Socio-Cognitive Abilities in Deep Reinforcement Learning Agents [23.719833581321033]
Building embodied autonomous agents capable of participating in social interactions with humans is one of the main challenges in AI.
We argue that aiming towards human-level AI requires a broader set of key social skills.
We present SocialAI, a benchmark to assess the acquisition of social skills of DRL agents.
arXiv Detail & Related papers (2021-07-02T10:39:18Z)
- SocialAI 0.1: Towards a Benchmark to Stimulate Research on Socio-Cognitive Abilities in Deep Reinforcement Learning Agents [23.719833581321033]
Building embodied autonomous agents capable of participating in social interactions with humans is one of the main challenges in AI.
Current approaches focus on language as a communication tool in very simplified and non-diverse social situations.
We argue that aiming towards human-level AI requires a broader set of key social skills.
arXiv Detail & Related papers (2021-04-27T14:16:29Z)
- PHASE: PHysically-grounded Abstract Social Events for Machine Social Perception [50.551003004553806]
We create a dataset of physically-grounded abstract social events, PHASE, that resemble a wide range of real-life social interactions.
PHASE is validated with human experiments demonstrating that humans perceive rich interactions in the social events.
As a baseline model, we introduce a Bayesian inverse planning approach, SIMPLE, which outperforms state-of-the-art feed-forward neural networks.
arXiv Detail & Related papers (2021-03-02T18:44:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences arising from its use.