Related papers: Vyaktitv: A Multimodal Peer-to-Peer Hindi Conversations based Dataset for Personality Assessment

URL: http://arxiv.org/abs/2008.13769v1
Date: Mon, 31 Aug 2020 17:44:28 GMT
Title: Vyaktitv: A Multimodal Peer-to-Peer Hindi Conversations based Dataset for Personality Assessment
Authors: Shahid Nawaz Khan, Maitree Leekha, Jainendra Shukla, Rajiv Ratn Shah
Abstract summary: We present a novel peer-to-peer Hindi conversation dataset, Vyaktitv. It consists of high-quality audio and video recordings of the participants, with Hinglish textual transcriptions for each conversation. The dataset also contains a rich set of socio-demographic features, such as income and cultural orientation, among several others, for all the participants.
Score: 50.15466026089435
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Automatically detecting personality traits can aid several applications, such as mental health recognition and human resource management. Most datasets introduced for personality detection so far have analyzed these traits for each individual in isolation. However, personality is intimately linked to our social behavior. Furthermore, surprisingly little research has focused on personality analysis using low-resource languages. To this end, we present a novel peer-to-peer Hindi conversation dataset, Vyaktitv. It consists of high-quality audio and video recordings of the participants, with Hinglish textual transcriptions for each conversation. The dataset also contains a rich set of socio-demographic features, such as income and cultural orientation, among several others, for all the participants. We release the dataset for public use and perform a preliminary statistical analysis along the different dimensions. Finally, we discuss various other applications and tasks for which the dataset can be employed.

Related papers

A Computational Framework for Interpretable Text-Based Personality Assessment from Social Media [0.0]
This thesis presents two datasets, MBTI9k and PANDORA, collected from Reddit. The PANDORA dataset contains 17 million comments from over 10,000 users. In response, the SIMPA framework was developed, a computational framework for interpretable personality assessment.
arXiv Detail & Related papers (2025-10-03T08:36:36Z)
A Chinese Multi-label Affective Computing Dataset Based on Social Media Network Users [2.0209172586699173]
This study collected data from the major social media platform Weibo, screening 11,338 valid users from over 50,000 individuals with diverse MBTI personality labels. We compiled a multi-label Chinese affective computing dataset that integrates the same user's personality traits with six emotions and micro-emotions, each annotated with intensity levels. This dataset is designed to advance machine recognition of complex human emotions and provide data support for research in psychology, education, marketing, finance, and politics.
arXiv Detail & Related papers (2024-11-13T05:38:55Z)
Revealing Personality Traits: A New Benchmark Dataset for Explainable Personality Recognition on Dialogues [63.936654900356004]
Personality recognition aims to identify the personality traits implied in user data such as dialogues and social media posts. We propose a novel task named Explainable Personality Recognition, aiming to reveal the reasoning process as supporting evidence of the personality trait.
arXiv Detail & Related papers (2024-09-29T14:41:43Z)
Personality Analysis for Social Media Users using Arabic language and its Effect on Sentiment Analysis [1.2903829793534267]
This study explores the correlation between the use of the Arabic language on Twitter and personality traits, and its impact on sentiment analysis. We identified the personality traits of users based on information extracted from their profile activities and the content of their tweets. Our findings demonstrated that personality affects sentiment in social media.
arXiv Detail & Related papers (2024-07-08T18:27:54Z)
PsyCoT: Psychological Questionnaire as Powerful Chain-of-Thought for Personality Detection [50.66968526809069]
We propose a novel personality detection method, called PsyCoT, which mimics the way individuals complete psychological questionnaires in a multi-turn dialogue manner. Our experiments demonstrate that PsyCoT significantly improves the performance and robustness of GPT-3.5 in personality detection.
arXiv Detail & Related papers (2023-10-31T08:23:33Z)
Personality Detection and Analysis using Twitter Data [7.584657555037871]
We release the largest automatically curated dataset for the research community. This dataset has 152 million tweets and 56 thousand data points for the Myers-Briggs personality type (MBTI) prediction task. We show how our intriguing analysis results often follow natural intuition.
arXiv Detail & Related papers (2023-09-11T14:39:04Z)
Speaker Profiling in Multiparty Conversations [31.518453682472575]
This research paper explores the task of Speaker Profiling in Conversations (SPC). The primary objective of SPC is to produce a summary of persona characteristics for each individual speaker present in a dialogue. To address the task of SPC, we have curated a new dataset named SPICE, which comes with specific labels.
arXiv Detail & Related papers (2023-04-18T08:04:46Z)
Co-Located Human-Human Interaction Analysis using Nonverbal Cues: A Survey [71.43956423427397]
We aim to identify the nonverbal cues and computational methodologies resulting in effective performance. This survey differs from its counterparts by involving the widest spectrum of social phenomena and interaction settings. Some major observations are: the most often used nonverbal cue, computational method, interaction environment, and sensing approach are speaking activity, support vector machines, and meetings composed of 3-4 persons equipped with microphones and cameras, respectively.
arXiv Detail & Related papers (2022-07-20T13:37:57Z)
CPED: A Large-Scale Chinese Personalized and Emotional Dialogue Dataset for Conversational AI [48.67259855309959]
Most existing datasets for conversational AI ignore human personalities and emotions. We propose CPED, a large-scale Chinese personalized and emotional dialogue dataset. CPED contains more than 12K dialogues of 392 speakers from 40 TV shows.
arXiv Detail & Related papers (2022-05-29T17:45:12Z)
Two-Faced Humans on Twitter and Facebook: Harvesting Social Multimedia for Human Personality Profiling [74.83957286553924]
We infer the Myers-Briggs Personality Type indicators by applying a novel multi-view fusion framework called "PERS". Our experimental results demonstrate PERS's ability to learn from multi-view data for personality profiling by efficiently leveraging the significantly different data arriving from diverse social multimedia sources.
arXiv Detail & Related papers (2021-06-20T10:48:49Z)
Context-Aware Personality Inference in Dyadic Scenarios: Introducing the UDIVA Dataset [21.140721329446595]
This paper introduces UDIVA, a new non-acted dataset of face-to-face dyadic interactions. The dataset consists of 90.5 hours of socio-dyadic interactions among 147 participants distributed in 188 sessions. It includes self-demographic, self- and peer-reported personality, internal state, and relationship profiling from participants.
arXiv Detail & Related papers (2020-12-28T15:08:02Z)

This list is automatically generated from the titles and abstracts of the papers on this site.