Related papers: Multilingual Persuasion Detection: Video Games as an Invaluable Data Source for NLP

URL: http://arxiv.org/abs/2207.04453v1
Date: Sun, 10 Jul 2022 12:38:02 GMT
Title: Multilingual Persuasion Detection: Video Games as an Invaluable Data Source for NLP
Authors: Teemu Pöyhönen, Mika Hämäläinen, Khalid Alnajjar
Abstract summary: We show the viability of this data in building a persuasion detection system using a natural language processing model called BERT. We believe that video games have a lot of unused potential as a datasource for a variety of NLP tasks.
Score: 0.6123324869194194
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Role-playing games (RPGs) have a considerable amount of text in video game dialogues. Quite often this text is semi-annotated by the game developers. In this paper, we extract a multilingual dataset of persuasive dialogue from several RPGs. We show the viability of this data in building a persuasion detection system using a natural language processing (NLP) model called BERT. We believe that video games have a lot of unused potential as a datasource for a variety of NLP tasks. The code and data described in this paper are available on Zenodo.

Related papers

Collaborative Storytelling and LLM: A Linguistic Analysis of Automatically-Generated Role-Playing Game Sessions [55.2480439325792]
Role-playing games (RPGs) are games in which players interact with one another to create narratives. This emerging form of shared narrative, primarily oral, is receiving increasing attention. In this paper, we aim to discover to what extent the language of Large Language Models (LLMs) exhibits oral or written features when asked to generate an RPG session.
arXiv Detail & Related papers (2025-03-26T15:10:47Z)
What if Red Can Talk? Dynamic Dialogue Generation Using Large Language Models [0.0]
We introduce a dialogue filler framework that utilizes large language models (LLMs) to generate dynamic and contextually appropriate character interactions. We test this framework within the environments of Final Fantasy VII Remake and Pokemon. This study aims to assist developers in crafting more nuanced filler dialogues, thereby enriching player immersion and enhancing the overall RPG experience.
arXiv Detail & Related papers (2024-07-29T19:12:18Z)
GENEVA: GENErating and Visualizing branching narratives using LLMs [15.43734266732214]
GENEVA, a prototype tool, generates a rich narrative graph with branching and reconverging storylines. GENEVA has the potential to assist in game development, simulations, and other applications with game-like properties.
arXiv Detail & Related papers (2023-11-15T18:55:45Z)
Deepfake audio as a data augmentation technique for training automatic speech to text transcription models [55.2480439325792]
We propose a framework that approaches data augmentation based on deepfake audio. A dataset produced by Indians (in English) was selected, ensuring the presence of a single accent.
arXiv Detail & Related papers (2023-09-22T11:33:03Z)
NusaWrites: Constructing High-Quality Corpora for Underrepresented and Extremely Low-Resource Languages [54.808217147579036]
We conduct a case study on Indonesian local languages. We compare the effectiveness of online scraping, human translation, and paragraph writing by native speakers in constructing datasets. Our findings demonstrate that datasets generated through paragraph writing by native speakers exhibit superior quality in terms of lexical diversity and cultural content.
arXiv Detail & Related papers (2023-09-19T14:42:33Z)
FIREBALL: A Dataset of Dungeons and Dragons Actual-Play with Structured Game State Information [75.201485544517]
We present FIREBALL, a large dataset containing nearly 25,000 unique sessions from real D&D gameplay on Discord with true game state info. We demonstrate that FIREBALL can improve natural language generation (NLG) by using Avrae state information.
arXiv Detail & Related papers (2023-05-02T15:36:10Z)
Learning to Speak from Text: Zero-Shot Multilingual Text-to-Speech with Unsupervised Text Pretraining [65.30528567491984]
This paper proposes a method for zero-shot multilingual TTS using text-only data for the target language. The use of text-only data allows the development of TTS systems for low-resource languages. Evaluation results demonstrate highly intelligible zero-shot TTS with a character error rate of less than 12% for an unseen language.
arXiv Detail & Related papers (2023-01-30T00:53:50Z)
Video Games as a Corpus: Sentiment Analysis using Fallout New Vegas Dialog [1.9014535120129343]
We present a method for extracting a multilingual sentiment annotated dialog data set from Fallout New Vegas. The game has been translated into English, Spanish, German, French and Italian. We conduct experiments on multilingual, multilabel sentiment analysis on the extracted data set.
arXiv Detail & Related papers (2022-12-05T11:09:05Z)
A Snapshot into the Possibility of Video Game Machine Translation [0.0]
This article introduces some of the challenges of video game translation, some of the existing literature, as well as the systems and data sets used in this experiment. One such finding highlights the model's ability to learn typical rules and patterns of video game translations from English into French.
arXiv Detail & Related papers (2022-09-19T08:16:59Z)
Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation [133.7313847857935]
Our study highlights how NLP methods can be adapted to thousands more languages that are under-served by current technology. For 19 under-represented languages across 3 tasks, our methods lead to consistent improvements of up to 5 and 15 points with and without extra monolingual text respectively.
arXiv Detail & Related papers (2022-03-17T16:48:22Z)
Cross-Lingual Dialogue Dataset Creation via Outline-Based Generation [70.81596088969378]
The Cross-lingual Outline-based Dialogue dataset (termed COD) enables natural language understanding, dialogue state tracking, and end-to-end dialogue modelling and evaluation in 4 diverse languages.
arXiv Detail & Related papers (2022-01-31T18:11:21Z)
The Gutenberg Dialogue Dataset [1.90365714903665]
Current publicly available open-domain dialogue datasets offer a trade-off between quality and size. We build a high-quality dataset of 14.8M utterances in English, and smaller datasets in German, Dutch, Spanish, Portuguese, Italian, and Hungarian.
arXiv Detail & Related papers (2020-04-27T12:52:20Z)

This list is automatically generated from the titles and abstracts of the papers on this site.