Connecting Humanities and Social Sciences: Applying Language and Speech
Technology to Online Panel Surveys
- URL: http://arxiv.org/abs/2302.10593v1
- Date: Tue, 21 Feb 2023 10:52:15 GMT
- Title: Connecting Humanities and Social Sciences: Applying Language and Speech
Technology to Online Panel Surveys
- Authors: Henk van den Heuvel, Martijn Bentum, Simone Wills, Judith C. Koops
- Abstract summary: We explore the application of language and speech technology to open-ended questions in a Dutch panel survey.
In an experimental wave respondents could choose to answer open questions via speech or keyboard.
We report the errors the ASR system produces and investigate the impact of these errors on downstream analyses.
- Score: 2.0646127669654835
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In this paper, we explore the application of language and speech technology
to open-ended questions in a Dutch panel survey. In an experimental wave
respondents could choose to answer open questions via speech or keyboard.
Automatic speech recognition (ASR) was used to process spoken responses. We
evaluated answers from these input modalities to investigate differences
between spoken and typed answers. We report the errors the ASR system produces
and investigate the impact of these errors on downstream analyses. Open-ended
questions give more freedom to answer for respondents, but entail a non-trivial
amount of work to analyse. We evaluated the feasibility of using
transformer-based models (e.g. BERT) to apply sentiment analysis and topic
modelling on the answers of open questions. A big advantage of
transformer-based models is that they are trained on a large amount of language
materials and do not necessarily need training on the target materials. This is
especially advantageous for survey data, which does not contain a lot of text
materials. We tested the quality of automatic sentiment analysis by comparing
automatic labeling with three human raters and tested the robustness of topic
modelling by comparing the generated models based on automatic and manually
transcribed spoken answers.
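The quality check described in the abstract — comparing automatic sentiment labels against three human raters — can be sketched as a chance-corrected agreement computation. The sketch below uses Cohen's kappa as one common agreement measure; the paper does not specify which statistic was used, and the labels and answer counts here are purely hypothetical:

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: chance-corrected agreement between two annotators."""
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    # Observed agreement: fraction of items both annotators label identically.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement under chance, from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1 - expected)

# Hypothetical labels: one automatic system and three human raters
# scoring the same ten open-ended answers (pos/neg/neu).
auto = ["pos", "neg", "neu", "pos", "neg", "pos", "neu", "neg", "pos", "neu"]
raters = [
    ["pos", "neg", "neu", "pos", "neg", "pos", "pos", "neg", "pos", "neu"],
    ["pos", "neg", "neg", "pos", "neg", "pos", "neu", "neg", "pos", "neu"],
    ["pos", "neu", "neu", "pos", "neg", "pos", "neu", "neg", "neu", "neu"],
]
for i, human in enumerate(raters, 1):
    print(f"kappa(auto, rater {i}) = {cohens_kappa(auto, human):.3f}")
```

A per-rater kappa like this shows whether the automatic labels agree with humans about as well as humans agree with each other, which is the usual yardstick for accepting automatic annotation.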
Related papers
- ExpertQA: Expert-Curated Questions and Attributed Answers [51.68314045809179]
We conduct human evaluation of responses from a few representative systems along various axes of attribution and factuality.
We collect expert-curated questions from 484 participants across 32 fields of study, and then ask the same experts to evaluate generated responses to their own questions.
The output of our analysis is ExpertQA, a high-quality long-form QA dataset with 2177 questions spanning 32 fields, along with verified answers and attributions for claims in the answers.
arXiv Detail & Related papers (2023-09-14T16:54:34Z)
- Promoting Open-domain Dialogue Generation through Learning Pattern Information between Contexts and Responses [5.936682548344234]
This paper improves the quality of generated responses by learning the implicit pattern information between contexts and responses in the training samples.
We also design a response-aware mechanism for mining the implicit pattern information between contexts and responses so that the generated replies are more diverse and approximate to human replies.
arXiv Detail & Related papers (2023-09-06T08:11:39Z)
- Discourse Analysis via Questions and Answers: Parsing Dependency Structures of Questions Under Discussion [57.43781399856913]
This work adopts the linguistic framework of Questions Under Discussion (QUD) for discourse analysis.
We characterize relationships between sentences as free-form questions, in contrast to exhaustive fine-grained questions.
We develop a first-of-its-kind QUD parser that derives a dependency structure of questions over full documents.
arXiv Detail & Related papers (2022-10-12T03:53:12Z)
- Automatic Short Math Answer Grading via In-context Meta-learning [2.0263791972068628]
We study the problem of automatic short answer grading for students' responses to math questions.
We use MathBERT, a variant of the popular language model BERT adapted to mathematical content, as our base model.
We also use an in-context learning approach that provides scoring examples as input to the language model.
arXiv Detail & Related papers (2022-05-30T16:26:02Z)
- Evaluating Mixed-initiative Conversational Search Systems via User Simulation [9.066817876491053]
We propose a conversational User Simulator, called USi, for automatic evaluation of such search systems.
We show that responses generated by USi are both in line with the underlying information need and comparable to human-generated answers.
arXiv Detail & Related papers (2022-04-17T16:27:33Z)
- AES Systems Are Both Overstable And Oversensitive: Explaining Why And Proposing Defenses [66.49753193098356]
We investigate the reason behind the surprising adversarial brittleness of scoring models.
Our results indicate that autoscoring models, despite getting trained as "end-to-end" models, behave like bag-of-words models.
We propose detection-based protection models that identify oversensitivity- and overstability-causing samples with high accuracy.
arXiv Detail & Related papers (2021-09-24T03:49:38Z)
- Speaker-Conditioned Hierarchical Modeling for Automated Speech Scoring [60.55025339250815]
We propose a novel deep learning technique for non-native ASS, called speaker-conditioned hierarchical modeling.
We take advantage of the fact that oral proficiency tests rate multiple responses for a candidate. We extract context from these responses and feed it as additional speaker-specific context to our network to score a particular response.
arXiv Detail & Related papers (2021-08-30T07:00:28Z)
- Towards Data Distillation for End-to-end Spoken Conversational Question Answering [65.124088336738]
We propose a new Spoken Conversational Question Answering task (SCQA).
SCQA aims at enabling QA systems to model complex dialogue flows given the speech utterances and text corpora.
Our main objective is to build a QA system to deal with conversational questions both in spoken and text forms.
arXiv Detail & Related papers (2020-10-18T05:53:39Z)
- A Wrong Answer or a Wrong Question? An Intricate Relationship between Question Reformulation and Answer Selection in Conversational Question Answering [15.355557454305776]
We show that question rewriting (QR) of the conversational context helps shed more light on this phenomenon.
We present the results of this analysis on the TREC CAsT and QuAC (CANARD) datasets.
arXiv Detail & Related papers (2020-10-13T06:29:51Z)
- Speaker Sensitive Response Evaluation Model [17.381658875470638]
We propose an automatic evaluation model based on the similarity of the generated response with the conversational context.
We learn the model parameters from an unlabeled conversation corpus.
We show that our model can be applied to movie dialogues without any additional training.
arXiv Detail & Related papers (2020-06-12T08:59:10Z)
- Knowledgeable Dialogue Reading Comprehension on Key Turns [84.1784903043884]
Multi-choice machine reading comprehension (MRC) requires models to choose the correct answer from candidate options given a passage and a question.
Our research focuses on dialogue-based MRC, where the passages are multi-turn dialogues.
This setting suffers from two challenges: the answer selection decision is made without the support of latently helpful commonsense, and the multi-turn context may hide considerable irrelevant information.
arXiv Detail & Related papers (2020-04-29T07:04:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.