Evaluating Webcam-based Gaze Data as an Alternative for Human Rationale Annotations
- URL: http://arxiv.org/abs/2402.19133v1
- Date: Thu, 29 Feb 2024 13:09:26 GMT
- Title: Evaluating Webcam-based Gaze Data as an Alternative for Human Rationale Annotations
- Authors: Stephanie Brandl, Oliver Eberle, Tiago Ribeiro, Anders Søgaard, Nora Hollenstein
- Abstract summary: We debate whether human gaze, in the form of webcam-based eye-tracking recordings, poses a valid alternative when evaluating importance scores.
We evaluate the additional information provided by gaze data, such as total reading times, gaze entropy, and decoding accuracy.
Our findings suggest that gaze data offers valuable linguistic insights that could be leveraged to infer task difficulty.
- Score: 14.915881495753121
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Rationales in the form of manually annotated input spans usually serve as
ground truth when evaluating explainability methods in NLP. They are, however,
time-consuming and often biased by the annotation process. In this paper, we
debate whether human gaze, in the form of webcam-based eye-tracking recordings,
poses a valid alternative when evaluating importance scores. We evaluate the
additional information provided by gaze data, such as total reading times, gaze
entropy, and decoding accuracy with respect to human rationale annotations. We
compare WebQAmGaze, a multilingual dataset for information-seeking QA, with
attention and explainability-based importance scores for 4 different
multilingual Transformer-based language models (mBERT, distil-mBERT, XLMR, and
XLMR-L) and 3 languages (English, Spanish, and German). Our pipeline can easily
be applied to other tasks and languages. Our findings suggest that gaze data
offers valuable linguistic insights that could be leveraged to infer task
difficulty and further show a comparable ranking of explainability methods to
that of human rationales.
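In practice, the comparison rests on two token-level ingredients: a scalar summary of the gaze signal (for example, entropy over total reading times) and a rank agreement between gaze-based and model-based importance scores. The following is a minimal sketch of both, assuming per-token reading times and model scores are already aligned; the function names and example values are illustrative and not taken from the paper's released pipeline.

```python
# Minimal sketch (not the authors' pipeline): gaze entropy from per-token total
# reading times, and rank agreement between gaze-based and model-based importance.
import numpy as np
from scipy.stats import spearmanr

def gaze_entropy(reading_times):
    """Shannon entropy (bits) of the normalized reading-time distribution over tokens."""
    times = np.asarray(reading_times, dtype=float)
    p = times / times.sum()          # treat relative reading time as a probability
    p = p[p > 0]                     # skip tokens that received no fixations
    return float(-(p * np.log2(p)).sum())

def rank_agreement(gaze_scores, model_scores):
    """Spearman correlation between gaze-based and model-based token importance."""
    rho, _ = spearmanr(gaze_scores, model_scores)
    return rho

# Hypothetical per-token values for one question-passage pair.
reading_times    = [120, 340, 0, 80, 510, 260]            # total reading time in ms
model_importance = [0.05, 0.30, 0.02, 0.10, 0.40, 0.13]   # e.g. attention or relevance scores

print("gaze entropy:", gaze_entropy(reading_times))
print("rank agreement:", rank_agreement(reading_times, model_importance))
```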
Related papers
- A Comparative Analysis of Conversational Large Language Models in Knowledge-Based Text Generation [5.661396828160973]
We conduct an empirical analysis of conversational large language models in generating natural language text from semantic triples.
We compare four large language models of varying sizes with different prompting techniques.
Our findings show that the capabilities of large language models in triple verbalization can be significantly improved through few-shot prompting, post-processing, and efficient fine-tuning techniques.
arXiv Detail & Related papers (2024-02-02T15:26:39Z)
- BRENT: Bidirectional Retrieval Enhanced Norwegian Transformer [1.911678487931003]
Retrieval-based language models are increasingly employed in question-answering tasks.
We develop the first Norwegian retrieval-based model by adapting the REALM framework.
We show that this type of training improves the reader's performance on extractive question-answering.
arXiv Detail & Related papers (2023-04-19T13:40:47Z)
- A Comparative Study on Textual Saliency of Styles from Eye Tracking, Annotations, and Language Models [21.190423578990824]
We present eyeStyliency, an eye-tracking dataset for human processing of stylistic text.
We develop a variety of methods to derive style saliency scores over text using the collected eye-tracking data.
We find that while eye-tracking data is unique, it also intersects with both human annotations and model-based importance scores.
arXiv Detail & Related papers (2022-12-19T21:50:36Z)
- Retrieval-based Disentangled Representation Learning with Natural Language Supervision [61.75109410513864]
We present Vocabulary Disentangled Retrieval (VDR), a retrieval-based framework that harnesses natural language as proxies of the underlying data variation to drive disentangled representation learning.
Our approach employs a bi-encoder model to represent both data and natural language in a vocabulary space, enabling the model to distinguish intrinsic dimensions that capture characteristics within the data through their natural language counterparts, thus achieving disentanglement.
arXiv Detail & Related papers (2022-12-15T10:20:42Z)
- Saliency Map Verbalization: Comparing Feature Importance Representations from Model-free and Instruction-based Methods [6.018950511093273]
Saliency maps can explain a neural model's predictions by identifying important input features.
We formalize the underexplored task of translating saliency maps into natural language.
We compare two novel methods (search-based and instruction-based verbalizations) against conventional feature importance representations.
arXiv Detail & Related papers (2022-10-13T17:48:15Z)
- A New Generation of Perspective API: Efficient Multilingual Character-level Transformers [66.9176610388952]
We present the fundamentals behind the next version of the Perspective API from Google Jigsaw.
At the heart of the approach is a single multilingual token-free Charformer model.
We demonstrate that by forgoing static vocabularies, we gain flexibility across a variety of settings.
arXiv Detail & Related papers (2022-02-22T20:55:31Z)
- RuMedBench: A Russian Medical Language Understanding Benchmark [58.99199480170909]
The paper describes an open Russian medical language understanding benchmark covering several task types.
We prepare unified labeling formats, data splits, and evaluation metrics for the new tasks.
A single-number metric expresses a model's ability to cope with the benchmark.
arXiv Detail & Related papers (2022-01-17T16:23:33Z)
- Does Summary Evaluation Survive Translation to Other Languages? [0.0]
We translate an existing English summarization dataset, SummEval, into four different languages.
We analyze the scores from the automatic evaluation metrics in translated languages, as well as their correlation with human annotations in the source language.
arXiv Detail & Related papers (2021-09-16T17:35:01Z)
- Leveraging Pre-trained Language Model for Speech Sentiment Analysis [58.78839114092951]
We explore the use of pre-trained language models to learn sentiment information of written texts for speech sentiment analysis.
We propose a pseudo-label-based semi-supervised training strategy that uses a language model within an end-to-end speech sentiment analysis approach.
arXiv Detail & Related papers (2021-06-11T20:15:21Z)
- GATE: Graph Attention Transformer Encoder for Cross-lingual Relation and Event Extraction [107.8262586956778]
We introduce graph convolutional networks (GCNs) with universal dependency parses to learn language-agnostic sentence representations.
GCNs struggle to model words with long-range dependencies or words that are not directly connected in the dependency tree.
We propose to utilize the self-attention mechanism to learn the dependencies between words with different syntactic distances.
arXiv Detail & Related papers (2020-10-06T20:30:35Z)
- Information-Theoretic Probing for Linguistic Structure [74.04862204427944]
We propose an information-theoretic operationalization of probing as estimating mutual information (see the sketch after this list).
We evaluate on a set of ten typologically diverse languages often underrepresented in NLP research.
arXiv Detail & Related papers (2020-04-07T01:06:36Z)
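For the Information-Theoretic Probing entry above, the operationalization can be read as I(R; T) = H(T) - H(T | R), where R is a representation, T a linguistic label, and H(T | R) is approximated by the cross-entropy of a trained probe. Below is a toy estimator under that reading; the logistic-regression probe and the function names are assumptions for illustration, not that paper's exact experimental setup.

```python
# Toy probing-as-mutual-information estimator: I(R; T) ~= H(T) - H(T | R),
# where H(T | R) is taken as the cross-entropy of a simple trained probe.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

def label_entropy(labels):
    """Empirical entropy H(T) of the label distribution, in bits."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def mi_estimate(train_reprs, train_labels, test_reprs, test_labels):
    """Estimate I(R; T) with a logistic-regression probe standing in for q(t | r)."""
    probe = LogisticRegression(max_iter=1000).fit(train_reprs, train_labels)
    probs = probe.predict_proba(test_reprs)
    cond_entropy_bits = log_loss(test_labels, probs, labels=probe.classes_) / np.log(2)
    return label_entropy(test_labels) - cond_entropy_bits
```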
This list is automatically generated from the titles and abstracts of the papers on this site.