Linguistic Cues of Deception in a Multilingual April Fools' Day Context
- URL: http://arxiv.org/abs/2111.03913v2
- Date: Tue, 9 Nov 2021 09:44:03 GMT
- Title: Linguistic Cues of Deception in a Multilingual April Fools' Day Context
- Authors: Katerina Papantoniou, Panagiotis Papadakos, Giorgos Flouris, Dimitris
Plexousakis
- Abstract summary: We introduce a corpus that includes diachronic AFD and normal articles from Greek newspapers and news websites.
We build a rich linguistic feature set, and analyze and compare its deception cues with the only AFD collection currently available.
- Score: 0.8487852486413651
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work we consider the collection of deceptive April Fools' Day (AFD)
news articles as a useful addition to existing datasets for deception detection
tasks. Such collections have an established ground truth and are relatively
easy to construct across languages. As a result, we introduce a corpus that
includes diachronic AFD and normal articles from Greek newspapers and news
websites. On top of that, we build a rich linguistic feature set, and analyze
and compare its deception cues with the only AFD collection currently
available, which is in English. Following a current research thread, we also
discuss the individualism/collectivism dimension in deception with respect to
these two datasets. Lastly, we build classifiers by testing various monolingual
and crosslingual settings. The results showcase that AFD datasets can be
helpful in deception detection studies, and are in alignment with the
observations of other deception detection works.
Related papers
- Simple Yet Effective Neural Ranking and Reranking Baselines for
Cross-Lingual Information Retrieval [50.882816288076725]
Cross-lingual information retrieval is the task of searching documents in one language with queries in another.
We provide a conceptual framework for organizing different approaches to cross-lingual retrieval using multi-stage architectures for mono-lingual retrieval as a scaffold.
We implement simple yet effective reproducible baselines in the Anserini and Pyserini IR toolkits for test collections from the TREC 2022 NeuCLIR Track, in Persian, Russian, and Chinese.
arXiv Detail & Related papers (2023-04-03T14:17:00Z)
- Learning Object-Language Alignments for Open-Vocabulary Object Detection [83.09560814244524]
We propose a novel open-vocabulary object detection framework directly learning from image-text pair data.
It enables us to train an open-vocabulary object detector on image-text pairs in a much simpler and more effective way.
arXiv Detail & Related papers (2022-11-27T14:47:31Z)
- Multiverse: Multilingual Evidence for Fake News Detection [71.51905606492376]
Multiverse is a new feature based on multilingual evidence that can be used for fake news detection.
The hypothesis that cross-lingual evidence can serve as a feature for fake news detection is confirmed.
arXiv Detail & Related papers (2022-11-25T18:24:17Z)
- Models and Datasets for Cross-Lingual Summarisation [78.56238251185214]
We present a cross-lingual summarisation corpus with long documents in a source language associated with multi-sentence summaries in a target language.
The corpus covers twelve language pairs and directions for four European languages, namely Czech, English, French and German.
We derive cross-lingual document-summary instances from Wikipedia by combining lead paragraphs and articles' bodies from language aligned Wikipedia titles.
arXiv Detail & Related papers (2022-02-19T11:55:40Z)
- On Cross-Lingual Retrieval with Multilingual Text Encoders [51.60862829942932]
We study the suitability of state-of-the-art multilingual encoders for cross-lingual document and sentence retrieval tasks.
We benchmark their performance in unsupervised ad-hoc sentence- and document-level CLIR experiments.
We evaluate multilingual encoders fine-tuned in a supervised fashion (i.e., we learn to rank) on English relevance data in a series of zero-shot language and domain transfer CLIR experiments.
arXiv Detail & Related papers (2021-12-21T08:10:27Z)
- A Massively Multilingual Analysis of Cross-linguality in Shared
Embedding Space [61.18554842370824]
In cross-lingual language models, representations for many different languages live in the same space.
We compute a task-based measure of cross-lingual alignment in the form of bitext retrieval performance.
We examine a range of linguistic, quasi-linguistic, and training-related features as potential predictors of these alignment metrics.
arXiv Detail & Related papers (2021-09-13T21:05:37Z)
- Translate & Fill: Improving Zero-Shot Multilingual Semantic Parsing with
Synthetic Data [2.225882303328135]
We propose a novel Translate-and-Fill (TaF) method to produce silver training data for a multilingual semantic parsing task.
Experimental results on three multilingual semantic parsing datasets show that data augmentation with TaF reaches accuracies competitive with similar systems.
arXiv Detail & Related papers (2021-09-09T14:51:11Z)
- Transferring Knowledge Distillation for Multilingual Social Event
Detection [42.663309895263666]
Recently published graph neural networks (GNNs) show promising performance at social event detection tasks.
We present a GNN that incorporates cross-lingual word embeddings for detecting events in multilingual data streams.
Experiments on both synthetic and real-world datasets show the framework to be highly effective at detection in both multilingual data and in languages where training samples are scarce.
arXiv Detail & Related papers (2021-08-06T12:38:42Z)
- I Wish I Would Have Loved This One, But I Didn't -- A Multilingual
Dataset for Counterfactual Detection in Product Reviews [19.533526638034047]
We consider the problem of counterfactual detection (CFD) in product reviews.
For this purpose, we annotate a multilingual CFD dataset from Amazon product reviews.
The dataset is unique as it contains counterfactuals in multiple languages.
arXiv Detail & Related papers (2021-04-14T14:38:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.