Fool Me Twice: Entailment from Wikipedia Gamification
- URL: http://arxiv.org/abs/2104.04725v1
- Date: Sat, 10 Apr 2021 09:58:40 GMT
- Title: Fool Me Twice: Entailment from Wikipedia Gamification
- Authors: Julian Martin Eisenschlos, Bhuwan Dhingra, Jannis Bulian, Benjamin Börschinger, Jordan Boyd-Graber
- Abstract summary: Gamification encourages adversarial examples, drastically lowering the number of examples that can be solved using "shortcuts".
We release FoolMeTwice, a dataset of challenging entailment pairs collected through a fun multi-player game.
- Score: 12.071302977728221
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We release FoolMeTwice (FM2 for short), a large dataset of challenging
entailment pairs collected through a fun multi-player game. Gamification
encourages adversarial examples, drastically lowering the number of examples
that can be solved using "shortcuts" compared to other popular entailment
datasets. Players are presented with two tasks. The first task asks the player
to write a plausible claim based on the evidence from a Wikipedia page. The
second one shows two plausible claims written by other players, one of which is
false, and the goal is to identify it before the time runs out. Players "pay"
to see clues retrieved from the evidence pool: the more evidence the player
needs, the harder the claim. Game-play between motivated players leads to
diverse strategies for crafting claims, such as temporal inference and
diverting to unrelated evidence, and results in higher quality data for the
entailment and evidence retrieval tasks. We open source the dataset and the
game code.
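The abstract's difficulty signal (the more evidence a player needs, the harder the claim) can be illustrated with a minimal sketch. It is not taken from the released game code: the `Vote` record, its fields, and the averaging rule in `claim_difficulty` are all hypothetical, chosen only to show how "clues paid for" could be turned into a per-claim difficulty estimate.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Vote:
    """One player's attempt on a claim (hypothetical record, not the FM2 schema)."""
    claim_id: str
    clues_bought: int  # evidence clues the player paid to reveal before answering


def claim_difficulty(votes: list[Vote]) -> dict[str, float]:
    """Average clues bought per claim; more clues needed suggests a harder claim."""
    by_claim: dict[str, list[int]] = {}
    for vote in votes:
        by_claim.setdefault(vote.claim_id, []).append(vote.clues_bought)
    return {claim_id: mean(counts) for claim_id, counts in by_claim.items()}


# Toy data: claim "c2" needed far more evidence than claim "c1".
votes = [Vote("c1", 0), Vote("c1", 1), Vote("c2", 3), Vote("c2", 4)]
scores = claim_difficulty(votes)
hardest = max(scores, key=scores.get)
print(hardest, scores[hardest])
```

Averaging is only one possible aggregation; the actual game may weight clues or time differently.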
Related papers
- Fact Checking Beyond Training Set [64.88575826304024]
We show that the retriever-reader suffers from performance deterioration when it is trained on labeled data from one domain and used in another domain.
We propose an adversarial algorithm to make the retriever component robust against distribution shift.
We then construct eight fact checking scenarios from these datasets, and compare our model to a set of strong baseline models.
arXiv Detail & Related papers (2024-03-27T15:15:14Z)
- Give Me More Details: Improving Fact-Checking with Latent Retrieval [58.706972228039604]
Evidence plays a crucial role in automated fact-checking.
Existing fact-checking systems either assume the evidence sentences are given or use the search snippets returned by the search engine.
We propose to incorporate full text from source documents as evidence and introduce two enriched datasets.
arXiv Detail & Related papers (2023-05-25T15:01:19Z)
- JECC: Commonsense Reasoning Tasks Derived from Interactive Fictions [75.42526766746515]
We propose a new commonsense reasoning dataset based on human Interactive Fiction (IF) gameplay walkthroughs.
Our dataset focuses on the assessment of functional commonsense knowledge rules rather than factual knowledge.
Experiments show that the introduced dataset is challenging to previous machine reading models as well as the new large language models.
arXiv Detail & Related papers (2022-10-18T19:20:53Z)
- Combining Sequential and Aggregated Data for Churn Prediction in Casual Freemium Games [0.0]
In freemium games, the revenue from a player comes from the in-app purchases made and the advertisement to which that player is exposed.
Within this scenario, it is extremely important to be able to detect promptly when a player is about to quit playing.
We investigate how to improve the current state-of-the-art in churn prediction by combining sequential and aggregate data.
arXiv Detail & Related papers (2022-09-06T14:49:18Z)
- Efficient tracking of team sport players with few game-specific annotations [1.052782170493037]
We propose a new generic method to track team sport players during a full game using a few human annotations collected via a semi-interactive system.
Non-ambiguous tracklets and their appearance features are automatically generated with a detection and a reidentification network both pre-trained on public datasets.
We demonstrate the efficiency of our approach on a challenging rugby sevens dataset.
arXiv Detail & Related papers (2022-04-08T13:11:30Z)
- Collusion Detection in Team-Based Multiplayer Games [57.153233321515984]
We propose a system that detects colluding behaviors in team-based multiplayer games.
The proposed method analyzes the players' social relationships paired with their in-game behavioral patterns.
We then automate the detection using Isolation Forest, an unsupervised learning technique specialized in highlighting outliers.
arXiv Detail & Related papers (2022-03-10T02:37:39Z)
- An Instance-Dependent Analysis for the Cooperative Multi-Player Multi-Armed Bandit [93.97385339354318]
We study the problem of information sharing and cooperation in Multi-Player Multi-Armed bandits.
First, we show that a simple modification to a successive elimination strategy can be used to allow the players to estimate their suboptimality gaps.
Second, we leverage the first result to design a communication protocol that successfully uses the small reward of collisions to coordinate among players.
arXiv Detail & Related papers (2021-11-08T23:38:47Z)
- 6MapNet: Representing soccer players from tracking data by a triplet network [19.343859572602558]
We build a triplet network named 6MapNet that can effectively capture the movement styles of players using in-game GPS data.
The network then maps these heatmap pairs into feature vectors whose similarity corresponds to the actual similarity of playing styles.
arXiv Detail & Related papers (2021-09-10T07:57:12Z)
- HoVer: A Dataset for Many-Hop Fact Extraction And Claim Verification [74.66819506353086]
HoVer is a dataset for many-hop evidence extraction and fact verification.
It challenges models to extract facts from several Wikipedia articles that are relevant to a claim.
Most of the 3/4-hop claims are written in multiple sentences, which adds to the complexity of understanding long-range dependency relations.
arXiv Detail & Related papers (2020-11-05T20:33:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.