WikiContradiction: Detecting Self-Contradiction Articles on Wikipedia
- URL: http://arxiv.org/abs/2111.08543v1
- Date: Tue, 16 Nov 2021 15:12:37 GMT
- Title: WikiContradiction: Detecting Self-Contradiction Articles on Wikipedia
- Authors: Cheng Hsu, Cheng-Te Li, Diego Saez-Trumper, Yi-Zhan Hsu
- Abstract summary: We propose a task of detecting self-contradiction articles in Wikipedia.
Based on the "self-contradictory" template, we create a novel dataset for the self-contradiction detection task.
We present the first model, Pairwise Contradiction Neural Network (PCNN), to not only effectively identify self-contradiction articles, but also highlight the most contradictory pairs of sentences.
- Score: 8.755487474723994
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While Wikipedia has been utilized for fact-checking and claim verification to
debunk misinformation and disinformation, it is essential to both improve
article quality and rule out noisy articles. Self-contradiction is one of the
low-quality article types in Wikipedia. In this work, we propose a task of
detecting self-contradiction articles in Wikipedia. Based on the
"self-contradictory" template, we create a novel dataset for the
self-contradiction detection task. Conventional contradiction detection focuses
on comparing pairs of sentences or claims, but self-contradiction detection
needs to further reason about the semantics of an article and simultaneously learn
the contradiction-aware comparison from all pairs of sentences. Therefore, we
present the first model, Pairwise Contradiction Neural Network (PCNN), to not
only effectively identify self-contradiction articles, but also highlight the
most contradictory pairs of sentences. The main idea of PCNN is
two-fold. First, to mitigate the effect of data scarcity on self-contradiction
articles, we pre-train the module of pairwise contradiction learning using SNLI
and MNLI benchmarks. Second, we select top-K sentence pairs with the highest
contradiction probability values and model their correlation to determine
whether the corresponding article is self-contradictory. Experiments
conducted on the proposed WikiContradiction dataset show that PCNN can
achieve promising performance and comprehensively highlight the sentence pairs
where the contradiction is located.
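The two-step idea in the abstract (score every sentence pair for contradiction, then keep the top-K pairs to judge the article) can be sketched roughly as follows. This is a minimal illustration and not the authors' PCNN: `toy_contradiction_prob` is a hypothetical stand-in for a pairwise model pre-trained on SNLI and MNLI, and `article_score` replaces the learned correlation over the top-K pairs with a plain mean.

```python
from heapq import nlargest
from itertools import combinations
from typing import Callable, List, Tuple

ScoredPair = Tuple[float, str, str]
NEGATIONS = {"not", "never", "no"}


def toy_contradiction_prob(a: str, b: str) -> float:
    """Hypothetical scorer standing in for an NLI-pre-trained pairwise model.

    Returns the word overlap of the two sentences (negation words excluded),
    scaled down by 10x unless exactly one of the sentences is negated.
    """
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    content_a, content_b = words_a - NEGATIONS, words_b - NEGATIONS
    union = content_a | content_b
    overlap = len(content_a & content_b) / len(union) if union else 0.0
    one_negated = bool(words_a & NEGATIONS) != bool(words_b & NEGATIONS)
    return overlap if one_negated else 0.1 * overlap


def top_k_pairs(
    sentences: List[str], scorer: Callable[[str, str], float], k: int
) -> List[ScoredPair]:
    """Score all unordered sentence pairs and keep the k highest-scoring ones."""
    scored = ((scorer(a, b), a, b) for a, b in combinations(sentences, 2))
    return nlargest(k, scored, key=lambda item: item[0])


def article_score(pairs: List[ScoredPair]) -> float:
    """Assumed aggregation: mean of the top-K scores (the paper learns this step)."""
    return sum(score for score, _, _ in pairs) / len(pairs) if pairs else 0.0


article = [
    "The bridge was completed in 1932.",
    "The bridge was not completed in 1932.",
    "It spans the harbour.",
]
top = top_k_pairs(article, toy_contradiction_prob, k=2)
for score, first, second in top:
    print(f"{score:.3f} | {first} | {second}")
print(f"article score: {article_score(top):.3f}")
```

The returned pairs double as the highlighted evidence: the highest-scoring pair is the one a reader would inspect first. Scoring all pairs is quadratic in the number of sentences, so a practical system would likely batch or prune pairs before scoring.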
Related papers
- SparseCL: Sparse Contrastive Learning for Contradiction Retrieval [87.02936971689817]
Contradiction retrieval refers to identifying and extracting documents that explicitly disagree with or refute the content of a query.
Existing methods such as similarity search and cross-encoder models exhibit significant limitations.
We introduce SparseCL that leverages specially trained sentence embeddings designed to preserve subtle, contradictory nuances between sentences.
arXiv Detail & Related papers (2024-06-15T21:57:03Z)
- Generating Prototypes for Contradiction Detection Using Large Language Models and Linguistic Rules [1.6497679785422956]
We introduce a novel data generation method for contradiction detection.
We instruct the generative models to create contradicting statements with respect to descriptions of specific contradiction types.
As an auxiliary approach, we use linguistic rules to construct simple contradictions.
arXiv Detail & Related papers (2023-10-23T09:07:27Z)
- Self-contradictory Hallucinations of Large Language Models: Evaluation, Detection and Mitigation [5.043563227694139]
Large language models (large LMs) are susceptible to producing text that contains hallucinated content.
We present a comprehensive investigation into self-contradiction for various instruction-tuned LMs.
We propose a novel prompting-based framework designed to effectively detect and mitigate self-contradictions.
arXiv Detail & Related papers (2023-05-25T08:43:46Z)
- Generate, Discriminate and Contrast: A Semi-Supervised Sentence Representation Learning Framework [68.04940365847543]
We propose a semi-supervised sentence embedding framework, GenSE, that effectively leverages large-scale unlabeled data.
Our method includes three parts: 1) Generate: A generator/discriminator model is jointly trained to synthesize sentence pairs from an open-domain unlabeled corpus; 2) Discriminate: Noisy sentence pairs are filtered out by the discriminator to acquire high-quality positive and negative sentence pairs; 3) Contrast: A prompt-based contrastive approach is presented for sentence representation learning with both annotated and synthesized data.
arXiv Detail & Related papers (2022-10-30T10:15:21Z)
- CDConv: A Benchmark for Contradiction Detection in Chinese Conversations [74.78715797366395]
We propose a benchmark for Contradiction Detection in Chinese Conversations, namely CDConv.
It contains 12K multi-turn conversations annotated with three typical contradiction categories: Intra-sentence Contradiction, Role Confusion, and History Contradiction.
arXiv Detail & Related papers (2022-10-16T11:37:09Z)
- Improving Bot Response Contradiction Detection via Utterance Rewriting [45.55560596440624]
This work aims to improve contradiction detection by rewriting all bot utterances to restore antecedents and ellipsis.
We empirically demonstrate that this model can produce satisfactory rewrites to make bot utterances more complete.
Using rewritten utterances improves contradiction detection performance significantly, e.g., the AUPR and joint accuracy scores (detecting contradiction along with evidence) increase by 6.5% and 4.5%, respectively.
arXiv Detail & Related papers (2022-07-25T00:54:30Z)
- Keywords and Instances: A Hierarchical Contrastive Learning Framework Unifying Hybrid Granularities for Text Generation [59.01297461453444]
We propose a hierarchical contrastive learning mechanism, which can unify semantic meaning across hybrid granularities in the input text.
Experiments demonstrate that our model outperforms competitive baselines on paraphrasing, dialogue generation, and storytelling tasks.
arXiv Detail & Related papers (2022-05-26T13:26:03Z)
- ADC: Adversarial attacks against object Detection that evade Context consistency checks [55.8459119462263]
We show that even context consistency checks can be brittle to properly crafted adversarial examples.
We propose an adaptive framework to generate examples that subvert such defenses.
Our results suggest that how to robustly model context and check its consistency is still an open problem.
arXiv Detail & Related papers (2021-10-24T00:25:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.