COSMO: Conditional SEQ2SEQ-based Mixture Model for Zero-Shot Commonsense
Question Answering
- URL: http://arxiv.org/abs/2011.00777v1
- Date: Mon, 2 Nov 2020 07:08:19 GMT
- Title: COSMO: Conditional SEQ2SEQ-based Mixture Model for Zero-Shot Commonsense
Question Answering
- Authors: Farhad Moghimifar, Lizhen Qu, Yue Zhuo, Mahsa Baktashmotlagh,
Gholamreza Haffari
- Abstract summary: Identification of the implicit causes and effects of a social context is the driving capability which can enable machines to perform commonsense reasoning.
Current approaches in this realm lack the ability to perform commonsense reasoning upon facing an unseen situation.
We present Conditional SEQ2SEQ-based Mixture model (COSMO), which provides us with the capabilities of dynamic and diverse content generation.
- Score: 50.65816570279115
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Commonsense reasoning refers to the ability to evaluate a social situation
and acting accordingly. Identification of the implicit causes and effects of a
social context is the driving capability which can enable machines to perform
commonsense reasoning. The dynamic world of social interactions requires
context-dependent on-demand systems to infer such underlying information.
However, current approaches in this realm lack the ability to perform
commonsense reasoning upon facing an unseen situation, mostly due to their
inability to identify a diverse range of implicit social relations. Hence
they fail to estimate the correct reasoning path. In this paper, we present
Conditional SEQ2SEQ-based Mixture model (COSMO), which provides us with the
capabilities of dynamic and diverse content generation. We use COSMO to
generate context-dependent clauses, which form a dynamic Knowledge Graph (KG)
on-the-fly for commonsense reasoning. To show the adaptability of our model to
context-dependent knowledge generation, we address the task of zero-shot
commonsense question answering. The empirical results indicate an improvement
of up to +5.2% over the state-of-the-art models.
Related papers
- Systems with Switching Causal Relations: A Meta-Causal Perspective [18.752058058199847]
Flexibility of agents' actions or tipping points in the environmental process can change the qualitative dynamics of the system.
New causal relationships may emerge, while existing ones change or disappear, resulting in an altered causal graph.
We propose the concept of meta-causal states, which groups classical causal models into clusters based on equivalent qualitative behavior.
arXiv Detail & Related papers (2024-10-16T21:32:31Z)
- Synthetic Context Generation for Question Generation [6.226609932118123]
This paper investigates training QG models using synthetic contexts generated by large language models.
We find that contexts are essential for QG tasks, even if they are synthetic.
arXiv Detail & Related papers (2024-06-19T03:37:52Z)
- Decoding Susceptibility: Modeling Misbelief to Misinformation Through a Computational Approach [61.04606493712002]
Susceptibility to misinformation describes the degree of belief in unverifiable claims, a latent quantity that is not directly observable.
Existing susceptibility studies heavily rely on self-reported beliefs.
We propose a computational approach to model users' latent susceptibility levels.
arXiv Detail & Related papers (2023-11-16T07:22:56Z)
- DeSIQ: Towards an Unbiased, Challenging Benchmark for Social Intelligence Understanding [60.84356161106069]
We study the soundness of Social-IQ, a dataset of multiple-choice questions on videos of complex social interactions.
Our analysis reveals that Social-IQ contains substantial biases, which can be exploited by a moderately strong language model.
We introduce DeSIQ, a new challenging dataset, constructed by applying simple perturbations to Social-IQ.
arXiv Detail & Related papers (2023-10-24T06:21:34Z)
- DiPlomat: A Dialogue Dataset for Situated Pragmatic Reasoning [89.92601337474954]
Pragmatic reasoning plays a pivotal role in deciphering implicit meanings that frequently arise in real-life conversations.
We introduce a novel challenge, DiPlomat, aiming at benchmarking machines' capabilities on pragmatic reasoning and situated conversational understanding.
arXiv Detail & Related papers (2023-06-15T10:41:23Z)
- Context De-confounded Emotion Recognition [12.037240778629346]
Context-Aware Emotion Recognition (CAER) aims to perceive the emotional states of the target person with contextual information.
A long-overlooked issue is that a context bias in existing datasets leads to a significantly unbalanced distribution of emotional states.
This paper provides a causality-based perspective to disentangle the models from the impact of such bias, and formulates the causalities among variables in the CAER task.
arXiv Detail & Related papers (2023-03-21T15:12:20Z)
- elBERto: Self-supervised Commonsense Learning for Question Answering [131.51059870970616]
We propose a Self-supervised Bidirectional Representation Learning of Commonsense framework, which is compatible with off-the-shelf QA model architectures.
The framework comprises five self-supervised tasks to force the model to fully exploit the additional training signals from contexts containing rich commonsense.
elBERto achieves substantial improvements on out-of-paragraph and no-effect questions where simple lexical similarity comparison does not help.
arXiv Detail & Related papers (2022-03-17T16:23:45Z)
- CausalCity: Complex Simulations with Agency for Causal Discovery and Reasoning [68.74447489372037]
We present a high-fidelity simulation environment that is designed for developing algorithms for causal discovery and counterfactual reasoning.
A core component of our work is to introduce agency, such that it is simple to define and create complex scenarios.
We perform experiments with three state-of-the-art methods to create baselines and highlight the affordances of this environment.
arXiv Detail & Related papers (2021-06-25T00:21:41Z)
- Learning Opinion Dynamics From Social Traces [25.161493874783584]
We propose an inference mechanism for fitting a generative, agent-like model of opinion dynamics to real-world social traces.
We showcase our proposal by translating a classical agent-based model of opinion dynamics into its generative counterpart.
We apply our model to real-world data from Reddit to explore the long-standing question about the impact of the backfire effect.
arXiv Detail & Related papers (2020-06-02T14:48:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.