Attention-based Contrastive Learning for Winograd Schemas
- URL: http://arxiv.org/abs/2109.05108v1
- Date: Fri, 10 Sep 2021 21:10:22 GMT
- Title: Attention-based Contrastive Learning for Winograd Schemas
- Authors: Tassilo Klein and Moin Nabi
- Abstract summary: This paper investigates whether contrastive learning can be extended to Transformer attention to tackle the Winograd Schema Challenge.
We propose a novel self-supervised framework, leveraging a contrastive loss directly at the level of self-attention.
- Score: 27.11678023496321
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-supervised learning has recently attracted considerable attention in the
NLP community for its ability to learn discriminative features using a
contrastive objective. This paper investigates whether contrastive learning can
be extended to Transformer attention to tackle the Winograd Schema Challenge.
To this end, we propose a novel self-supervised framework, leveraging a
contrastive loss directly at the level of self-attention. Experimental analysis
of our attention-based models on multiple datasets demonstrates superior
commonsense reasoning capabilities. The proposed approach outperforms all
comparable unsupervised approaches while occasionally surpassing supervised
ones.
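The abstract does not spell out the objective, so the following is only a minimal sketch of the general idea of contrasting candidate antecedents through self-attention weights. The function name, the temperature `tau`, and the cross-entropy form are illustrative assumptions, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def attention_contrastive_loss(attn, pronoun_idx, cand_true, cand_false, tau=0.1):
    """Contrast the attention mass a pronoun assigns to two candidate antecedents.

    attn:        (num_heads, seq_len, seq_len) self-attention weights of one layer
    pronoun_idx: index of the ambiguous pronoun token
    cand_true:   token indices of the correct candidate
    cand_false:  token indices of the incorrect candidate
    tau:         temperature (hypothetical value, not taken from the paper)
    """
    row = attn[:, pronoun_idx, :]                        # (num_heads, seq_len)
    score_true = row[:, cand_true].sum(dim=-1).mean()    # attention mass on correct candidate
    score_false = row[:, cand_false].sum(dim=-1).mean()  # attention mass on wrong candidate
    logits = torch.stack([score_true, score_false]) / tau
    target = torch.zeros(1, dtype=torch.long)            # class 0 = correct candidate
    return F.cross_entropy(logits.unsqueeze(0), target)

# Toy usage: random attention maps for a 12-token sentence and 8 heads.
attn = torch.softmax(torch.randn(8, 12, 12), dim=-1)
loss = attention_contrastive_loss(attn, pronoun_idx=7, cand_true=[1, 2], cand_false=[4, 5])
print(float(loss))
```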
Related papers
- Towards Robust Recommendation via Decision Boundary-aware Graph Contrastive Learning [25.514007761856632]
Graph contrastive learning (GCL) has received increasing attention in recommender systems due to its effectiveness in reducing bias caused by data sparsity.
We argue that these methods struggle to balance between semantic invariance and view hardness across the dynamic training process.
We propose a novel GCL-based recommendation framework RGCL, which effectively maintains the semantic invariance of contrastive pairs and dynamically adapts as the model capability evolves.
arXiv Detail & Related papers (2024-07-14T13:03:35Z) - Semi-supervised learning made simple with self-supervised clustering [65.98152950607707]
Self-supervised learning models have been shown to learn rich visual representations without requiring human annotations.
We propose a conceptually simple yet empirically powerful approach to turn clustering-based self-supervised methods into semi-supervised learners.
arXiv Detail & Related papers (2023-06-13T01:09:18Z) - Self-Regulated Learning for Egocentric Video Activity Anticipation [147.9783215348252]
Self-Regulated Learning (SRL) aims to regulate the intermediate representations consecutively, producing representations that emphasize the novel information in the frame at the current time-stamp.
SRL sharply outperforms existing state-of-the-art in most cases on two egocentric video datasets and two third-person video datasets.
arXiv Detail & Related papers (2021-11-23T03:29:18Z) - Recent Advancements in Self-Supervised Paradigms for Visual Feature
Representation [0.41436032949434404]
Supervised learning requires a large amount of labeled data to reach state-of-the-art performance.
To avoid the cost of labeling data, self-supervised methods were proposed to make use of largely available unlabeled data.
This study conducts a comprehensive and insightful survey and analysis of recent developments in the self-supervised paradigm for feature representation.
arXiv Detail & Related papers (2021-11-03T07:02:34Z) - Co$^2$L: Contrastive Continual Learning [69.46643497220586]
Recent breakthroughs in self-supervised learning show that such algorithms learn visual representations that can be transferred better to unseen tasks.
We propose a rehearsal-based continual learning algorithm that focuses on continually learning and maintaining transferable representations.
arXiv Detail & Related papers (2021-06-28T06:14:38Z) - Disambiguation of weak supervision with exponential convergence rates [88.99819200562784]
In supervised learning, data are annotated with incomplete yet discriminative information.
In this paper, we focus on partial labelling, an instance of weak supervision where, from a given input, we are given a set of potential targets.
We propose an empirical disambiguation algorithm to recover full supervision from weak supervision.
arXiv Detail & Related papers (2021-02-04T18:14:32Z) - Can Semantic Labels Assist Self-Supervised Visual Representation
Learning? [194.1681088693248]
We present a new algorithm named Supervised Contrastive Adjustment in Neighborhood (SCAN).
In a series of downstream tasks, SCAN achieves superior performance compared to previous fully-supervised and self-supervised methods.
Our study reveals that semantic labels are useful in assisting self-supervised methods, opening a new direction for the community.
arXiv Detail & Related papers (2020-11-17T13:25:00Z) - Self-supervised Learning from a Multi-view Perspective [121.63655399591681]
We show that self-supervised representations can extract task-relevant information and discard task-irrelevant information.
Our theoretical framework paves the way to a larger space of self-supervised learning objective design.
arXiv Detail & Related papers (2020-06-10T00:21:35Z) - Contrastive Self-Supervised Learning for Commonsense Reasoning [26.68818542540867]
We propose a self-supervised method to solve Pronoun Disambiguation and Winograd Challenge problems.
Our approach exploits the characteristic structure of training corpora related to so-called "trigger" words.
arXiv Detail & Related papers (2020-05-02T00:39:09Z)