SimCLS: A Simple Framework for Contrastive Learning of Abstractive
Summarization
- URL: http://arxiv.org/abs/2106.01890v1
- Date: Thu, 3 Jun 2021 14:34:17 GMT
- Title: SimCLS: A Simple Framework for Contrastive Learning of Abstractive
Summarization
- Authors: Yixin Liu, Pengfei Liu
- Abstract summary: We present SimCLS, a conceptually simple yet empirically powerful framework for abstractive summarization.
With minor modifications to existing top-scoring systems, SimCLS improves their performance by a large margin.
Results of our proposed models have been deployed on the ExplainaBoard platform, which allows researchers to understand our systems in a more fine-grained way.
- Score: 14.16710715347118
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we present a conceptually simple yet empirically powerful
framework for abstractive summarization, SimCLS, which can bridge the gap
between the learning objective and evaluation metrics resulting from the
currently dominant sequence-to-sequence learning framework by formulating text
generation as a reference-free evaluation problem (i.e., quality estimation)
assisted by contrastive learning. Experimental results show that, with minor
modifications to existing top-scoring systems, SimCLS can improve the
performance of existing top-performing models by a large margin. In particular,
it achieves an absolute improvement of 2.51 over BART and 2.50 over PEGASUS in
ROUGE-1 on the CNN/DailyMail dataset, driving the state-of-the-art performance
to a new level. We have open-sourced our code and results:
https://github.com/yixinL7/SimCLS. Results of our proposed models have been
deployed on the ExplainaBoard platform, which allows researchers to understand
our systems in a more fine-grained way.
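
The abstract does not spell out the training objective, so the following is only a minimal sketch of one way "quality estimation assisted by contrastive learning" could be realized: a pairwise margin ranking loss over candidate summaries, followed by picking the top-scored candidate at inference time. The function names (`ranking_loss`, `rerank`), the margin value, and the hand-written scores are hypothetical; in practice the scores would come from a learned scorer that sees only the source document and the candidate, and the candidate ordering would come from a metric such as ROUGE computed against the reference during training.

```python
from typing import List, Sequence


def ranking_loss(scores: Sequence[float], margin: float = 0.01) -> float:
    """Pairwise margin ranking loss over candidate summaries.

    Assumption: scores[i] is the scorer's output for candidate i, and the
    candidates are pre-sorted from best to worst by a reference-based
    metric (e.g. ROUGE), which is only available at training time.
    """
    loss = 0.0
    n = len(scores)
    for i in range(n):
        for j in range(i + 1, n):
            # Candidate i is ranked better than candidate j, so it should
            # outscore j by a margin that grows with the rank gap (j - i).
            loss += max(0.0, scores[j] - scores[i] + (j - i) * margin)
    return loss


def rerank(candidates: List[str], scores: Sequence[float]) -> str:
    """Return the candidate with the highest score (no reference needed)."""
    best = max(range(len(candidates)), key=lambda k: scores[k])
    return candidates[best]


if __name__ == "__main__":
    # Hypothetical scores for three candidates already sorted best to worst.
    print(ranking_loss([0.9, 0.5, 0.1]))   # correctly ordered scores
    print(ranking_loss([0.1, 0.5]))        # mis-ordered scores are penalized
    print(rerank(["cand A", "cand B", "cand C"], [0.2, 0.7, 0.1]))
```

The loss is zero when the scorer already orders the candidates as the metric does, and positive otherwise, which is what lets a reference-free scorer be trained against a reference-based metric.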
Related papers
- Making Text Embedders Few-Shot Learners [33.50993377494602]
We introduce a novel model bge-en-icl, which employs few-shot examples to produce high-quality text embeddings.
Our approach integrates task-related examples directly into the query side, resulting in significant improvements across various tasks.
Experimental results on the MTEB and AIR-Bench benchmarks demonstrate that our approach sets new state-of-the-art (SOTA) performance.
arXiv Detail & Related papers (2024-09-24T03:30:19Z)
- Alleviating Over-smoothing for Unsupervised Sentence Representation [96.19497378628594]
We present a simple method named Self-Contrastive Learning (SSCL) to alleviate the over-smoothing issue.
Our proposed method is quite simple and can be easily extended to various state-of-the-art models for performance boosting.
arXiv Detail & Related papers (2023-05-09T11:00:02Z)
- Learning from Mistakes: Self-Regularizing Hierarchical Representations in Point Cloud Semantic Segmentation [15.353256018248103]
LiDAR semantic segmentation has gained attention to accomplish fine-grained scene understanding.
We present a coarse-to-fine setup that LEArns from classification mistaKes (LEAK) derived from a standard model.
Our LEAK approach is very general and can be seamlessly applied on top of any segmentation architecture.
arXiv Detail & Related papers (2023-01-26T14:52:30Z)
- USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text Retrieval [115.28586222748478]
Image-Text Retrieval (ITR) aims at searching for the target instances that are semantically relevant to the given query from the other modality.
Existing approaches typically suffer from two major limitations.
arXiv Detail & Related papers (2023-01-17T12:42:58Z)
- COLO: A Contrastive Learning based Re-ranking Framework for One-Stage Summarization [84.70895015194188]
We propose a Contrastive Learning based re-ranking framework for one-stage summarization called COLO.
COLO boosts the extractive and abstractive results of one-stage systems on CNN/DailyMail benchmark to 44.58 and 46.33 ROUGE-1 score.
arXiv Detail & Related papers (2022-09-29T06:11:21Z)
- Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based Action Recognition [88.34182299496074]
Action labels are only available on a source dataset, but unavailable on a target dataset in the training stage.
We utilize a self-supervision scheme to reduce the domain shift between two skeleton-based action datasets.
By segmenting and permuting temporal segments or human body parts, we design two self-supervised learning classification tasks.
arXiv Detail & Related papers (2022-07-17T07:05:39Z)
- Novel Class Discovery in Semantic Segmentation [104.30729847367104]
We introduce a new setting of Novel Class Discovery in Semantic Segmentation (NCDSS).
It aims at segmenting unlabeled images containing new classes given prior knowledge from a labeled set of disjoint classes.
In NCDSS, we need to distinguish the objects and background, and to handle the existence of multiple classes within an image.
We propose the Entropy-based Uncertainty Modeling and Self-training (EUMS) framework to overcome noisy pseudo-labels.
arXiv Detail & Related papers (2021-12-03T13:31:59Z)
- Self-Distilled Self-Supervised Representation Learning [35.60243157730165]
State-of-the-art frameworks in self-supervised learning have recently shown that fully utilizing transformer-based models can lead to performance boost.
In our work, we further exploit this by allowing the intermediate representations to learn from the final layers via the contrastive loss.
Our method, Self-Distilled Self-Supervised Learning (SDSSL), outperforms competitive baselines (SimCLR, BYOL and MoCo v3) using ViT on various tasks and datasets.
arXiv Detail & Related papers (2021-11-25T07:52:36Z)
- Self-Supervised Class Incremental Learning [51.62542103481908]
Existing Class Incremental Learning (CIL) methods are based on a supervised classification framework sensitive to data labels.
When updating them based on the new class data, they suffer from catastrophic forgetting: the model cannot discern old class data clearly from the new.
In this paper, we explore the performance of Self-Supervised representation learning in Class Incremental Learning (SSCIL) for the first time.
arXiv Detail & Related papers (2021-11-18T06:58:19Z)
- SPLADE v2: Sparse Lexical and Expansion Model for Information Retrieval [11.38022203865326]
SPLADE model provides highly sparse representations and competitive results with respect to state-of-the-art dense and sparse approaches.
We modify the pooling mechanism, benchmark a model solely based on document expansion, and introduce models trained with distillation.
Overall, SPLADE is considerably improved with more than 9% gains on NDCG@10 on TREC DL 2019, leading to state-of-the-art results on the BEIR benchmark.
arXiv Detail & Related papers (2021-09-21T10:43:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.