A Simple and Plug-and-play Method for Unsupervised Sentence
Representation Enhancement
- URL: http://arxiv.org/abs/2305.07824v1
- Date: Sat, 13 May 2023 02:43:59 GMT
- Title: A Simple and Plug-and-play Method for Unsupervised Sentence
Representation Enhancement
- Authors: Lingfeng Shen, Haiyun Jiang, Lemao Liu, Shuming Shi
- Abstract summary: RepAL is an extremely simple post-processing method that enhances sentence representations.
We show that RepAL requires no training and is a plug-and-play method that can be combined with most existing unsupervised sentence learning models.
- Score: 35.6803390044542
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generating proper sentence embeddings in an unsupervised way is
beneficial to semantic matching and retrieval problems in real-world scenarios.
This paper presents Representation ALchemy (RepAL), an extremely simple
post-processing method that enhances sentence representations. The basic idea
of RepAL is to de-emphasize redundant information in the sentence embeddings
generated by pre-trained models. Through comprehensive experiments, we show
that RepAL requires no training and is a plug-and-play method that can be
combined with most existing unsupervised sentence learning models. We also
conduct an in-depth analysis to better understand RepAL.
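The abstract leaves open how redundant information is identified and removed. Below is a minimal, hypothetical sketch of one training-free post-processing step in that spirit, assuming "redundant information" can be approximated by the corpus mean and the dominant shared direction(s) of the embeddings; the function name, shapes, and component count are illustrative and not taken from the paper.

```python
import numpy as np

def deemphasize_redundancy(embeddings: np.ndarray, n_components: int = 1) -> np.ndarray:
    """Hypothetical post-processing: remove a shared 'redundant' component.

    The corpus mean and the top principal direction(s) are treated as
    redundant information and projected out, leaving the more
    sentence-specific signal. This is an illustration, not the paper's
    actual RepAL procedure.
    """
    # Center the embeddings on the corpus mean.
    mean = embeddings.mean(axis=0, keepdims=True)
    centered = embeddings - mean

    # Estimate the dominant shared directions via SVD.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    top = vt[:n_components]                 # shape: (n_components, dim)

    # Project the dominant directions out of every embedding.
    return centered - centered @ top.T @ top

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in for embeddings produced by a pre-trained sentence encoder.
    embs = rng.normal(size=(100, 768))
    refined = deemphasize_redundancy(embs, n_components=2)
    print(refined.shape)  # (100, 768)
```

Because such a step only modifies the output vectors, it can be layered on top of the embeddings of any unsupervised sentence encoder, which is consistent with the plug-and-play claim.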
Related papers
- Detecting, Explaining, and Mitigating Memorization in Diffusion Models [49.438362005962375]
We introduce a straightforward yet effective method for detecting memorized prompts by inspecting the magnitude of text-conditional predictions.
Our proposed method seamlessly integrates without disrupting sampling algorithms, and delivers high accuracy even at the first generation step.
Building on our detection strategy, we unveil an explainable approach that shows the contribution of individual words or tokens to memorization.
arXiv Detail & Related papers (2024-07-31T16:13:29Z)
- Exploring Annotation-free Image Captioning with Retrieval-augmented Pseudo Sentence Generation [21.54093527562344]
We propose a new strategy where the prior knowledge from large pre-trained models (LPMs) is distilled and leveraged as supervision.
Specifically, we introduce Retrieval-augmented Pseudo Sentence Generation (RaPSG), which can efficiently retrieve highly relevant short region descriptions.
Experimental results indicate that our method outperforms SOTA captioning models across various settings.
arXiv Detail & Related papers (2023-07-27T10:16:13Z)
- Self-regulating Prompts: Foundational Model Adaptation without Forgetting [112.66832145320434]
We introduce a self-regularization framework for prompting called PromptSRC.
PromptSRC guides the prompts to optimize for both task-specific and task-agnostic general representations.
arXiv Detail & Related papers (2023-07-13T17:59:35Z)
- Alleviating Over-smoothing for Unsupervised Sentence Representation [96.19497378628594]
We present a Simple method named Self-Contrastive Learning (SSCL) to alleviate this issue.
Our proposed method is quite simple and can be easily extended to various state-of-the-art models for performance boosting.
arXiv Detail & Related papers (2023-05-09T11:00:02Z)
- Learning Non-Autoregressive Models from Search for Unsupervised Sentence Summarization [20.87460375478907]
Text summarization aims to generate a short summary for an input text.
In this work, we propose a Non-Autoregressive Unsupervised Summarization (NAUS) approach.
Experiments show that NAUS achieves state-of-the-art performance for unsupervised summarization.
arXiv Detail & Related papers (2022-05-28T21:09:23Z)
- Learning to Ask Conversational Questions by Optimizing Levenshtein Distance [83.53855889592734]
We introduce a Reinforcement Iterative Sequence Editing (RISE) framework that optimizes the minimum Levenshtein distance (MLD) through explicit editing actions.
RISE is able to pay attention to tokens that are related to conversational characteristics.
Experimental results on two benchmark datasets show that RISE significantly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-06-30T08:44:19Z)
- Predicting What You Already Know Helps: Provable Self-Supervised Learning [60.27658820909876]
Self-supervised representation learning solves auxiliary prediction tasks (known as pretext tasks) without requiring labeled data.
We show a mechanism exploiting the statistical connections between certain reconstruction-based pretext tasks that guarantees learning a good representation.
We prove that the linear layer yields a small approximation error even for a complex ground-truth function class.
arXiv Detail & Related papers (2020-08-03T17:56:13Z)
- Embodied Self-supervised Learning by Coordinated Sampling and Training [14.107020105091662]
We propose a novel self-supervised approach to solve inverse problems by employing the corresponding physical forward process.
The proposed approach works in an analysis-by-synthesis manner to learn an inference network by iteratively sampling and training.
We demonstrate the feasibility of the proposed method by tackling the acoustic-to-articulatory inversion problem to infer articulatory information from speech.
arXiv Detail & Related papers (2020-06-20T14:05:47Z)
- Iterative Edit-Based Unsupervised Sentence Simplification [30.128553647491817]
Our model is guided by a scoring function involving fluency, simplicity, and meaning preservation.
We iteratively perform word and phrase-level edits on the complex sentence.
arXiv Detail & Related papers (2020-06-17T03:53:12Z)