An Unsupervised Sentence Embedding Method by Mutual Information
Maximization
- URL: http://arxiv.org/abs/2009.12061v2
- Date: Fri, 5 Feb 2021 03:15:25 GMT
- Title: An Unsupervised Sentence Embedding Method by Mutual Information
Maximization
- Authors: Yan Zhang, Ruidan He, Zuozhu Liu, Kwan Hui Lim, Lidong Bing
- Abstract summary: BERT is inefficient for sentence-pair tasks such as clustering or semantic search, and Sentence BERT (SBERT) requires high-quality labeled sentence pairs for training.
We propose a lightweight extension on top of BERT and a novel self-supervised learning objective based on mutual information maximization.
Our method is not restricted by the availability of labeled data, so it can be applied to different domain-specific corpora.
- Score: 34.947950543830686
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: BERT is inefficient for sentence-pair tasks such as clustering or semantic
search, as it needs to evaluate combinatorially many sentence pairs, which is
very time-consuming. Sentence BERT (SBERT) attempted to solve this challenge by
learning semantically meaningful representations of single sentences, such that
similarity comparisons can be computed easily. However, SBERT is trained on a
corpus with high-quality labeled sentence pairs, which limits its application
to tasks where labeled data is extremely scarce. In this paper, we propose a
lightweight extension on top of BERT and a novel self-supervised learning
objective based on mutual information maximization strategies to derive
meaningful sentence embeddings in an unsupervised manner. Unlike SBERT, our
method is not restricted by the availability of labeled data, so it can be
applied to different domain-specific corpora. Experimental results show that
the proposed method significantly outperforms other unsupervised sentence
embedding baselines on common semantic textual similarity (STS) tasks and
downstream supervised tasks. It also outperforms SBERT in a setting where
in-domain labeled data is not available, and achieves performance competitive
with supervised methods on various tasks.
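As a rough illustration of the objective described above, the following is a minimal, hypothetical PyTorch sketch of a Jensen-Shannon-style mutual information estimator between each sentence's pooled (global) representation and its token-level (local) BERT features, with the other sentences in the batch supplying negative pairs. The linear head, the pooling choice, and all names are illustrative assumptions, not the authors' released implementation.

```python
# Hypothetical sketch only: a Jensen-Shannon-style MI objective between a
# sentence's pooled (global) feature and its token-level (local) features,
# using other sentences in the batch as negatives. The paper's actual head,
# pooling, and estimator details may differ.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")
head = torch.nn.Linear(768, 768)  # stand-in for the lightweight extension on top of BERT

def encode(sentences):
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    local = head(bert(**batch).last_hidden_state)           # (B, T, H) token-level features
    mask = batch["attention_mask"].unsqueeze(-1).float()    # ignore padding when pooling
    global_ = (local * mask).sum(dim=1) / mask.sum(dim=1)   # (B, H) sentence-level features
    return local, global_, batch["attention_mask"].bool()

def jsd_mi_loss(local, global_, token_mask):
    B = global_.size(0)
    # Score every sentence-level feature against every token-level feature: (B, B, T).
    scores = torch.einsum("bh,cth->bct", global_, local)
    valid = token_mask.unsqueeze(0).expand(B, -1, -1)                  # drop padding tokens
    same = torch.eye(B, dtype=torch.bool).unsqueeze(-1).expand_as(scores)
    pos = scores[same & valid]    # a sentence paired with its own tokens
    neg = scores[~same & valid]   # a sentence paired with other sentences' tokens
    # Negative Jensen-Shannon MI lower bound (Deep InfoMax form): minimizing this
    # pushes positive scores up and negative scores down.
    return F.softplus(-pos).mean() + F.softplus(neg).mean()

# One illustrative training step over a batch of raw, unlabeled sentences.
optimizer = torch.optim.AdamW(list(bert.parameters()) + list(head.parameters()), lr=2e-5)
local, global_, token_mask = encode(["a sentence", "another unlabeled sentence"])
loss = jsd_mi_loss(local, global_, token_mask)
loss.backward()
optimizer.step()
```

Because the objective needs only raw sentences, the same loop can be run directly on a domain-specific corpus without any labeled pairs.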
Related papers
- Revisiting Sparse Retrieval for Few-shot Entity Linking [33.15662306409253]
We propose an ELECTRA-based keyword extractor to denoise the mention context and construct a better query expression.
For training the extractor, we propose a distant supervision method to automatically generate training data based on overlapping tokens between mention contexts and entity descriptions.
Experimental results on the ZESHEL dataset demonstrate that the proposed method outperforms state-of-the-art models by a significant margin across all test domains.
arXiv Detail & Related papers (2023-10-19T03:51:10Z)
- M-Tuning: Prompt Tuning with Mitigated Label Bias in Open-Set Scenarios [103.6153593636399]
We propose a vision-language prompt tuning method with mitigated label bias (M-Tuning).
It introduces open words from WordNet to extend the prompt texts beyond the closed-set label words, so that prompts are tuned in a simulated open-set scenario.
Our method achieves the best performance on datasets with various scales, and extensive ablation studies also validate its effectiveness.
arXiv Detail & Related papers (2023-03-09T09:05:47Z)
- Relational Sentence Embedding for Flexible Semantic Matching [86.21393054423355]
We present Relational Sentence Embedding (RSE), a new paradigm for further discovering the potential of sentence embeddings.
RSE is effective and flexible in modeling sentence relations and outperforms a series of state-of-the-art embedding methods.
arXiv Detail & Related papers (2022-12-17T05:25:17Z)
- PromptBERT: Improving BERT Sentence Embeddings with Prompts [95.45347849834765]
We propose a prompt-based sentence embedding method that reduces token embedding biases and makes the original BERT layers more effective.
We also propose a novel unsupervised training objective based on template denoising, which substantially narrows the performance gap between the supervised and unsupervised settings.
Our fine-tuned method outperforms the state-of-the-art method SimCSE in both unsupervised and supervised settings.
arXiv Detail & Related papers (2022-01-12T06:54:21Z)
- MDERank: A Masked Document Embedding Rank Approach for Unsupervised Keyphrase Extraction [41.941098507759015]
Keyphrases are phrases in a document that provide a concise summary of its core content, helping readers quickly grasp what the article is about.
We propose a novel unsupervised keyphrase extraction method that leverages a BERT-based model to select and rank candidate keyphrases with a MASK strategy; a toy sketch of this masking-and-ranking idea appears after the list below.
arXiv Detail & Related papers (2021-10-13T11:29:17Z)
- ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer [19.643512923368743]
We present ConSERT, a Contrastive Framework for Self-Supervised Sentence Representation Transfer.
By making use of unlabeled texts, ConSERT solves the collapse issue of BERT-derived sentence representations.
Experiments on STS datasets demonstrate that ConSERT achieves an 8% relative improvement over the previous state-of-the-art.
arXiv Detail & Related papers (2021-05-25T08:15:01Z)
- WSSOD: A New Pipeline for Weakly- and Semi-Supervised Object Detection [75.80075054706079]
We propose a weakly- and semi-supervised object detection framework (WSSOD).
An agent detector is first trained on a joint dataset and then used to predict pseudo bounding boxes on weakly-annotated images.
The proposed framework demonstrates remarkable performance on the PASCAL-VOC and MSCOCO benchmarks, achieving results comparable to those obtained in fully-supervised settings.
arXiv Detail & Related papers (2021-05-21T11:58:50Z)
- TSDAE: Using Transformer-based Sequential Denoising Auto-Encoder for Unsupervised Sentence Embedding Learning [53.32740707197856]
We present a new state-of-the-art unsupervised method based on pre-trained Transformers and a Sequential Denoising Auto-Encoder (TSDAE).
It can achieve up to 93.1% of the performance of in-domain supervised approaches.
arXiv Detail & Related papers (2021-04-14T17:02:18Z)
- On the Sentence Embeddings from Pre-trained Language Models [78.45172445684126]
In this paper, we argue that the semantic information in the BERT embeddings is not fully exploited.
We find that BERT always induces a non-smooth anisotropic semantic space of sentences, which harms its performance on semantic similarity tasks.
We propose to transform the anisotropic sentence embedding distribution to a smooth and isotropic Gaussian distribution through normalizing flows that are learned with an unsupervised objective.
arXiv Detail & Related papers (2020-11-02T13:14:57Z)
- Exploring Cross-sentence Contexts for Named Entity Recognition with BERT [1.4998865865537996]
We present a study exploring the use of cross-sentence information for NER using BERT models in five languages.
We find that adding context in the form of additional sentences to BERT input increases NER performance on all of the tested languages and models.
We propose a straightforward method, Contextual Majority Voting (CMV), to combine the different predictions for a sentence, and demonstrate that it further increases NER performance with BERT.
arXiv Detail & Related papers (2020-06-02T12:34:52Z)
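For the MDERank entry above, here is a small, hypothetical sketch of the masking-and-ranking idea it describes: mask a candidate phrase, re-embed the document with BERT, and rank candidates by how far the masked document's embedding drifts from the original. Candidate generation, pooling, and the exact scoring used in the paper are assumptions here.

```python
# Hypothetical sketch only: rank candidate keyphrases by how much masking them
# changes the document embedding (a larger drop in similarity suggests a more
# important phrase). Candidate selection and pooling details are assumed.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased").eval()

@torch.no_grad()
def embed(text):
    batch = tokenizer(text, truncation=True, return_tensors="pt")
    hidden = bert(**batch).last_hidden_state      # (1, T, H)
    return hidden.mean(dim=1).squeeze(0)          # mean-pooled document embedding

def rank_candidates(document, candidates):
    doc_emb = embed(document)
    scored = []
    for phrase in candidates:
        # Replace the candidate phrase with a [MASK] token and re-embed.
        masked = document.replace(phrase, tokenizer.mask_token)
        sim = torch.cosine_similarity(doc_emb, embed(masked), dim=0).item()
        scored.append((phrase, sim))
    # Lower similarity after masking means the phrase carried more of the
    # document's meaning, so the best keyphrases come first when sorted ascending.
    return sorted(scored, key=lambda pair: pair[1])

# Example usage: candidates would normally come from noun-phrase extraction.
print(rank_candidates(
    "BERT learns contextual representations that transfer well to many NLP tasks.",
    ["contextual representations", "NLP tasks", "BERT"],
))
```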
This list is automatically generated from the titles and abstracts of the papers in this site.