Related papers: MetricBERT: Text Representation Learning via Self-Supervised Triplet Training

URL: http://arxiv.org/abs/2208.06610v1
Date: Sat, 13 Aug 2022 09:52:58 GMT
Title: MetricBERT: Text Representation Learning via Self-Supervised Triplet Training
Authors: Itzik Malkiel, Dvir Ginzburg, Oren Barkan, Avi Caciularu, Yoni Weill, Noam Koenigstein
Abstract summary: MetricBERT learns to embed text under a well-defined similarity metric. We show that MetricBERT outperforms state-of-the-art alternatives, sometimes by a substantial margin.
Score: 26.66640112616559
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We present MetricBERT, a BERT-based model that learns to embed text under a well-defined similarity metric while simultaneously adhering to the "traditional" masked-language task. We focus on downstream tasks of learning similarities for recommendations where we show that MetricBERT outperforms state-of-the-art alternatives, sometimes by a substantial margin. We conduct extensive evaluations of our method and its different variants, showing that our training objective is highly beneficial over a traditional contrastive loss, a standard cosine similarity objective, and six other baselines. As an additional contribution, we publish a dataset of video games descriptions along with a test set of similarity annotations crafted by a domain expert.

Related papers

Evaluating Spatiotemporal Consistency in Automatically Generated Sewing Instructions [51.362705361059795]
In this paper, we propose a metric for evaluating the soundness of sewing instructions. We show that our proposed metric better correlates with manually-annotated error counts as well as human quality ratings.
arXiv Detail & Related papers (2025-09-29T13:46:27Z)
MEBench: A Novel Benchmark for Understanding Mutual Exclusivity Bias in Vision-Language Models [27.516184838635414]
This paper introduces MEBench, a novel benchmark for evaluating mutual exclusivity (ME) bias. Unlike traditional ME tasks, MEBench further incorporates spatial reasoning to create more challenging and realistic evaluation settings. We assess the performance of state-of-the-art vision-language models (VLMs) on this benchmark using novel evaluation metrics that capture key aspects of ME-based reasoning.
arXiv Detail & Related papers (2025-05-26T15:23:18Z)
Deep Boosting Learning: A Brand-new Cooperative Approach for Image-Text Matching [53.05954114863596]
We propose a brand-new Deep Boosting Learning (DBL) algorithm for image-text matching. An anchor branch is first trained to provide insights into the data properties. A target branch is concurrently tasked with more adaptive margin constraints to further enlarge the relative distance between matched and unmatched samples.
arXiv Detail & Related papers (2024-04-28T08:44:28Z)
AspectCSE: Sentence Embeddings for Aspect-based Semantic Textual Similarity Using Contrastive Learning and Structured Knowledge [4.563449647618151]
We present AspectCSE, an approach for aspect-based contrastive learning of sentence embeddings. We demonstrate that multi-aspect embeddings outperform single-aspect embeddings on aspect-specific information retrieval tasks.
arXiv Detail & Related papers (2023-07-15T17:01:56Z)
Multi-Similarity Contrastive Learning [4.297070083645049]
We propose a novel multi-similarity contrastive loss (MSCon) that learns generalizable embeddings by jointly utilizing supervision from multiple metrics of similarity. Our method automatically learns contrastive similarity weightings based on the uncertainty in the corresponding similarity. We show empirically that networks trained with MSCon outperform state-of-the-art baselines on in-domain and out-of-domain settings.
arXiv Detail & Related papers (2023-07-06T01:26:01Z)
Beyond Contrastive Learning: A Variational Generative Model for Multilingual Retrieval [109.62363167257664]
We propose a generative model for learning multilingual text embeddings. Our model operates on parallel data in $N$ languages. We evaluate this method on a suite of tasks including semantic similarity, bitext mining, and cross-lingual question retrieval.
arXiv Detail & Related papers (2022-12-21T02:41:40Z)
Emotions are Subtle: Learning Sentiment Based Text Representations Using Contrastive Learning [6.6389732792316005]
We extend the use of contrastive learning embeddings to sentiment analysis tasks. We show that fine-tuning on these embeddings provides an improvement over fine-tuning on BERT-based embeddings.
arXiv Detail & Related papers (2021-12-02T08:29:26Z)
Pre-training Language Model Incorporating Domain-specific Heterogeneous Knowledge into A Unified Representation [49.89831914386982]
We propose a unified pre-trained language model (PLM) for all forms of text, including unstructured text, semi-structured text, and well-structured text. Our approach outperforms the pre-training of plain text using only 1/4 of the data.
arXiv Detail & Related papers (2021-09-02T16:05:24Z)
On Learning Text Style Transfer with Direct Rewards [101.97136885111037]
Lack of parallel corpora makes it impossible to directly train supervised models for the text style transfer task. We leverage semantic similarity metrics originally used for fine-tuning neural machine translation models. Our model provides significant gains in both automatic and human evaluation over strong baselines.
arXiv Detail & Related papers (2020-10-24T04:30:02Z)
RecoBERT: A Catalog Language Model for Text-Based Recommendations [32.40792615018446]
RecoBERT is a BERT-based approach for learning catalog-specialized language models for text-based item recommendations. We introduce a new language understanding task for wine recommendations using similarities based on professional wine reviews.
arXiv Detail & Related papers (2020-09-25T14:23:38Z)
Memory-Augmented Relation Network for Few-Shot Learning [114.47866281436829]
In this work, we investigate a new metric-learning method, Memory-Augmented Relation Network (MRN). In MRN, we choose the samples that are visually similar from the working context, and perform weighted information propagation to attentively aggregate helpful information from chosen ones to enhance its representation. We empirically demonstrate that MRN yields significant improvement over its ancestor and achieves competitive or even better performance when compared with other few-shot learning approaches.
arXiv Detail & Related papers (2020-05-09T10:09:13Z)
DiVA: Diverse Visual Feature Aggregation for Deep Metric Learning [83.48587570246231]
Visual Similarity plays an important role in many computer vision applications. Deep metric learning (DML) is a powerful framework for learning such similarities. We propose and study multiple complementary learning tasks, targeting conceptually different data relationships. We learn a single model to aggregate their training signals, resulting in strong generalization and state-of-the-art performance.
arXiv Detail & Related papers (2020-04-28T12:26:50Z)
Evaluating Online Continual Learning with CALM [3.49781504808707]
Online Continual Learning studies learning over a continuous data stream without observing any single example more than once. We propose a new benchmark for OCL based on language modelling in which input alternates between different languages and domains without any explicit delimitation. We also propose new metrics to study catastrophic forgetting in this setting and evaluate multiple baseline models based on compositions of experts.
arXiv Detail & Related papers (2020-04-07T13:17:05Z)

This list is automatically generated from the titles and abstracts of the papers on this site.