SetCSE: Set Operations using Contrastive Learning of Sentence Embeddings
- URL: http://arxiv.org/abs/2404.17606v1
- Date: Thu, 25 Apr 2024 02:05:30 GMT
- Title: SetCSE: Set Operations using Contrastive Learning of Sentence Embeddings
- Authors: Kang Liu
- Abstract summary: SetCSE employs sets to represent complex semantics and incorporates well-defined operations for structured information querying.
We introduce an inter-set contrastive learning objective to improve sentence embedding models' comprehension of the given semantics.
We demonstrate that SetCSE adheres to the conventions of human language expressions regarding compounded semantics.
- Score: 6.988934943372354
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Taking inspiration from Set Theory, we introduce SetCSE, an innovative information retrieval framework. SetCSE employs sets to represent complex semantics and incorporates well-defined operations for structured information querying under the provided context. Within this framework, we introduce an inter-set contrastive learning objective to enhance comprehension of sentence embedding models concerning the given semantics. Furthermore, we present a suite of operations, including SetCSE intersection, difference, and operation series, that leverage sentence embeddings of the enhanced model for complex sentence retrieval tasks. Throughout this paper, we demonstrate that SetCSE adheres to the conventions of human language expressions regarding compounded semantics, provides a significant enhancement in the discriminatory capability of underlying sentence embedding models, and enables numerous information retrieval tasks involving convoluted and intricate prompts which cannot be achieved using existing querying methods.
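To make the framework concrete, below is a minimal, illustrative sketch of the two ideas the abstract names: an inter-set contrastive objective and embedding-based set operations (intersection and difference). It is not the authors' released implementation; the encoder (sentence-transformers), the mean-cosine set scoring, and the InfoNCE-style loss are all assumptions chosen for illustration, and the paper's exact formulations may differ.

```python
# Illustrative sketch only: a SetCSE-style workflow under assumed design choices
# (generic encoder, mean-cosine set scoring, InfoNCE-style inter-set loss).
import numpy as np
from sentence_transformers import SentenceTransformer  # assumed encoder choice

model = SentenceTransformer("all-MiniLM-L6-v2")  # any sentence encoder works

def embed(sentences):
    """Encode sentences and L2-normalize so dot products are cosine similarities."""
    vecs = np.asarray(model.encode(sentences))
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)

def inter_set_contrastive_loss(set_a, set_b, tau=0.05):
    """Sketch of an inter-set objective: members of set A should embed closer
    to one another than to any member of set B (InfoNCE-style, forward pass only)."""
    pos = np.exp(set_a @ set_a.T / tau)  # intra-set similarities (positives)
    neg = np.exp(set_a @ set_b.T / tau)  # cross-set similarities (negatives)
    losses = []
    for i in range(len(set_a)):
        p = np.delete(pos[i], i).sum()   # exclude self-similarity
        losses.append(-np.log(p / (p + neg[i].sum())))
    return float(np.mean(losses))

def set_score(queries, semantic_set):
    """Mean cosine similarity of each query sentence to all members of a set."""
    return (queries @ embed(semantic_set).T).mean(axis=1)

def setcse_intersection(candidates, sets):
    """Rank candidates that express *all* given semantics (A ∩ B ∩ ...)."""
    q = embed(candidates)
    scores = sum(set_score(q, s) for s in sets)
    return sorted(zip(candidates, scores), key=lambda x: -x[1])

def setcse_difference(candidates, keep, exclude):
    """Rank candidates similar to `keep` but dissimilar to `exclude` (A - B)."""
    q = embed(candidates)
    scores = set_score(q, keep) - set_score(q, exclude)
    return sorted(zip(candidates, scores), key=lambda x: -x[1])
```

Chaining setcse_difference after setcse_intersection would give the "operation series" the abstract mentions; per the abstract, the encoder would first be fine-tuned with the inter-set objective on the example sets before the operations are applied.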
Related papers
- Clarifying Ambiguities: on the Role of Ambiguity Types in Prompting Methods for Clarification Generation [5.259846811078731]
We focus on the concept of ambiguity for clarification, seeking to model and integrate ambiguities in the clarification process.
We name this new prompting scheme Ambiguity Type-Chain of Thought (AT-CoT).
arXiv Detail & Related papers (2025-04-16T14:21:02Z)
- ECLAIR: Enhanced Clarification for Interactive Responses [10.954831867440332]
ECLAIR generates clarification questions for ambiguous user queries and resolves ambiguity based on the user's response.
We introduce a generalized architecture capable of integrating ambiguity information from multiple downstream agents.
We conduct experiments comparing ECLAIR to few-shot prompting techniques and demonstrate ECLAIR's superior performance in question generation and ambiguity resolution.
arXiv Detail & Related papers (2025-03-19T23:04:00Z)
- Pointer-Guided Pre-Training: Infusing Large Language Models with Paragraph-Level Contextual Awareness [3.2925222641796554]
"pointer-guided segment ordering" (SO) is a novel pre-training technique aimed at enhancing the contextual understanding of paragraph-level text representations.
Our experiments show that pointer-guided pre-training significantly enhances the model's ability to understand complex document structures.
arXiv Detail & Related papers (2024-06-06T15:17:51Z)
- Contextualization Distillation from Large Language Model for Knowledge Graph Completion [51.126166442122546]
We introduce the Contextualization Distillation strategy, a plug-in-and-play approach compatible with both discriminative and generative KGC frameworks.
Our method begins by instructing large language models to transform compact, structural triplets into context-rich segments.
Comprehensive evaluations across diverse datasets and KGC techniques highlight the efficacy and adaptability of our approach.
arXiv Detail & Related papers (2024-01-28T08:56:49Z)
- Prompt-based Logical Semantics Enhancement for Implicit Discourse Relation Recognition [4.7938839332508945]
We propose a Prompt-based Logical Semantics Enhancement (PLSE) method for Implicit Discourse Relation Recognition (IDRR).
Our method seamlessly injects knowledge relevant to discourse relation into pre-trained language models through prompt-based connective prediction.
Experimental results on the PDTB 2.0 and CoNLL16 datasets demonstrate that our method consistently outperforms current state-of-the-art models.
arXiv Detail & Related papers (2023-11-01T08:38:08Z)
- Bridging Continuous and Discrete Spaces: Interpretable Sentence Representation Learning via Compositional Operations [80.45474362071236]
It is unclear whether the compositional semantics of sentences can be directly reflected as compositional operations in the embedding space.
We propose InterSent, an end-to-end framework for learning interpretable sentence embeddings.
arXiv Detail & Related papers (2023-05-24T00:44:49Z)
- In-Context Probing: Toward Building Robust Classifiers via Probing Large Language Models [5.5089506884366735]
In this paper, we propose an alternative approach, which we term In-Context Probing (ICP).
Similar to in-context learning, we contextualize the representation of the input with an instruction, but instead of decoding the output prediction, we probe the contextualized representation to predict the label.
We show that ICP performs competitively with, or superior to, fine-tuning and can be particularly helpful for building classifiers on top of smaller models.
arXiv Detail & Related papers (2023-05-23T15:43:04Z)
- InfoCSE: Information-aggregated Contrastive Learning of Sentence Embeddings [61.77760317554826]
This paper proposes an information-aggregated contrastive learning framework for learning unsupervised sentence embeddings, termed InfoCSE.
We evaluate the proposed InfoCSE on several benchmark datasets with respect to the semantic textual similarity (STS) task.
Experimental results show that InfoCSE outperforms SimCSE by an average Spearman correlation of 2.60% on BERT-base and 1.77% on BERT-large.
arXiv Detail & Related papers (2022-10-08T15:53:19Z)
- Utterance Rewriting with Contrastive Learning in Multi-turn Dialogue [22.103162555263143]
We introduce contrastive learning and multi-task learning to jointly model the problem.
Our proposed model achieves state-of-the-art performance on several public datasets.
arXiv Detail & Related papers (2022-03-22T10:13:27Z)
- Set Representation Learning with Generalized Sliced-Wasserstein Embeddings [22.845403993200932]
We propose a geometrically interpretable framework for learning representations from set-structured data.
In particular, we treat elements of a set as samples from a probability measure and propose an exact Euclidean embedding for Generalized Sliced-Wasserstein distances.
We evaluate our proposed framework on multiple supervised and unsupervised set learning tasks and demonstrate its superiority over state-of-the-art set representation learning approaches.
arXiv Detail & Related papers (2021-03-05T19:00:34Z)
- Automated Concatenation of Embeddings for Structured Prediction [75.44925576268052]
We propose Automated Concatenation of Embeddings (ACE) to automate the process of finding better concatenations of embeddings for structured prediction tasks.
We follow strategies in reinforcement learning to optimize the parameters of the controller and compute the reward based on the accuracy of a task model.
arXiv Detail & Related papers (2020-10-10T14:03:20Z)
- Conversational Semantic Parsing [50.954321571100294]
Session-based properties such as co-reference resolution and context carryover are processed downstream in a pipelined system.
We release a new session-based, compositional task-oriented parsing dataset of 20k sessions consisting of 60k utterances.
We propose a new family of Seq2Seq models for the session-based parsing above, which achieve better or comparable performance to the current state-of-the-art on ATIS, SNIPS, TOP and DSTC2.
arXiv Detail & Related papers (2020-09-28T22:08:00Z)
- Multidirectional Associative Optimization of Function-Specific Word Representations [86.87082468226387]
We present a neural framework for learning associations between interrelated groups of words.
Our model induces a joint function-specific word vector space, where vectors of, e.g., plausible subject-verb-object (SVO) compositions lie close together.
The model retains information about word group membership even in the joint space and can therefore be applied effectively to a number of tasks that reason over the SVO structure.
arXiv Detail & Related papers (2020-05-11T17:07:20Z)