Out-of-the-Box Conditional Text Embeddings from Large Language Models
- URL: http://arxiv.org/abs/2504.16411v1
- Date: Wed, 23 Apr 2025 04:27:15 GMT
- Title: Out-of-the-Box Conditional Text Embeddings from Large Language Models
- Authors: Kosuke Yamada, Peinan Zhang,
- Abstract summary: PonTE is a novel unsupervised conditional text embedding method.<n>We show that PonTE can generate useful conditional text embeddings and achieve performance comparable to supervised methods without fine-tuning.
- Score: 7.228724224384926
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conditional text embedding is a proposed representation that captures the shift in perspective on texts when conditioned on a specific aspect. Previous methods have relied on extensive training data for fine-tuning models, leading to challenges in terms of labor and resource costs. We propose PonTE, a novel unsupervised conditional text embedding method that leverages a causal large language model and a conditional prompt. Through experiments on conditional semantic text similarity and text clustering, we demonstrate that PonTE can generate useful conditional text embeddings and achieve performance comparable to supervised methods without fine-tuning. We also show the interpretability of text embeddings with PonTE by analyzing word generation following prompts and embedding visualization.
Related papers
- TS-HTFA: Advancing Time Series Forecasting via Hierarchical Text-Free Alignment with Large Language Models [14.411646409316624]
We introduce textbfHierarchical textbfText-textbfFree textbfAlignment (textbfTS-HTFA), a novel method for time-series forecasting.<n>We replace paired text data with adaptive virtual text based on QR decomposition word embeddings and learnable prompt.<n>Experiments on multiple time-series benchmarks demonstrate that HTFA achieves state-of-the-art performance.
arXiv Detail & Related papers (2024-09-23T12:57:24Z) - Enhancing Scene Text Detectors with Realistic Text Image Synthesis Using
Diffusion Models [63.99110667987318]
We present DiffText, a pipeline that seamlessly blends foreground text with the background's intrinsic features.
With fewer text instances, our produced text images consistently surpass other synthetic data in aiding text detectors.
arXiv Detail & Related papers (2023-11-28T06:51:28Z) - TextFormer: A Query-based End-to-End Text Spotter with Mixed Supervision [61.186488081379]
We propose TextFormer, a query-based end-to-end text spotter with Transformer architecture.
TextFormer builds upon an image encoder and a text decoder to learn a joint semantic understanding for multi-task modeling.
It allows for mutual training and optimization of classification, segmentation, and recognition branches, resulting in deeper feature sharing.
arXiv Detail & Related papers (2023-06-06T03:37:41Z) - Towards Unified Scene Text Spotting based on Sequence Generation [4.437335677401287]
We propose a UNIfied scene Text Spotter, called UNITS.
Our model unifies various detection formats, including quadrilaterals and polygons.
We apply starting-point prompting to enable the model to extract texts from an arbitrary starting point.
arXiv Detail & Related papers (2023-04-07T01:28:08Z) - Text Revision by On-the-Fly Representation Optimization [76.11035270753757]
Current state-of-the-art methods formulate these tasks as sequence-to-sequence learning problems.
We present an iterative in-place editing approach for text revision, which requires no parallel data.
It achieves competitive and even better performance than state-of-the-art supervised methods on text simplification.
arXiv Detail & Related papers (2022-04-15T07:38:08Z) - Language modeling via stochastic processes [30.796382023812022]
Modern language models can generate high-quality short texts, but often meander or are incoherent when generating longer texts.
Recent work in self-supervised learning suggests that models can learn good latent representations via contrastive learning.
We propose one approach for leveraging constrastive representations, which we call Time Control.
arXiv Detail & Related papers (2022-03-21T22:13:53Z) - CORE-Text: Improving Scene Text Detection with Contrastive Relational
Reasoning [65.57338873921168]
Localizing text instances in natural scenes is regarded as a fundamental challenge in computer vision.
In this work, we quantitatively analyze the sub-text problem and present a simple yet effective design, COntrastive RElation (CORE) module.
We integrate the CORE module into a two-stage text detector of Mask R-CNN and devise our text detector CORE-Text.
arXiv Detail & Related papers (2021-12-14T16:22:25Z) - Improving Text Generation with Student-Forcing Optimal Transport [122.11881937642401]
We propose using optimal transport (OT) to match the sequences generated in training and testing modes.
An extension is also proposed to improve the OT learning, based on the structural and contextual information of the text sequences.
The effectiveness of the proposed method is validated on machine translation, text summarization, and text generation tasks.
arXiv Detail & Related papers (2020-10-12T19:42:25Z) - Improving Disentangled Text Representation Learning with
Information-Theoretic Guidance [99.68851329919858]
discrete nature of natural language makes disentangling of textual representations more challenging.
Inspired by information theory, we propose a novel method that effectively manifests disentangled representations of text.
Experiments on both conditional text generation and text-style transfer demonstrate the high quality of our disentangled representation.
arXiv Detail & Related papers (2020-06-01T03:36:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.