Related papers: Learning High-Quality and General-Purpose Phrase Representations

Learning High-Quality and General-Purpose Phrase Representations

URL: http://arxiv.org/abs/2401.10407v2
Date: Thu, 22 Feb 2024 13:46:56 GMT
Title: Learning High-Quality and General-Purpose Phrase Representations
Authors: Lihu Chen and Ga\"el Varoquaux and Fabian M. Suchanek
Abstract summary: Phrase representations play an important role in data science and natural language processing. Current state-of-the-art method involves fine-tuning pre-trained language models for phrasal embeddings. We propose an improved framework to learn phrase representations in a context-free fashion.
Score: 9.246374019271938
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Phrase representations play an important role in data science and natural language processing, benefiting various tasks like Entity Alignment, Record Linkage, Fuzzy Joins, and Paraphrase Classification. The current state-of-the-art method involves fine-tuning pre-trained language models for phrasal embeddings using contrastive learning. However, we have identified areas for improvement. First, these pre-trained models tend to be unnecessarily complex and require to be pre-trained on a corpus with context sentences. Second, leveraging the phrase type and morphology gives phrase representations that are both more precise and more flexible. We propose an improved framework to learn phrase representations in a context-free fashion. The framework employs phrase type classification as an auxiliary task and incorporates character-level information more effectively into the phrase representation. Furthermore, we design three granularities of data augmentation to increase the diversity of training samples. Our experiments across a wide range of tasks show that our approach generates superior phrase embeddings compared to previous methods while requiring a smaller model size. [PEARL-small]: https://huggingface.co/Lihuchen/pearl_small; [PEARL-base]: https://huggingface.co/Lihuchen/pearl_base; [Code and Dataset]: https://github.com/tigerchen52/PEARL

Related papers

Language Models for Text Classification: Is In-Context Learning Enough? [54.869097980761595]
Recent foundational language models have shown state-of-the-art performance in many NLP tasks in zero- and few-shot settings. An advantage of these models over more standard approaches is the ability to understand instructions written in natural language (prompts) This makes them suitable for addressing text classification problems for domains with limited amounts of annotated instances.
arXiv Detail & Related papers (2024-03-26T12:47:39Z)
Make Prompts Adaptable: Bayesian Modeling for Vision-Language Prompt Learning with Data-Dependent Prior [14.232144691524528]
Recent Vision-Language Pretrained models have become the backbone for many downstream tasks. MLE training can lead the context vector to over-fit dominant image features in the training data. This paper presents a Bayesian-based framework of prompt learning, which could alleviate the overfitting issues on few-shot learning application.
arXiv Detail & Related papers (2024-01-09T10:15:59Z)
EXnet: Efficient In-context Learning for Data-less Text classification [0.0]
We present EXnet, a model specifically designed to perform in-context learning without limitations on the number of examples. We argue that in-context learning is an effective method to increase task accuracy, and providing examples facilitates cross-task generalization. With extensive experiments, we show that even our smallest model (15M parameters) generalizes to several unseen classification tasks and domains.
arXiv Detail & Related papers (2023-05-24T01:40:57Z)
Efficient and Flexible Topic Modeling using Pretrained Embeddings and Bag of Sentences [1.8592384822257952]
We propose a novel topic modeling and inference algorithm. We leverage pre-trained sentence embeddings by combining generative process models and clustering. TheTailor evaluation shows that our method yields state-of-the art results with relatively little computational demands.
arXiv Detail & Related papers (2023-02-06T20:13:11Z)
Language Model Pre-Training with Sparse Latent Typing [66.75786739499604]
We propose a new pre-training objective, Sparse Latent Typing, which enables the model to sparsely extract sentence-level keywords with diverse latent types. Experimental results show that our model is able to learn interpretable latent type categories in a self-supervised manner without using any external knowledge.
arXiv Detail & Related papers (2022-10-23T00:37:08Z)
SLM: Learning a Discourse Language Representation with Sentence Unshuffling [53.42814722621715]
We introduce Sentence-level Language Modeling, a new pre-training objective for learning a discourse language representation. We show that this feature of our model improves the performance of the original BERT by large margins.
arXiv Detail & Related papers (2020-10-30T13:33:41Z)
Grounded Compositional Outputs for Adaptive Language Modeling [59.02706635250856]
A language model's vocabulary$-$typically selected before training and permanently fixed later$-$affects its size. We propose a fully compositional output embedding layer for language models. To our knowledge, the result is the first word-level language model with a size that does not depend on the training vocabulary.
arXiv Detail & Related papers (2020-09-24T07:21:14Z)
InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training [135.12061144759517]
We present an information-theoretic framework that formulates cross-lingual language model pre-training. We propose a new pre-training task based on contrastive learning. By leveraging both monolingual and parallel corpora, we jointly train the pretext to improve the cross-lingual transferability of pre-trained models.
arXiv Detail & Related papers (2020-07-15T16:58:01Z)
DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations [4.36561468436181]
We present DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations. Our approach closes the performance gap between unsupervised and supervised pretraining for universal sentence encoders. Our code and pretrained models are publicly available and can be easily adapted to new domains or used to embed unseen text.
arXiv Detail & Related papers (2020-06-05T20:00:28Z)
A Simple Joint Model for Improved Contextual Neural Lemmatization [60.802451210656805]
We present a simple joint neural model for lemmatization and morphological tagging that achieves state-of-the-art results on 20 languages. Our paper describes the model in addition to training and decoding procedures.
arXiv Detail & Related papers (2019-04-04T02:03:19Z)

This list is automatically generated from the titles and abstracts of the papers in this site.