Related papers: Word Embedding with Neural Probabilistic Prior

URL: http://arxiv.org/abs/2309.11824v1
Date: Thu, 21 Sep 2023 06:54:32 GMT
Title: Word Embedding with Neural Probabilistic Prior
Authors: Shaogang Ren, Dingcheng Li, Ping Li
Abstract summary: We propose a probabilistic prior which can be seamlessly integrated with word embedding models. The structure of the proposed prior is simple and effective, and it can be easily implemented and flexibly plugged in.
Score: 24.893999575628452
License: http://creativecommons.org/licenses/by/4.0/
Abstract: To improve word representation learning, we propose a probabilistic prior which can be seamlessly integrated with word embedding models. Different from previous methods, word embedding is taken as a probabilistic generative model, and it enables us to impose a prior regularizing word representation learning. The proposed prior not only enhances the representation of embedding vectors but also improves the model's robustness and stability. The structure of the proposed prior is simple and effective, and it can be easily implemented and flexibly plugged in most existing word embedding models. Extensive experiments show the proposed method improves word representation on various tasks.

Related papers

Solvable Dynamics of Self-Supervised Word Embeddings and the Emergence of Analogical Reasoning [3.519547280344187]
We study a class of solvable contrastive self-supervised algorithms which we term quadratic word embedding models. Our solutions reveal that these models learn linear subspaces one at a time, each one incrementing the effective rank of the embeddings until model capacity is saturated. We use our dynamical theory to predict how and when models acquire the ability to complete analogies.
arXiv Detail & Related papers (2025-02-14T02:16:48Z)
Collapsed Language Models Promote Fairness [88.48232731113306]
We find that debiased language models exhibit collapsed alignment between token representations and word embeddings. We design a principled fine-tuning method that can effectively improve fairness in a wide range of debiasing methods.
arXiv Detail & Related papers (2024-10-06T13:09:48Z)
FUSE-ing Language Models: Zero-Shot Adapter Discovery for Prompt Optimization Across Tokenizers [55.2480439325792]
We propose FUSE, an approach to approximating an adapter layer that maps from one model's textual embedding space to another, even across different tokenizers. We show the efficacy of our approach via multi-objective optimization over vision-language and causal language models for image captioning and sentiment-based image captioning.
arXiv Detail & Related papers (2024-08-09T02:16:37Z)
Object Representations as Fixed Points: Training Iterative Refinement Algorithms with Implicit Differentiation [88.14365009076907]
Iterative refinement is a useful paradigm for representation learning. We develop an implicit differentiation approach that improves the stability and tractability of training.
arXiv Detail & Related papers (2022-07-02T10:00:35Z)
Probabilistic Embeddings with Laplacian Graph Priors [0.0]
We show that the model unifies several previously proposed embedding methods under one umbrella. We empirically show that our model matches the performance of previous models as special cases. We provide code as an implementation enabling flexible estimation in different settings.
arXiv Detail & Related papers (2022-03-25T13:33:51Z)
Imputing Out-of-Vocabulary Embeddings with LOVE Makes Language Models Robust with Little Cost [5.672132510411465]
State-of-the-art NLP systems represent inputs with word embeddings, but these are brittle when faced with Out-of-Vocabulary words. We follow the principle of mimick-like models to generate vectors for unseen words, by learning the behavior of pre-trained embeddings using only the surface form of words. We present a simple contrastive learning framework, LOVE, which extends the word representation of an existing pre-trained language model (such as BERT) and makes it robust to OOV with few additional parameters.
arXiv Detail & Related papers (2022-03-15T13:11:07Z)
Obtaining Better Static Word Embeddings Using Contextual Embedding Models [53.86080627007695]
Our proposed distillation method is a simple extension of CBOW-based training. As a side-effect, our approach also allows a fair comparison of both contextual and static embeddings.
arXiv Detail & Related papers (2021-06-08T12:59:32Z)
Denoising Word Embeddings by Averaging in a Shared Space [34.175826109538676]
We introduce a new approach for smoothing and improving the quality of word embeddings. We project all the models to a shared vector space using an efficient implementation of the Generalized Procrustes Analysis (GPA) procedure. As the new representations are more stable and reliable, there is a noticeable improvement in rare word evaluations.
arXiv Detail & Related papers (2021-06-05T19:49:02Z)
Accurate Word Representations with Universal Visual Guidance [55.71425503859685]
This paper proposes a visual representation method to explicitly enhance conventional word embedding with multiple-aspect senses from visual guidance. We build a small-scale word-image dictionary from a multimodal seed dataset where each word corresponds to diverse related images. Experiments on 12 natural language understanding and machine translation tasks further verify the effectiveness and the generalization capability of the proposed approach.
arXiv Detail & Related papers (2020-12-30T09:11:50Z)
PBoS: Probabilistic Bag-of-Subwords for Generalizing Word Embedding [16.531103175919924]
We look into the task of generalizing word embeddings: given a set of pre-trained word vectors over a finite vocabulary, the goal is to predict embedding vectors for out-of-vocabulary words. We propose a model, along with an efficient algorithm, that simultaneously models subword segmentation and computes subword-based compositional word embedding.
arXiv Detail & Related papers (2020-10-21T08:11:08Z)
Multiple Word Embeddings for Increased Diversity of Representation [15.279850826041066]
We show a technique that substantially and consistently improves performance over a strong baseline with negligible increase in run time. We analyze aspects of pre-trained embedding similarity and vocabulary coverage and find that the representational diversity is the driving force of why this technique works.
arXiv Detail & Related papers (2020-09-30T02:33:09Z)
Multiplex Word Embeddings for Selectional Preference Acquisition [70.33531759861111]
We propose a multiplex word embedding model, which can be easily extended according to various relations among words. Our model can effectively distinguish words with respect to different relations without introducing unnecessary sparseness.
arXiv Detail & Related papers (2020-01-09T04:47:14Z)

This list is automatically generated from the titles and abstracts of the papers on this site.