Representing ELMo embeddings as two-dimensional text online
- URL: http://arxiv.org/abs/2103.16414v1
- Date: Tue, 30 Mar 2021 15:12:29 GMT
- Title: Representing ELMo embeddings as two-dimensional text online
- Authors: Andrey Kutuzov and Elizaveta Kuzmenko
- Abstract summary: We describe a new addition to the WebVectors toolkit, which is used to serve word embedding models over the Web.
The new ELMoViz module adds support for contextualized embedding architectures, in particular for ELMo models.
The provided visualizations follow the metaphor of `two-dimensional text' by showing lexical substitutes: words which are most semantically similar in context to the words of the input sentence.
- Score: 5.1525653500591995
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We describe a new addition to the WebVectors toolkit which is used to serve
word embedding models over the Web. The new ELMoViz module adds support for
contextualized embedding architectures, in particular for ELMo models. The
provided visualizations follow the metaphor of `two-dimensional text' by
showing lexical substitutes: words which are most semantically similar in
context to the words of the input sentence. The system allows the user to
change the ELMo layers from which token embeddings are inferred. It also
conveys corpus information about the query words and their lexical substitutes
(namely their frequency tiers and parts of speech). The module is well
integrated into the rest of the WebVectors toolkit, providing lexical
hyperlinks to word representations in static embedding models. Two web services
have already implemented the new functionality with pre-trained ELMo models for
Russian, Norwegian and English.
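The core operation behind such lexical substitutes, ranking a vocabulary by cosine similarity to a token's in-context vector, can be sketched as follows. This is a toy illustration with random vectors, not the WebVectors/ELMoViz implementation; the function name and data are hypothetical:

```python
import numpy as np

def lexical_substitutes(token_vec, vocab_vecs, vocab_words, top_n=3):
    """Rank vocabulary words by cosine similarity to a contextual token vector."""
    # Normalize so that dot products equal cosine similarities.
    v = token_vec / np.linalg.norm(token_vec)
    M = vocab_vecs / np.linalg.norm(vocab_vecs, axis=1, keepdims=True)
    sims = M @ v
    order = np.argsort(-sims)[:top_n]
    return [(vocab_words[i], float(sims[i])) for i in order]

rng = np.random.default_rng(0)
vocab_words = ["bank", "river", "money", "shore", "loan"]
vocab_vecs = rng.normal(size=(5, 8))
# A "contextual" vector close to the vector of "river": in a real system this
# would come from an ELMo layer for a token in its sentence.
token_vec = vocab_vecs[1] + 0.1 * rng.normal(size=8)
print(lexical_substitutes(token_vec, vocab_vecs, vocab_words))
```

In the actual module the token vector would be taken from a user-selected ELMo layer, which is what makes the substitute list context-sensitive.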
Related papers
- OVMR: Open-Vocabulary Recognition with Multi-Modal References [96.21248144937627]
Existing works have proposed different methods to embed category cues into the model, e.g., through few-shot fine-tuning.
This paper tackles open-vocabulary recognition from a different perspective by referring to multi-modal clues composed of textual descriptions and exemplar images.
The proposed OVMR is a plug-and-play module, and works well with exemplar images randomly crawled from the Internet.
arXiv Detail & Related papers (2024-06-07T06:45:28Z) - Ranni: Taming Text-to-Image Diffusion for Accurate Instruction Following [59.997857926808116]
We introduce a semantic panel as the middleware in decoding texts to images.
The panel is obtained through arranging the visual concepts parsed from the input text.
We develop a practical system and showcase its potential in continuous generation and chatting-based editing.
arXiv Detail & Related papers (2023-11-28T17:57:44Z) - TextDiffuser-2: Unleashing the Power of Language Models for Text Rendering [118.30923824681642]
TextDiffuser-2 aims to unleash the power of language models for text rendering.
We utilize the language model within the diffusion model to encode the position and texts at the line level.
We conduct extensive experiments and incorporate user studies involving human participants as well as GPT-4V.
arXiv Detail & Related papers (2023-11-28T04:02:40Z) - Vec2Gloss: definition modeling leveraging contextualized vectors with Wordnet gloss [8.741676279851728]
We propose a `Vec2Gloss' model, which produces the gloss from the target word's contextualized embeddings.
The generated glosses of this study are made possible by the systematic gloss patterns provided by Chinese Wordnet.
Our results indicate that the proposed `Vec2Gloss' model opens a new perspective to the lexical-semantic applications of contextualized embeddings.
arXiv Detail & Related papers (2023-05-29T02:37:37Z) - I2MVFormer: Large Language Model Generated Multi-View Document Supervision for Zero-Shot Image Classification [108.83932812826521]
Large Language Models (LLMs) trained on web-scale text show impressive abilities to repurpose their learned knowledge for a multitude of tasks.
Our proposed model, I2MVFormer, learns multi-view semantic embeddings for zero-shot image classification with these class views.
I2MVFormer establishes a new state-of-the-art on three public benchmark datasets for zero-shot image classification with unsupervised semantic embeddings.
arXiv Detail & Related papers (2022-12-05T14:11:36Z) - Learning to Collocate Visual-Linguistic Neural Modules for Image Captioning [80.59607794927363]
We propose a novel image captioner: learning to Collocate Visual-Linguistic Neural Modules (CVLNM).
Unlike the widely used neural module networks in VQA, the task of collocating visual-linguistic modules is more challenging.
Experiments on the MS-COCO dataset show that our CVLNM is more effective, achieving a new state-of-the-art 129.5 CIDEr-D, and more robust.
arXiv Detail & Related papers (2022-10-04T03:09:50Z) - Knowing Where and What: Unified Word Block Pretraining for Document Understanding [11.46378901674016]
We propose UTel, a language model with Unified TExt and layout pre-training.
Specifically, we propose two pre-training tasks: Surrounding Word Prediction (SWP) for the layout learning, and Contrastive learning of Word Embeddings (CWE) for identifying different word blocks.
In this way, the joint training of Masked Layout-Language Modeling (MLLM) and two newly proposed tasks enables the interaction between semantic and spatial features in a unified way.
arXiv Detail & Related papers (2022-07-28T09:43:06Z) - Using Word Embeddings to Analyze Protests News [2.024222101808971]
Two well-performing models have been chosen to replace the existing word embeddings, word2vec and FastText, with ELMo and DistilBERT.
Unlike bag of words or earlier vector approaches, ELMo and DistilBERT represent words as a sequence of vectors by capturing the meaning based on contextual information in the text.
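The key contrast described here, one fixed vector per word versus context-dependent vectors, can be illustrated with a toy sketch. This is not ELMo or DistilBERT; the "contextual" encoder below is a stand-in that simply mixes a word's vector with its sentence average, and all names are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)
VOCAB = {w: rng.normal(size=4) for w in ["bank", "river", "money", "deposit", "fishing"]}

def static_embed(word, sentence):
    # Static models (word2vec, FastText): one vector per word, context ignored.
    return VOCAB[word]

def contextual_embed(word, sentence):
    # Toy stand-in for a contextual encoder: the token vector is mixed with
    # the average of its sentence, so it shifts with the surrounding words.
    ctx = np.mean([VOCAB[w] for w in sentence], axis=0)
    return 0.5 * VOCAB[word] + 0.5 * ctx

s1 = ["bank", "river", "fishing"]
s2 = ["bank", "money", "deposit"]
same_static = np.allclose(static_embed("bank", s1), static_embed("bank", s2))
same_ctx = np.allclose(contextual_embed("bank", s1), contextual_embed("bank", s2))
print(same_static, same_ctx)
```

The static lookup returns identical vectors for "bank" in both sentences, while the context-mixing encoder does not, which is the property that makes contextual models useful for disambiguation.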
arXiv Detail & Related papers (2022-03-11T12:25:59Z) - Modelling the semantics of text in complex document layouts using graph transformer networks [0.0]
We propose a model that approximates the human reading pattern of a document and outputs a unique semantic representation for every text span.
We base our architecture on a graph representation of the structured text, and we demonstrate that not only can we retrieve semantically similar information across documents but also that the embedding space we generate captures useful semantic information.
arXiv Detail & Related papers (2022-02-18T11:49:06Z) - Video-Text Pre-training with Learned Regions [59.30893505895156]
Video-Text pre-training aims at learning transferable representations from large-scale video-text pairs.
We propose a module for video-text learning, RegionLearner, which can take into account the structure of objects during pre-training on large-scale video-text pairs.
arXiv Detail & Related papers (2021-12-02T13:06:53Z) - Augmenting semantic lexicons using word embeddings and transfer learning [1.101002667958165]
We propose two models for predicting sentiment scores to augment semantic lexicons at a relatively low cost using word embeddings and transfer learning.
Our evaluation shows both models are able to score new words with a similar accuracy to reviewers from Amazon Mechanical Turk, but at a fraction of the cost.
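One common way to augment a lexicon with embeddings, scoring an unseen word from its nearest labelled neighbours in embedding space, can be sketched as below. This is a generic k-NN illustration under assumed toy data, not the two models proposed in the paper; all names are hypothetical:

```python
import numpy as np

def knn_sentiment(word_vec, seed_vecs, seed_scores, k=2):
    """Score an out-of-lexicon word by averaging the sentiment scores of its
    k nearest seed-lexicon words under cosine similarity."""
    v = word_vec / np.linalg.norm(word_vec)
    S = seed_vecs / np.linalg.norm(seed_vecs, axis=1, keepdims=True)
    sims = S @ v
    nearest = np.argsort(-sims)[:k]
    return float(np.mean(seed_scores[nearest]))

rng = np.random.default_rng(2)
pos = rng.normal(loc=2.0, size=(3, 6))   # seed words labelled positive (+1)
neg = rng.normal(loc=-2.0, size=(3, 6))  # seed words labelled negative (-1)
seed_vecs = np.vstack([pos, neg])
seed_scores = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])
new_word = rng.normal(loc=2.0, size=6)   # unseen word near the positive region
print(knn_sentiment(new_word, seed_vecs, seed_scores))
```

A transfer-learning approach would instead fine-tune a pretrained encoder on the seed scores, but the nearest-neighbour view captures why embedding geometry makes cheap lexicon expansion possible at all.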
arXiv Detail & Related papers (2021-09-18T20:59:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.