Low Anisotropy Sense Retrofitting (LASeR) : Towards Isotropic and Sense
Enriched Representations
- URL: http://arxiv.org/abs/2104.10833v1
- Date: Thu, 22 Apr 2021 02:44:49 GMT
- Authors: Geetanjali Bihani and Julia Taylor Rayz
- Abstract summary: We analyze the representation geometry and find that most layers of deep pretrained language models create highly anisotropic representations.
We propose LASeR, a 'Low Anisotropy Sense Retrofitting' approach that renders off-the-shelf representations isotropic and semantically more meaningful.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Contextual word representation models have shown massive improvements on a
multitude of NLP tasks, yet their word sense disambiguation capabilities remain
poorly explained. To address this gap, we assess whether contextual word
representations extracted from deep pretrained language models create
distinguishable representations for different senses of a given word. We
analyze the representation geometry and find that most layers of deep
pretrained language models create highly anisotropic representations, pointing
towards the existence of the representation degeneration problem in contextual word
representations. After accounting for anisotropy, our study further reveals
that there is variability in sense learning capabilities across different
language models. Finally, we propose LASeR, a 'Low Anisotropy Sense
Retrofitting' approach that renders off-the-shelf representations isotropic and
semantically more meaningful, resolving the representation degeneration problem
as a post-processing step and enriching the senses of contextualized
representations extracted from deep neural language models.
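The anisotropy the abstract refers to can be made concrete with a standard diagnostic: the average pairwise cosine similarity between representation vectors, which is near zero for an isotropic space and close to one when vectors collapse into a narrow cone. The sketch below illustrates this diagnostic and one common isotropy-restoring post-processing step (mean-centering); it is a minimal illustration of the general idea, not the actual LASeR retrofitting method, and the synthetic data is an assumption for demonstration only.

```python
import numpy as np

def avg_cosine_similarity(embs):
    # Anisotropy diagnostic: mean cosine similarity over all distinct
    # pairs of vectors. Isotropic spaces score near 0; anisotropic
    # "cone-shaped" spaces score well above 0.
    normed = embs / np.linalg.norm(embs, axis=1, keepdims=True)
    sims = normed @ normed.T
    n = len(embs)
    # Average over off-diagonal pairs only (exclude self-similarity).
    return (sims.sum() - n) / (n * (n - 1))

def mean_center(embs):
    # A simple isotropy-restoring post-processing step: subtract the
    # mean vector, removing the dominant shared direction.
    return embs - embs.mean(axis=0, keepdims=True)

rng = np.random.default_rng(0)
# Simulate anisotropic representations: isotropic noise plus a large
# shared offset, mimicking the common-direction bias of deep LMs.
embs = rng.normal(size=(200, 64)) + 5.0
print(avg_cosine_similarity(embs))               # high (close to 1)
print(avg_cosine_similarity(mean_center(embs)))  # near 0
```

Mean-centering is only the simplest member of this family of post-processing fixes; LASeR's actual procedure additionally performs sense enrichment, which this sketch does not attempt.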
Related papers
- How well do distributed representations convey contextual lexical semantics: a Thesis Proposal [3.3585951129432323]
In this thesis, we examine the efficacy of distributed representations from modern neural networks in encoding lexical meaning.
We identify four sources of ambiguity based on the relatedness and similarity of meanings influenced by context.
We then aim to evaluate these sources by collecting or constructing multilingual datasets, leveraging various language models, and employing linguistic analysis tools.
arXiv Detail & Related papers (2024-06-02T14:08:51Z)
- Shades of meaning: Uncovering the geometry of ambiguous word representations through contextualised language models [6.760960482418417]
Lexical ambiguity presents a profound and enduring challenge to the language sciences.
Our work offers new insight into psychological understanding of lexical ambiguity through a series of simulations.
arXiv Detail & Related papers (2023-04-26T14:47:38Z)
- Testing the Ability of Language Models to Interpret Figurative Language [69.59943454934799]
Figurative and metaphorical language are commonplace in discourse.
It remains an open question to what extent modern language models can interpret nonliteral phrases.
We introduce Fig-QA, a Winograd-style nonliteral language understanding task.
arXiv Detail & Related papers (2022-04-26T23:42:22Z)
- SimAN: Exploring Self-Supervised Representation Learning of Scene Text via Similarity-Aware Normalization [66.35116147275568]
Self-supervised representation learning has drawn considerable attention from the scene text recognition community.
We tackle the issue by formulating the representation learning scheme in a generative manner.
We propose a Similarity-Aware Normalization (SimAN) module to identify the different patterns and align the corresponding styles from the guiding patch.
arXiv Detail & Related papers (2022-03-20T08:43:10Z)
- A Latent-Variable Model for Intrinsic Probing [93.62808331764072]
We propose a novel latent-variable formulation for constructing intrinsic probes.
We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
arXiv Detail & Related papers (2022-01-20T15:01:12Z)
- Accurate Word Representations with Universal Visual Guidance [55.71425503859685]
This paper proposes a visual representation method to explicitly enhance conventional word embedding with multiple-aspect senses from visual guidance.
We build a small-scale word-image dictionary from a multimodal seed dataset where each word corresponds to diverse related images.
Experiments on 12 natural language understanding and machine translation tasks further verify the effectiveness and the generalization capability of the proposed approach.
arXiv Detail & Related papers (2020-12-30T09:11:50Z)
- Infusing Finetuning with Semantic Dependencies [62.37697048781823]
We show that, unlike syntax, semantics is not brought to the surface by today's pretrained models.
We then use convolutional graph encoders to explicitly incorporate semantic parses into task-specific finetuning.
arXiv Detail & Related papers (2020-12-10T01:27:24Z)
- Unsupervised Distillation of Syntactic Information from Contextualized Word Representations [62.230491683411536]
We tackle the task of unsupervised disentanglement between semantics and structure in neural language representations.
To this end, we automatically generate groups of sentences which are structurally similar but semantically different.
We demonstrate that our transformation clusters vectors in space by structural properties, rather than by lexical semantics.
arXiv Detail & Related papers (2020-10-11T15:13:18Z)
- Analysing Lexical Semantic Change with Contextualised Word Representations [7.071298726856781]
We propose a novel method that exploits the BERT neural language model to obtain representations of word usages.
We create a new evaluation dataset and show that the model representations and the detected semantic shifts are positively correlated with human judgements.
arXiv Detail & Related papers (2020-04-29T12:18:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.