Latent Topology Induction for Understanding Contextualized
Representations
- URL: http://arxiv.org/abs/2206.01512v1
- Date: Fri, 3 Jun 2022 11:22:48 GMT
- Title: Latent Topology Induction for Understanding Contextualized
Representations
- Authors: Yao Fu and Mirella Lapata
- Abstract summary: We study the representation space of contextualized embeddings and gain insight into the hidden topology of large language models.
We show there exists a network of latent states that summarize linguistic properties of contextualized representations.
- Score: 84.7918739062235
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we study the representation space of contextualized embeddings
and gain insight into the hidden topology of large language models. We show
there exists a network of latent states that summarize linguistic properties of
contextualized representations. Instead of seeking alignments to existing
well-defined annotations, we infer this latent network in a fully unsupervised
way using a structured variational autoencoder. The induced states not only
serve as anchors that mark the topology (neighbors and connectivity) of the
representation manifold but also reveal the internal mechanism of encoding
sentences. With the induced network, we (1) decompose the representation
space into a spectrum of latent states which encode fine-grained word meanings
with lexical, morphological, syntactic and semantic information; and (2) show that
state-state transitions encode rich phrase constructions and serve as the
backbones of the latent space. Putting the two together, we show that sentences
are represented as a traversal over the latent network where state-state
transition chains encode syntactic templates and state-word emissions fill in
the content. We demonstrate these insights with extensive experiments and
visualizations.
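The traversal picture in the abstract (state-state transitions supply the template, state-word emissions fill in the content) can be illustrated with a toy example. The sketch below is a minimal HMM-style stand-in, not the structured variational autoencoder used in the paper: the state names, vocabulary, and probabilities are hand-picked assumptions, whereas the paper induces its states without supervision. It scores a state path for a sentence and recovers the best path with Viterbi decoding.

```python
import math

# Hypothetical toy latent network. The state names and all probabilities
# are illustrative assumptions; none of them come from the paper.
STATES = ["DET", "NOUN", "VERB"]
START = {"DET": 0.8, "NOUN": 0.1, "VERB": 0.1}
TRANS = {  # state-state transitions: the "syntactic template"
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.1, "VERB": 0.8},
    "VERB": {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1},
}
EMIT = {  # state-word emissions: the "content"
    "DET": {"the": 0.9, "dog": 0.05, "barks": 0.05},
    "NOUN": {"the": 0.05, "dog": 0.9, "barks": 0.05},
    "VERB": {"the": 0.05, "dog": 0.05, "barks": 0.9},
}


def path_log_score(states, words):
    """Log-probability of emitting `words` along the given state path."""
    score = math.log(START[states[0]]) + math.log(EMIT[states[0]][words[0]])
    for prev, cur, word in zip(states, states[1:], words[1:]):
        score += math.log(TRANS[prev][cur]) + math.log(EMIT[cur][word])
    return score


def viterbi(words):
    """Return (log-score, path) of the highest-scoring state traversal."""
    # best[s] holds the best (score, path) among paths ending in state s.
    best = {
        s: (math.log(START[s]) + math.log(EMIT[s][words[0]]), [s])
        for s in STATES
    }
    for word in words[1:]:
        new_best = {}
        for cur in STATES:
            score, path = max(
                (best[prev][0] + math.log(TRANS[prev][cur]), best[prev][1])
                for prev in STATES
            )
            new_best[cur] = (score + math.log(EMIT[cur][word]), path + [cur])
        best = new_best
    return max(best.values())


if __name__ == "__main__":
    sentence = ["the", "dog", "barks"]
    score, path = viterbi(sentence)
    print(path, round(score, 3))
```

In this toy setting the transition table alone determines which state sequences are plausible, and the emission table determines which words realise each state, which mirrors the template-plus-content reading of the abstract under the stated assumptions.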
Related papers
- Disentangling Dense Embeddings with Sparse Autoencoders [0.0]
Sparse autoencoders (SAEs) have shown promise in extracting interpretable features from complex neural networks.
We present one of the first applications of SAEs to dense text embeddings from large language models.
We show that the resulting sparse representations maintain semantic fidelity while offering interpretability.
arXiv Detail & Related papers (2024-08-01T15:46:22Z)
- Spatial Semantic Recurrent Mining for Referring Image Segmentation [63.34997546393106]
We propose S²RM to achieve high-quality cross-modality fusion.
It follows a three-part strategy: distributing language features, spatial semantic recurrent coparsing, and parsed-semantic balancing.
Our proposed method performs favorably against other state-of-the-art algorithms.
arXiv Detail & Related papers (2024-05-15T00:17:48Z)
- Unifying Latent and Lexicon Representations for Effective Video-Text
Retrieval [87.69394953339238]
We propose the UNIFY framework, which learns lexicon representations to capture fine-grained semantics in video-text retrieval.
We show our framework largely outperforms previous video-text retrieval methods, with 4.8% and 8.2% Recall@1 improvement on MSR-VTT and DiDeMo respectively.
arXiv Detail & Related papers (2024-02-26T17:36:50Z)
- TeKo: Text-Rich Graph Neural Networks with External Knowledge [75.91477450060808]
We propose a novel text-rich graph neural network with external knowledge (TeKo).
We first present a flexible heterogeneous semantic network that incorporates high-quality entities.
We then introduce two types of external knowledge, that is, structured triplets and unstructured entity description.
arXiv Detail & Related papers (2022-06-15T02:33:10Z)
- Multilingual Extraction and Categorization of Lexical Collocations with
Graph-aware Transformers [86.64972552583941]
We put forward a sequence tagging BERT-based model enhanced with a graph-aware transformer architecture, which we evaluate on the task of collocation recognition in context.
Our results suggest that explicitly encoding syntactic dependencies in the model architecture is helpful, and provide insights on differences in collocation typification in English, Spanish and French.
arXiv Detail & Related papers (2022-05-23T16:47:37Z)
- Semantic Representation and Inference for NLP [2.969705152497174]
This thesis investigates the use of deep learning for novel semantic representation and inference.
We contribute the largest publicly available dataset of real-life factual claims for the purpose of automatic claim verification.
We operationalize the compositionality of a phrase contextually by enriching the phrase representation with external word embeddings and knowledge graphs.
arXiv Detail & Related papers (2021-06-15T13:22:48Z)
- A Self-supervised Representation Learning of Sentence Structure for
Authorship Attribution [3.5991811164452923]
We propose a self-supervised framework for learning structural representations of sentences.
We evaluate the learned structural representations of sentences using different probing tasks, and subsequently utilize them in the authorship attribution task.
arXiv Detail & Related papers (2020-10-14T02:57:10Z)
- Unsupervised Distillation of Syntactic Information from Contextualized
Word Representations [62.230491683411536]
We tackle the task of unsupervised disentanglement between semantics and structure in neural language representations.
To this end, we automatically generate groups of sentences which are structurally similar but semantically different.
We demonstrate that our transformation clusters vectors in space by structural properties, rather than by lexical semantics.
arXiv Detail & Related papers (2020-10-11T15:13:18Z)
- Semantic Holism and Word Representations in Artificial Neural Networks [0.0]
We show that word representations from the Skip-gram variant of the word2vec model exhibit interesting semantic properties.
This is usually explained by referring to the general distributional hypothesis.
We propose a more specific approach based on Frege's holistic and functional approach to meaning.
arXiv Detail & Related papers (2020-03-11T21:04:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.