Demystifying Embedding Spaces using Large Language Models
- URL: http://arxiv.org/abs/2310.04475v2
- Date: Wed, 13 Mar 2024 17:40:04 GMT
- Title: Demystifying Embedding Spaces using Large Language Models
- Authors: Guy Tennenholtz, Yinlam Chow, Chih-Wei Hsu, Jihwan Jeong, Lior Shani,
Azamat Tulepbergenov, Deepak Ramachandran, Martin Mladenov, Craig Boutilier
- Abstract summary: This paper addresses the challenge of making embeddings more interpretable and broadly useful.
By employing Large Language Models (LLMs) to directly interact with embeddings, we transform abstract vectors into understandable narratives.
We demonstrate our approach on a variety of tasks, including enhancing concept activation vectors (CAVs), communicating novel embedded entities, and decoding user preferences in recommender systems.
- Score: 26.91321899603332
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Embeddings have become a pivotal means to represent complex, multi-faceted
information about entities, concepts, and relationships in a condensed and
useful format. Nevertheless, they often preclude direct interpretation. While
downstream tasks make use of these compressed representations, meaningful
interpretation usually requires visualization using dimensionality reduction or
specialized machine learning interpretability methods. This paper addresses the
challenge of making such embeddings more interpretable and broadly useful, by
employing Large Language Models (LLMs) to directly interact with embeddings --
transforming abstract vectors into understandable narratives. By injecting
embeddings into LLMs, we enable querying and exploration of complex embedding
data. We demonstrate our approach on a variety of diverse tasks, including:
enhancing concept activation vectors (CAVs), communicating novel embedded
entities, and decoding user preferences in recommender systems. Our work
couples the immense information potential of embeddings with the interpretative
power of LLMs.
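As a concrete illustration of the embedding-injection idea described in the abstract, the sketch below shows one plausible way to map a domain embedding (e.g., an item vector or a CAV) into an LLM's token-embedding space with a small learned adapter and prepend the result to a text prompt as soft tokens. The EmbeddingAdapter module, its dimensions, and the number of soft tokens are illustrative assumptions rather than the paper's exact architecture.

```python
import torch
import torch.nn as nn


class EmbeddingAdapter(nn.Module):
    """Hypothetical adapter: maps a domain embedding to k soft prompt tokens."""

    def __init__(self, domain_dim: int, llm_dim: int, num_soft_tokens: int = 4):
        super().__init__()
        self.num_soft_tokens = num_soft_tokens
        self.proj = nn.Sequential(
            nn.Linear(domain_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim * num_soft_tokens),
        )

    def forward(self, domain_emb: torch.Tensor) -> torch.Tensor:
        # (batch, domain_dim) -> (batch, num_soft_tokens, llm_dim)
        batch = domain_emb.shape[0]
        return self.proj(domain_emb).view(batch, self.num_soft_tokens, -1)


def inject(domain_emb, prompt_token_embs, adapter):
    """Prepend adapted soft tokens to the token embeddings of a text query.

    The concatenated sequence would then be fed to a (typically frozen) LLM,
    which decodes a natural-language description of, or answer about, the vector.
    """
    soft_tokens = adapter(domain_emb)                          # (B, k, llm_dim)
    return torch.cat([soft_tokens, prompt_token_embs], dim=1)  # (B, k + T, llm_dim)


if __name__ == "__main__":
    adapter = EmbeddingAdapter(domain_dim=64, llm_dim=768)
    item_emb = torch.randn(1, 64)          # stand-in for a learned item/CAV embedding
    prompt_embs = torch.randn(1, 12, 768)  # stand-in for tokenized-query embeddings
    print(inject(item_emb, prompt_embs, adapter).shape)  # torch.Size([1, 16, 768])
```

In a setup like this, one would typically train only the adapter on paired (embedding, text) supervision while keeping the LLM frozen; the paper's actual training procedure may differ.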
Related papers
- RA-BLIP: Multimodal Adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training [55.54020926284334]
Multimodal Large Language Models (MLLMs) have recently received substantial interest, reflecting their emerging potential as general-purpose models for various vision-language tasks.
Retrieval augmentation techniques have proven to be effective plugins for both LLMs and MLLMs.
In this study, we propose multimodal adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training (RA-BLIP), a novel retrieval-augmented framework for various MLLMs.
arXiv Detail & Related papers (2024-10-18T03:45:19Z)
- Scalable Representation Learning for Multimodal Tabular Transactions [14.18267117657451]
We present a scalable approach to representation learning over multimodal tabular transactions.
We propose a parameter-efficient decoder that interleaves transaction and text modalities.
We validate the efficacy of our solution on a large-scale dataset of synthetic payment transactions.
arXiv Detail & Related papers (2024-10-10T12:18:42Z)
- EAGLE: Towards Efficient Arbitrary Referring Visual Prompts Comprehension for Multimodal Large Language Models [80.00303150568696]
We propose a novel Multimodal Large Language Model (MLLM) that enables comprehension of arbitrary referring visual prompts with less training effort than existing approaches.
Our approach embeds referring visual prompts as spatial concepts that convey specific spatial areas in a form comprehensible to the MLLM.
We also propose a Geometry-Agnostic Learning paradigm (GAL) to further disentangle the MLLM's region-level comprehension from the specific formats of referring visual prompts.
arXiv Detail & Related papers (2024-09-25T08:22:00Z)
- Disentangling Dense Embeddings with Sparse Autoencoders [0.0]
Sparse autoencoders (SAEs) have shown promise in extracting interpretable features from complex neural networks.
We present one of the first applications of SAEs to dense text embeddings from large language models.
We show that the resulting sparse representations maintain semantic fidelity while offering interpretability (see the sketch after this list).
arXiv Detail & Related papers (2024-08-01T15:46:22Z)
- Rethinking Visual Prompting for Multimodal Large Language Models with External Knowledge [76.45868419402265]
Multimodal large language models (MLLMs) have made significant strides by training on vast, high-quality image-text datasets.
However, the inherent difficulty in explicitly conveying fine-grained or spatially dense information in text, such as masks, poses a challenge for MLLMs.
This paper proposes a new visual prompt approach to integrate fine-grained external knowledge, gleaned from specialized vision models, into MLLMs.
arXiv Detail & Related papers (2024-07-05T17:43:30Z)
- RelationVLM: Making Large Vision-Language Models Understand Visual Relations [66.70252936043688]
We present RelationVLM, a large vision-language model capable of comprehending various levels and types of relations, whether across multiple images or within a video.
Specifically, we devise a multi-stage relation-aware training scheme and a series of corresponding data configuration strategies to endow RelationVLM with the capability to understand such semantic relations.
arXiv Detail & Related papers (2024-03-19T15:01:19Z)
- Rethinking Interpretability in the Era of Large Language Models [76.1947554386879]
Large language models (LLMs) have demonstrated remarkable capabilities across a wide array of tasks.
The capability to explain in natural language allows LLMs to expand the scale and complexity of patterns that can be explained to a human.
These new capabilities raise new challenges, such as hallucinated explanations and immense computational costs.
arXiv Detail & Related papers (2024-01-30T17:38:54Z)
- Sparsity-Guided Holistic Explanation for LLMs with Interpretable Inference-Time Intervention [53.896974148579346]
Large Language Models (LLMs) have achieved unprecedented breakthroughs in various natural language processing domains.
The enigmatic "black-box" nature of LLMs remains a significant challenge for interpretability, hampering transparent and accountable applications.
We propose a novel methodology anchored in sparsity-guided techniques, aiming to provide a holistic interpretation of LLMs.
arXiv Detail & Related papers (2023-12-22T19:55:58Z)
- IERL: Interpretable Ensemble Representation Learning -- Combining Crowdsourced Knowledge and Distributed Semantic Representations [11.008412414253662]
Large Language Models (LLMs) encode meanings of words in the form of distributed semantics.
Recent studies have shown that LLMs tend to generate unintended, inconsistent, or incorrect text as output.
We propose a novel ensemble learning method, Interpretable Ensemble Representation Learning (IERL), that systematically combines LLM and crowdsourced knowledge representations.
arXiv Detail & Related papers (2023-06-24T05:02:34Z)
- Relate to Predict: Towards Task-Independent Knowledge Representations for Reinforcement Learning [11.245432408899092]
Reinforcement learning can enable agents to learn complex tasks.
However, the knowledge these agents acquire is difficult to interpret and reuse across tasks.
In this paper, we introduce an inductive bias for explicit object-centered knowledge separation.
We show that the degree of explicitness in knowledge separation correlates with faster learning, better accuracy, better generalization, and better interpretability.
arXiv Detail & Related papers (2022-12-10T13:33:56Z)
- LMMS Reloaded: Transformer-based Sense Embeddings for Disambiguation and Beyond [2.9005223064604078]
Recent Transformer-based Language Models have proven capable of producing contextual word representations that reliably convey sense-specific information.
We introduce a more principled approach to leveraging information from all layers of neural language models (NLMs), informed by a probing analysis of 14 NLM variants.
We also emphasize the versatility of these sense embeddings in contrast to task-specific models, applying them to several sense-related tasks besides word sense disambiguation (WSD).
arXiv Detail & Related papers (2021-05-26T10:14:22Z)
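As referenced in the "Disentangling Dense Embeddings with Sparse Autoencoders" entry above, below is a minimal sketch of a sparse autoencoder over dense text embeddings. The layer sizes, the L1 sparsity penalty, and the training step are illustrative assumptions, not that paper's exact setup; they only show the general shape of the technique.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SparseAutoencoder(nn.Module):
    """Hypothetical SAE: an overcomplete dictionary of non-negative features."""

    def __init__(self, emb_dim: int, dict_size: int):
        super().__init__()
        self.encoder = nn.Linear(emb_dim, dict_size)
        self.decoder = nn.Linear(dict_size, emb_dim)

    def forward(self, x: torch.Tensor):
        codes = torch.relu(self.encoder(x))  # sparse, non-negative activations
        recon = self.decoder(codes)          # reconstruction of the dense embedding
        return recon, codes


def train_step(sae, optimizer, batch, l1_weight: float = 1e-3) -> float:
    """One step: reconstruction loss plus an L1 penalty encouraging sparsity."""
    recon, codes = sae(batch)
    loss = F.mse_loss(recon, batch) + l1_weight * codes.abs().mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    sae = SparseAutoencoder(emb_dim=768, dict_size=8192)
    opt = torch.optim.Adam(sae.parameters(), lr=1e-4)
    batch = torch.randn(32, 768)  # stand-in for real dense text embeddings
    print(train_step(sae, opt, batch))
```

After training, each dictionary direction can be inspected (e.g., by the texts that most strongly activate it), which is the interpretability angle that connects this line of work to the embedding-narration approach of the main paper.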