Visualizing and Explaining Language Models
- URL: http://arxiv.org/abs/2205.10238v1
- Date: Sat, 30 Apr 2022 17:23:33 GMT
- Title: Visualizing and Explaining Language Models
- Authors: Adrian M.P. Brașoveanu, Răzvan Andonie
- Abstract summary: Natural Language Processing has become, after Computer Vision, the second field of Artificial Intelligence massively changed by the advent of Deep Learning.
This paper showcases the techniques used in some of the most popular Deep Learning for NLP visualizations, with a special focus on interpretability and explainability.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: During the last decade, Natural Language Processing has become, after
Computer Vision, the second field of Artificial Intelligence that was massively
changed by the advent of Deep Learning. Regardless of the architecture, the
language models of the day need to be able to process or generate text, as well
as predict missing words, sentences or relations depending on the task. Due to
their black-box nature, such models are difficult to interpret and explain to
third parties. Visualization is often the bridge that language model designers
use to explain their work, since the coloring of salient words and phrases,
clustering, or neuron activations can be used to quickly understand the
underlying models. This paper showcases the techniques used in some of the most
popular Deep Learning for NLP visualizations, with a special focus on
interpretability and explainability.
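As a concrete illustration of the saliency coloring mentioned in the abstract, the sketch below scores each input token by gradient-times-input saliency for a masked-word prediction and renders the scores as a crude text heat map. It assumes PyTorch and the Hugging Face transformers library; the model (bert-base-uncased), the example sentence, and the particular scoring choice are illustrative assumptions, not the paper's own method.

```python
# Hedged sketch: gradient-x-input token saliency for a masked-word prediction.
# Assumes PyTorch + transformers; bert-base-uncased is an arbitrary example model.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForMaskedLM.from_pretrained(name).eval()

text = f"Visualization helps researchers {tokenizer.mask_token} language models."
enc = tokenizer(text, return_tensors="pt")

# Embed tokens manually so gradients can flow back to the input embeddings.
embeds = model.get_input_embeddings()(enc["input_ids"]).detach().requires_grad_(True)
logits = model(inputs_embeds=embeds, attention_mask=enc["attention_mask"]).logits

mask_pos = (enc["input_ids"] == tokenizer.mask_token_id).nonzero()[0, 1]
top_logit, top_id = logits[0, mask_pos].max(dim=-1)
top_logit.backward()  # gradient of the top prediction w.r.t. every input embedding

# Gradient x input, summed over the embedding dimension, gives one saliency score per token.
saliency = (embeds.grad * embeds).sum(-1).abs()[0]
saliency = saliency / saliency.max()

tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
print("predicted word:", tokenizer.convert_ids_to_tokens([top_id.item()])[0])
for tok, score in zip(tokens, saliency.tolist()):
    print(f"{tok:>15s}  {'#' * int(10 * score)}")  # text-based stand-in for color intensity
```

In a notebook, the same per-token scores would typically drive HTML background colors rather than the bar of '#' characters used here.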
Related papers
- Using Multimodal Deep Neural Networks to Disentangle Language from Visual Aesthetics [8.749640179057469]
We use linear decoding over the learned representations of unimodal vision, unimodal language, and multimodal deep neural network (DNN) models to predict human beauty ratings of naturalistic images; a minimal sketch of such a linear readout appears after this list.
We show that unimodal vision models (e.g. SimCLR) account for the vast majority of explainable variance in these ratings. Language-aligned vision models (e.g. SLIP) yield small gains relative to unimodal vision.
Taken together, these results suggest that whatever words we may eventually find to describe our experience of beauty, the ineffable computations of feedforward perception may provide sufficient foundation for that experience.
arXiv Detail & Related papers (2024-10-31T03:37:21Z) - Natural Language Counterfactual Explanations for Graphs Using Large Language Models [7.560731917128082]
We exploit the power of open-source Large Language Models to generate natural language explanations.
We show that our approach effectively produces accurate natural language representations of counterfactual instances.
arXiv Detail & Related papers (2024-10-11T23:06:07Z) - Visual Grounding Helps Learn Word Meanings in Low-Data Regimes [47.7950860342515]
Modern neural language models (LMs) are powerful tools for modeling human sentence production and comprehension.
But to achieve these results, LMs must be trained in distinctly un-human-like ways.
Do models trained more naturalistically -- with grounded supervision -- exhibit more humanlike language learning?
We investigate this question in the context of word learning, a key sub-task in language acquisition.
arXiv Detail & Related papers (2023-10-20T03:33:36Z) - Foundational Models Defining a New Era in Vision: A Survey and Outlook [151.49434496615427]
Vision systems that see and reason about the compositional nature of visual scenes are fundamental to understanding our world.
Models learned to bridge the gap between such modalities, coupled with large-scale training data, facilitate contextual reasoning, generalization, and prompting capabilities at test time.
The output of such models can be modified through human-provided prompts without retraining, e.g., segmenting a particular object by providing a bounding box, having interactive dialogues by asking questions about an image or video scene, or manipulating the robot's behavior through language instructions.
arXiv Detail & Related papers (2023-07-25T17:59:18Z) - SINC: Self-Supervised In-Context Learning for Vision-Language Tasks [64.44336003123102]
We propose a framework to enable in-context learning in large language models.
A meta-model can learn on self-supervised prompts consisting of tailored demonstrations.
Experiments show that SINC outperforms gradient-based methods in various vision-language tasks.
arXiv Detail & Related papers (2023-07-15T08:33:08Z) - Visually-Situated Natural Language Understanding with Contrastive Reading Model and Frozen Large Language Models [24.456117679941816]
Contrastive Reading Model (Cream) is a novel neural architecture designed to enhance the language-image understanding capability of Large Language Models (LLMs).
Our approach bridges the gap between vision and language understanding, paving the way for the development of more sophisticated Document Intelligence Assistants.
arXiv Detail & Related papers (2023-05-24T11:59:13Z) - Training language models for deeper understanding improves brain alignment [5.678337324555035]
Building systems that achieve a deeper understanding of language is one of the central goals of natural language processing (NLP).
We show that training language models for deeper narrative understanding results in richer representations that have improved alignment to human brain activity.
arXiv Detail & Related papers (2022-12-21T10:15:19Z) - Localization vs. Semantics: Visual Representations in Unimodal and Multimodal Models [57.08925810659545]
We conduct a comparative analysis of the visual representations in existing vision-and-language models and vision-only models.
Our empirical observations suggest that vision-and-language models are better at label prediction tasks.
We hope our study sheds light on the role of language in visual learning, and serves as an empirical guide for various pretrained models.
arXiv Detail & Related papers (2022-12-01T05:00:18Z) - Transparency Helps Reveal When Language Models Learn Meaning [71.96920839263457]
Our systematic experiments with synthetic data reveal that, with languages where all expressions have context-independent denotations, both autoregressive and masked language models learn to emulate semantic relations between expressions.
Turning to natural language, our experiments with a specific phenomenon -- referential opacity -- add to the growing body of evidence that current language models do not represent natural language semantics well.
arXiv Detail & Related papers (2022-10-14T02:35:19Z) - Imagination-Augmented Natural Language Understanding [71.51687221130925]
We introduce an Imagination-Augmented Cross-modal Encoder (iACE) to solve natural language understanding tasks.
iACE enables visual imagination with external knowledge transferred from the powerful generative and pre-trained vision-and-language models.
Experiments on GLUE and SWAG show that iACE achieves consistent improvement over visually-supervised pre-trained models.
arXiv Detail & Related papers (2022-04-18T19:39:36Z)
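To make the linear decoding mentioned in the first related paper above concrete, here is a minimal sketch, assuming scikit-learn and NumPy, of fitting a regularized linear readout from frozen DNN image features to per-image beauty ratings. The feature matrix and ratings are random stand-ins for embeddings such as SimCLR or SLIP outputs; the paper's actual data and pipeline are not reproduced.

```python
# Hedged sketch of linear decoding: ridge regression from frozen DNN features
# to human beauty ratings, scored by cross-validated explained variance (R^2).
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_images, feat_dim = 500, 512                       # hypothetical dataset size / embedding width
features = rng.normal(size=(n_images, feat_dim))    # stand-in for frozen DNN image features
ratings = rng.normal(size=n_images)                 # stand-in for mean human beauty ratings

decoder = RidgeCV(alphas=np.logspace(-3, 3, 13))    # linear readout with tuned L2 penalty
r2_scores = cross_val_score(decoder, features, ratings, cv=5, scoring="r2")
print(f"cross-validated R^2: {r2_scores.mean():.3f}")  # variance in ratings explained by the features
```

Comparing this cross-validated R^2 across feature sets (unimodal vision, unimodal language, multimodal) is the kind of analysis the summary describes.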
This list is automatically generated from the titles and abstracts of the papers in this site.