Unifying Corroborative and Contributive Attributions in Large Language
Models
- URL: http://arxiv.org/abs/2311.12233v1
- Date: Mon, 20 Nov 2023 23:17:20 GMT
- Title: Unifying Corroborative and Contributive Attributions in Large Language
Models
- Authors: Theodora Worledge, Judy Hanwen Shen, Nicole Meister, Caleb Winston,
Carlos Guestrin
- Abstract summary: We argue for and present a unified framework of large language model attributions.
We show how existing methods of different types of attribution fall under the unified framework.
We believe that this unified framework will guide the use-case-driven development of systems that leverage both types of attribution, as well as the standardization of their evaluation.
- Score: 11.376900121298517
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As businesses, products, and services spring up around large language models,
the trustworthiness of these models hinges on the verifiability of their
outputs. However, methods for explaining language model outputs largely fall
across two distinct fields of study which both use the term "attribution" to
refer to entirely separate techniques: citation generation and training data
attribution. In many modern applications, such as legal document generation and
medical question answering, both types of attributions are important. In this
work, we argue for and present a unified framework of large language model
attributions. We show how existing methods of different types of attribution
fall under the unified framework. We also use the framework to discuss
real-world use cases where one or both types of attributions are required. We
believe that this unified framework will guide the use-case-driven development
of systems that leverage both types of attribution, as well as the
standardization of their evaluation.
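To make the two notions concrete, below is a minimal Python sketch of what a unified attribution interface could look like. The names (AttributionSet, corroborate, contribute) and the lexical-overlap scoring are illustrative assumptions, not the paper's method; a real system would use retrieval-backed citation generation and influence-style training data attribution.

```python
# A minimal sketch of a unified attribution interface over a toy setup.
# Names and the overlap-based scoring are illustrative, not from the paper.
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class AttributionSet:
    """Both kinds of attribution for one model output."""
    output: str
    corroborative: List[str] = field(default_factory=list)               # sources supporting the claim
    contributive: List[Tuple[str, float]] = field(default_factory=list)  # (training example, influence score)


def corroborate(output: str, corpus: List[str]) -> List[str]:
    # Stand-in for citation generation: retrieve sources sharing words with the output.
    out_words = set(output.lower().split())
    return [doc for doc in corpus if out_words & set(doc.lower().split())]


def contribute(output: str, training_data: List[str]) -> List[Tuple[str, float]]:
    # Stand-in for training data attribution: score examples by crude lexical overlap
    # (a real system would use influence functions or similar estimators).
    out_words = set(output.lower().split())
    scores = [(ex, len(out_words & set(ex.lower().split())) / max(len(out_words), 1))
              for ex in training_data]
    return sorted(scores, key=lambda s: s[1], reverse=True)


def attribute(output: str, corpus: List[str], training_data: List[str]) -> AttributionSet:
    return AttributionSet(output,
                          corroborative=corroborate(output, corpus),
                          contributive=contribute(output, training_data))


if __name__ == "__main__":
    result = attribute("Aspirin can reduce the risk of heart attack",
                       corpus=["Clinical guidance: aspirin reduces heart attack risk in some patients."],
                       training_data=["aspirin heart attack prevention study", "weather report for Tuesday"])
    print(result.corroborative)
    print(result.contributive[:1])
```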
Related papers
- Context versus Prior Knowledge in Language Models [49.17879668110546]
Language models often need to integrate prior knowledge learned during pretraining and new information presented in context.
We propose two mutual information-based metrics to measure a model's dependency on a context and on its prior about an entity.
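As a rough illustration of what mutual-information-based dependency metrics can look like (the notation below is an assumption, not the paper's exact definitions):

```latex
% Schematic context- and prior-dependency metrics (illustrative notation only).
% A = the model's answer distribution, C = the provided context, E = the queried entity.
\[
  \mathrm{dep}_{\text{context}} = I(A; C), \qquad
  \mathrm{dep}_{\text{prior}} = I(A; E),
\]
\[
  \text{where } I(X; Y) = \sum_{x,y} p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)}.
\]
```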
arXiv Detail & Related papers (2024-04-06T13:46:53Z) - Language Models for Text Classification: Is In-Context Learning Enough? [54.869097980761595]
Recent foundational language models have shown state-of-the-art performance in many NLP tasks in zero- and few-shot settings.
An advantage of these models over more standard approaches is their ability to understand instructions written in natural language (prompts).
This makes them suitable for addressing text classification problems for domains with limited amounts of annotated instances.
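A minimal sketch of such an instruction prompt for zero-shot classification; the wording and label set are made up for illustration:

```python
# Sketch: building a natural-language instruction (prompt) for zero-shot text classification.
LABELS = ["positive", "negative", "neutral"]

def build_prompt(text: str) -> str:
    return (
        "Classify the sentiment of the following review as one of: "
        + ", ".join(LABELS) + ".\n\n"
        f"Review: {text}\n"
        "Sentiment:"
    )

print(build_prompt("The battery lasts all day and the screen is gorgeous."))
# The model's completion (e.g. "positive") is taken as the predicted label.
```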
arXiv Detail & Related papers (2024-03-26T12:47:39Z) - Multilingual Conceptual Coverage in Text-to-Image Models [98.80343331645626]
"Conceptual Coverage Across Languages" (CoCo-CroLa) is a technique for benchmarking the degree to which any generative text-to-image system provides multilingual parity to its training language in terms of tangible nouns.
For each model we can assess "conceptual coverage" of a given target language relative to a source language by comparing the population of images generated for a series of tangible nouns in the source language to the population of images generated for each noun under translation in the target language.
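A hedged sketch of one way such a population-level comparison could be computed; the actual CoCo-CroLa scoring may differ, and the embeddings below are random stand-ins for real image features:

```python
# Sketch: compare two populations of generated-image embeddings (source-language noun vs.
# its translation) via the cosine similarity of their mean vectors.
import numpy as np

def population_similarity(src_embeddings: np.ndarray, tgt_embeddings: np.ndarray) -> float:
    src_mean = src_embeddings.mean(axis=0)
    tgt_mean = tgt_embeddings.mean(axis=0)
    return float(src_mean @ tgt_mean / (np.linalg.norm(src_mean) * np.linalg.norm(tgt_mean)))

rng = np.random.default_rng(0)
images_for_dog = rng.normal(size=(20, 512))    # embeddings of images generated for "dog"
images_for_hund = rng.normal(size=(20, 512))   # embeddings of images generated for "Hund"
print(population_similarity(images_for_dog, images_for_hund))
```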
arXiv Detail & Related papers (2023-06-02T17:59:09Z) - Topics in Contextualised Attention Embeddings [7.6650522284905565]
Recent work has demonstrated that clustering the word-level contextual representations from a language model yields word clusters resembling the latent topics discovered by Latent Dirichlet Allocation.
The key question is how such topical word clusters emerge in a language model that has not been explicitly designed to model latent topics.
Using BERT and DistilBERT, we find that the attention framework plays a key role in modelling such word topic clusters.
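A minimal sketch of the general recipe (clustering word-level contextual embeddings into topic-like groups), assuming the Hugging Face transformers and scikit-learn packages; it is not the paper's exact experimental setup:

```python
# Sketch: cluster word-level contextual embeddings from DistilBERT into topic-like groups.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.cluster import KMeans

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModel.from_pretrained("distilbert-base-uncased")

sentences = [
    "The central bank raised interest rates again.",
    "The striker scored twice in the final match.",
]

tokens, vectors = [], []
for sentence in sentences:
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]          # (seq_len, hidden_dim)
    for tok, vec in zip(tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist()), hidden):
        if tok not in ("[CLS]", "[SEP]"):
            tokens.append(tok)
            vectors.append(vec.numpy())

# Group the contextual word vectors; each cluster behaves like a topic of related words.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)
for tok, c in zip(tokens, clusters):
    print(c, tok)
```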
arXiv Detail & Related papers (2023-01-11T07:26:19Z) - Few-Shot Fine-Grained Entity Typing with Automatic Label Interpretation
and Instance Generation [36.541309948222306]
We study the problem of few-shot Fine-grained Entity Typing (FET), where only a few annotated entity mentions with contexts are given for each entity type.
We propose a novel framework for few-shot FET consisting of two modules: (1) an entity type label interpretation module automatically learns to relate type labels to the vocabulary by jointly leveraging few-shot instances and the label hierarchy, and (2) a type-based contextualized instance generator produces new instances based on given instances to enlarge the training set for better generalization.
arXiv Detail & Related papers (2022-06-28T04:05:40Z) - Language Models are General-Purpose Interfaces [109.45478241369655]
We propose to use language models as a general-purpose interface to various foundation models.
A collection of pretrained encoders perceives diverse modalities (such as vision and language).
We propose a semi-causal language modeling objective to jointly pretrain the interface and the modular encoders.
arXiv Detail & Related papers (2022-06-13T17:34:22Z) - Lex Rosetta: Transfer of Predictive Models Across Languages,
Jurisdictions, and Legal Domains [40.58709137006848]
We analyze the use of Language-Agnostic Sentence Representations in sequence labeling models using Gated Recurrent Units (GRUs) that are transferable across languages.
We found that models generalize beyond the contexts on which they were trained.
We found that training the models on multiple contexts increases robustness and improves overall performance when evaluating on previously unseen contexts.
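A toy PyTorch sketch of the general architecture, a GRU sequence labeler over precomputed language-agnostic sentence embeddings; the dimensions, label set, and random inputs are assumptions, not the paper's setup:

```python
# Sketch: a GRU sequence labeler over precomputed (language-agnostic) sentence embeddings.
import torch
import torch.nn as nn

EMB_DIM, HIDDEN, NUM_LABELS = 512, 128, 4   # e.g. labels for rhetorical roles of sentences


class SentenceGRUTagger(nn.Module):
    def __init__(self):
        super().__init__()
        self.gru = nn.GRU(EMB_DIM, HIDDEN, batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * HIDDEN, NUM_LABELS)

    def forward(self, sentence_embeddings):          # (batch, num_sentences, EMB_DIM)
        hidden, _ = self.gru(sentence_embeddings)    # (batch, num_sentences, 2*HIDDEN)
        return self.classifier(hidden)               # per-sentence label logits


# One "document" of 10 sentences, each already encoded by a multilingual sentence encoder.
doc = torch.randn(1, 10, EMB_DIM)
logits = SentenceGRUTagger()(doc)
print(logits.shape)   # torch.Size([1, 10, 4])
```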
arXiv Detail & Related papers (2021-12-15T04:53:13Z) - Bilingual Topic Models for Comparable Corpora [9.509416095106491]
We propose a binding mechanism between the distributions of the paired documents.
To estimate the similarity of documents that are written in different languages we use cross-lingual word embeddings that are learned with shallow neural networks.
We evaluate the proposed binding mechanism by extending two topic models: a bilingual adaptation of LDA that assumes bag-of-words inputs and a model that incorporates part of the text structure in the form of boundaries of semantically coherent segments.
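A minimal sketch of the cross-lingual document-similarity step described above, assuming the word embeddings are already aligned across languages; the tiny random embedding table is a stand-in for illustration (real cross-lingual embeddings would place translations nearby):

```python
# Sketch: cross-lingual document similarity from aligned cross-lingual word embeddings,
# by averaging each document's word vectors and taking cosine similarity.
import numpy as np

rng = np.random.default_rng(1)
EMB = {w: rng.normal(size=100) for w in ["economy", "wirtschaft", "growth", "wachstum", "bank"]}

def doc_vector(words):
    vecs = [EMB[w] for w in words if w in EMB]
    return np.mean(vecs, axis=0)

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

en_doc = ["economy", "growth", "bank"]
de_doc = ["wirtschaft", "wachstum", "bank"]
print(cosine(doc_vector(en_doc), doc_vector(de_doc)))
```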
arXiv Detail & Related papers (2021-11-30T10:53:41Z) - Interpretable Entity Representations through Large-Scale Typing [61.4277527871572]
We present an approach to creating entity representations that are human readable and achieve high performance out of the box.
Our representations are vectors whose values correspond to posterior probabilities over fine-grained entity types.
We show that it is possible to reduce the size of our type set in a learning-based way for particular domains.
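A hedged sketch of the core idea, an entity vector whose coordinates are posterior probabilities over fine-grained types; the tiny type inventory and linear scorer are made up for illustration:

```python
# Sketch: an entity representation whose coordinates are probabilities over fine-grained types.
import torch
import torch.nn as nn

TYPES = ["person", "politician", "athlete", "organization", "location", "city"]
MENTION_DIM = 256


class TypeProbabilityEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.scorer = nn.Linear(MENTION_DIM, len(TYPES))  # one score per fine-grained type

    def forward(self, mention_embedding):
        # Each coordinate is an independent posterior probability that the type applies,
        # so the resulting vector is directly human readable.
        return torch.sigmoid(self.scorer(mention_embedding))


mention = torch.randn(MENTION_DIM)                 # stand-in for an encoded entity mention
entity_vector = TypeProbabilityEncoder()(mention)
for t, p in zip(TYPES, entity_vector.tolist()):
    print(f"{t}: {p:.2f}")
```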
arXiv Detail & Related papers (2020-04-30T23:58:03Z) - How Far are We from Effective Context Modeling? An Exploratory Study on
Semantic Parsing in Context [59.13515950353125]
We present a grammar-based decoding semantic parser and adapt typical context modeling methods on top of it.
We evaluate 13 context modeling methods on two large cross-domain datasets, and our best model achieves state-of-the-art performances.
arXiv Detail & Related papers (2020-02-03T11:28:10Z)