Goal Driven Discovery of Distributional Differences via Language Descriptions
- URL: http://arxiv.org/abs/2302.14233v2
- Date: Wed, 25 Oct 2023 00:43:04 GMT
- Title: Goal Driven Discovery of Distributional Differences via Language Descriptions
- Authors: Ruiqi Zhong, Peter Zhang, Steve Li, Jinwoo Ahn, Dan Klein, Jacob Steinhardt
- Abstract summary: Mining large corpora can generate useful discoveries but is time-consuming for humans.
We formulate a new task, D5, that automatically discovers differences between two large corpora in a goal-driven way.
Our system produces discoveries previously unknown to the authors on a wide range of applications in OpenD5.
- Score: 58.764821647036946
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Mining large corpora can generate useful discoveries but is time-consuming
for humans. We formulate a new task, D5, that automatically discovers
differences between two large corpora in a goal-driven way. The task input is a
problem comprising a research goal "$\textit{comparing the side effects of drug
A and drug B}$" and a corpus pair (two large collections of patients'
self-reported reactions after taking each drug). The output is a language
description (discovery) of how these corpora differ (patients taking drug A
"$\textit{mention feelings of paranoia}$" more often). We build a D5 system,
and to quantitatively measure its performance, we 1) contribute a meta-dataset,
OpenD5, aggregating 675 open-ended problems ranging across business, social
sciences, humanities, machine learning, and health, and 2) propose a set of
unified evaluation metrics: validity, relevance, novelty, and significance.
With the dataset and the unified metrics, we confirm that language models can
use the goals to propose more relevant, novel, and significant candidate
discoveries. Finally, our system produces discoveries previously unknown to the
authors on a wide range of applications in OpenD5, including temporal and
demographic differences in discussion topics, political stances and stereotypes
in speech, insights in commercial reviews, and error patterns in NLP models.
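To make the task input/output concrete, here is a minimal sketch of a D5 problem and a validity check. All names (`D5Problem`, `validity_score`) are hypothetical illustrations, not the authors' actual API, and the keyword judge merely stands in for the LLM-based validity judgment the paper uses.

```python
from dataclasses import dataclass

@dataclass
class D5Problem:
    goal: str            # research goal, e.g. "compare the side effects of drug A and drug B"
    corpus_a: list[str]  # samples from the first corpus
    corpus_b: list[str]  # samples from the second corpus

def validity_score(problem: D5Problem, matches) -> float:
    """Estimate how much more often a candidate discovery holds on
    corpus A than on corpus B; `matches(text)` is a judge returning
    True when the text fits the candidate description."""
    frac_a = sum(map(matches, problem.corpus_a)) / len(problem.corpus_a)
    frac_b = sum(map(matches, problem.corpus_b)) / len(problem.corpus_b)
    return frac_a - frac_b

problem = D5Problem(
    goal="compare the side effects of drug A and drug B",
    corpus_a=["I felt paranoid all week", "mild headache", "paranoia again"],
    corpus_b=["slept well", "mild headache", "no issues"],
)
# Toy judge: a keyword match stands in for an LLM validity judgment.
score = validity_score(problem, lambda t: "parano" in t)
print(round(score, 2))
```

A large positive score would support a discovery like "patients taking drug A mention feelings of paranoia more often"; the remaining metrics (relevance, novelty, significance) are judged against the stated goal rather than computed from corpus frequencies.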
Related papers
- Adaptive Question Answering: Enhancing Language Model Proficiency for Addressing Knowledge Conflicts with Source Citations [3.3018718917393297]
We propose the novel task of Question Answering with source citation in ambiguous settings, where multiple valid answers exist.
We create a comprehensive framework consisting of: (1) five novel datasets; (2) the first ambiguous multi-hop QA dataset featuring real-world, naturally occurring contexts; and (3) two new metrics to evaluate models' performances.
We hope that this new task, datasets, metrics, and baselines will inspire the community to push the boundaries of QA research and develop more trustworthy and interpretable systems.
arXiv Detail & Related papers (2024-10-05T17:37:01Z)
- Exploring the Potential of Large Foundation Models for Open-Vocabulary HOI Detection [9.788417605537965]
We introduce a novel end-to-end open vocabulary HOI detection framework with conditional multi-level decoding and fine-grained semantic enhancement.
Our proposed method achieves state-of-the-art results in open vocabulary HOI detection.
arXiv Detail & Related papers (2024-04-09T10:27:22Z)
- SADAS: A Dialogue Assistant System Towards Remediating Norm Violations in Bilingual Socio-Cultural Conversations [56.31816995795216]
Socially-Aware Dialogue Assistant System (SADAS) is designed to ensure that conversations unfold with respect and understanding.
Our system's novel architecture includes: (1) identifying the categories of norms present in the dialogue, (2) detecting potential norm violations, (3) evaluating the severity of these violations, and (4) implementing targeted remedies to rectify the breaches.
arXiv Detail & Related papers (2024-01-29T08:54:21Z)
- Large Language Models are Fixated by Red Herrings: Exploring Creative Problem Solving and Einstellung Effect using the Only Connect Wall Dataset [4.789429120223149]
The quest for human imitative AI has been an enduring topic in AI research since its inception.
Creative problem solving in humans is a well-studied topic in cognitive neuroscience.
Only Connect Wall segment essentially mimics Mednick's Remote Associates Test (RAT) formulation with built-in, deliberate red herrings.
arXiv Detail & Related papers (2023-06-19T21:14:57Z)
- DERA: Enhancing Large Language Model Completions with Dialog-Enabled Resolving Agents [5.562984399879218]
Large language models (LLMs) have emerged as valuable tools for many natural language understanding tasks.
In this work, we present dialog-enabled resolving agents (DERA), a paradigm made possible by the increased conversational abilities of LLMs, namely GPT-4.
It provides a simple, interpretable forum for models to communicate feedback and iteratively improve output.
arXiv Detail & Related papers (2023-03-30T00:30:19Z)
- RuMedBench: A Russian Medical Language Understanding Benchmark [58.99199480170909]
The paper describes the open Russian medical language understanding benchmark covering several task types.
We prepare the unified format labeling, data split, and evaluation metrics for new tasks.
A single-number metric expresses a model's ability to cope with the benchmark.
arXiv Detail & Related papers (2022-01-17T16:23:33Z)
- AES Systems Are Both Overstable And Oversensitive: Explaining Why And Proposing Defenses [66.49753193098356]
We investigate the reason behind the surprising adversarial brittleness of scoring models.
Our results indicate that autoscoring models, despite getting trained as "end-to-end" models, behave like bag-of-words models.
We propose detection-based protection models that detect oversensitivity- and overstability-causing samples with high accuracy.
arXiv Detail & Related papers (2021-09-24T03:49:38Z)
- AM2iCo: Evaluating Word Meaning in Context across Low-Resource Languages with Adversarial Examples [51.048234591165155]
We present AM2iCo, Adversarial and Multilingual Meaning in Context.
It aims to faithfully assess the ability of state-of-the-art (SotA) representation models to understand the identity of word meaning in cross-lingual contexts.
Results reveal that current SotA pretrained encoders substantially lag behind human performance.
arXiv Detail & Related papers (2021-04-17T20:23:45Z)
- The Paradigm Discovery Problem [121.79963594279893]
We formalize the paradigm discovery problem and develop metrics for judging systems.
We report empirical results on five diverse languages.
Our code and data are available for public use.
arXiv Detail & Related papers (2020-05-04T16:38:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.