Assessing the Limits of the Distributional Hypothesis in Semantic
Spaces: Trait-based Relational Knowledge and the Impact of Co-occurrences
- URL: http://arxiv.org/abs/2205.07603v1
- Date: Mon, 16 May 2022 12:09:40 GMT
- Title: Assessing the Limits of the Distributional Hypothesis in Semantic
Spaces: Trait-based Relational Knowledge and the Impact of Co-occurrences
- Authors: Mark Anderson and Jose Camacho-Collados
- Abstract summary: This work contributes to the relatively untrodden path of what is required in data for models to capture meaningful representations of natural language.
This entails evaluating how well English and Spanish semantic spaces capture a particular type of relational knowledge.
- Score: 6.994580267603235
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The increase in performance in NLP due to the prevalence of distributional
models and deep learning has brought with it a reciprocal decrease in
interpretability. This has spurred a focus on what neural networks learn about
natural language with less of a focus on how. Some work has focused on the data
used to develop data-driven models, but typically this line of work aims to
highlight issues with the data, e.g. highlighting and offsetting harmful
biases. This work contributes to the relatively untrodden path of what is
required in data for models to capture meaningful representations of natural
language. This entails evaluating how well English and Spanish semantic spaces
capture a particular type of relational knowledge, namely the traits associated
with concepts (e.g. bananas-yellow), and exploring the role of co-occurrences
in this context.
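To make the kind of evaluation described above concrete, here is a minimal sketch (not the authors' actual protocol) of probing a semantic space for a concept-trait pair such as bananas-yellow using cosine similarity; the tiny hand-made vectors and the trait_score helper are hypothetical stand-ins for a real pretrained embedding space.

```python
# Minimal sketch (assumption: not the paper's method) of checking whether a
# semantic space encodes trait-based relational knowledge, e.g. whether
# "banana" sits closer to its trait "yellow" than to an unrelated trait.
# The 4-d vectors below are hypothetical; a real space would be ~300-d.
import numpy as np

embeddings = {
    "banana": np.array([0.9, 0.1, 0.3, 0.0]),
    "yellow": np.array([0.8, 0.2, 0.1, 0.1]),
    "purple": np.array([0.1, 0.9, 0.2, 0.3]),
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def trait_score(concept, trait, distractor):
    """True if the concept is closer to its trait than to a distractor trait."""
    return cosine(embeddings[concept], embeddings[trait]) > \
           cosine(embeddings[concept], embeddings[distractor])

if __name__ == "__main__":
    print(trait_score("banana", "yellow", "purple"))  # expected: True
```

In practice one would load pretrained English or Spanish embeddings in place of the toy dictionary and aggregate such comparisons over a set of concept-trait pairs.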
Related papers
- Aggregation Artifacts in Subjective Tasks Collapse Large Language Models' Posteriors [74.04775677110179]
In-context Learning (ICL) has become the primary method for performing natural language tasks with Large Language Models (LLMs).
In this work, we examine whether this is the result of the aggregation used in corresponding datasets, where trying to combine low-agreement, disparate annotations might lead to annotation artifacts that create detrimental noise in the prompt.
Our results indicate that aggregation is a confounding factor in the modeling of subjective tasks, and advocate focusing on modeling individuals instead.
arXiv Detail & Related papers (2024-10-17T17:16:00Z) - Stubborn Lexical Bias in Data and Models [50.79738900885665]
We use a new statistical method to examine whether spurious patterns in data appear in models trained on the data.
We apply an optimization approach to reweight the training data, reducing thousands of spurious correlations.
Surprisingly, though this method can successfully reduce lexical biases in the training data, we still find strong evidence of corresponding bias in the trained models.
arXiv Detail & Related papers (2023-06-03T20:12:27Z) - Contextualization and Generalization in Entity and Relation Extraction [0.0]
We study the behaviour of state-of-the-art models regarding generalization to facts unseen during training.
Traditional benchmarks present important lexical overlap between mentions and relations used for training and evaluating models.
We propose empirical studies to separate performance based on mention and relation overlap with the training set.
arXiv Detail & Related papers (2022-06-15T14:16:42Z) - Leveraging Relational Information for Learning Weakly Disentangled
Representations [11.460692362624533]
Disentanglement is a difficult property to enforce in neural representations.
We present an alternative view over learning (weakly) disentangled representations.
arXiv Detail & Related papers (2022-05-20T09:58:51Z) - Sorting through the noise: Testing robustness of information processing
in pre-trained language models [5.371816551086117]
This paper examines the robustness of models' ability to deploy relevant context information in the face of distracting content.
We find that although models appear, in simple contexts, to make predictions by understanding and applying relevant facts from prior context, the presence of distracting but irrelevant content clearly confuses their predictions.
arXiv Detail & Related papers (2021-09-25T16:02:23Z) - Competency Problems: On Finding and Removing Artifacts in Language Data [50.09608320112584]
We argue that for complex language understanding tasks, all simple feature correlations are spurious.
We theoretically analyze the difficulty of creating data for competency problems when human bias is taken into account.
arXiv Detail & Related papers (2021-04-17T21:34:10Z) - Refining Neural Networks with Compositional Explanations [31.84868477264624]
We propose to refine a learned model by collecting human-provided compositional explanations on the models' failure cases.
We demonstrate the effectiveness of the proposed approach on two text classification tasks.
arXiv Detail & Related papers (2021-03-18T17:48:54Z) - Generative Counterfactuals for Neural Networks via Attribute-Informed
Perturbation [51.29486247405601]
We design a framework to generate counterfactuals for raw data instances with the proposed Attribute-Informed Perturbation (AIP).
By utilizing generative models conditioned with different attributes, counterfactuals with desired labels can be obtained effectively and efficiently.
Experimental results on real-world texts and images demonstrate the effectiveness, sample quality as well as efficiency of our designed framework.
arXiv Detail & Related papers (2021-01-18T08:37:13Z) - CausaLM: Causal Model Explanation Through Counterfactual Language Models [33.29636213961804]
CausaLM is a framework for producing causal model explanations using counterfactual language representation models.
We show that language representation models such as BERT can effectively learn a counterfactual representation for a given concept of interest.
A byproduct of our method is a language representation model that is unaffected by the tested concept.
arXiv Detail & Related papers (2020-05-27T15:06:35Z) - Explaining Black Box Predictions and Unveiling Data Artifacts through
Influence Functions [55.660255727031725]
Influence functions explain the decisions of a model by identifying influential training examples.
We conduct a comparison between influence functions and common word-saliency methods on representative tasks.
We develop a new measure based on influence functions that can reveal artifacts in training data.
arXiv Detail & Related papers (2020-05-14T00:45:23Z) - Neural Data-to-Text Generation via Jointly Learning the Segmentation and
Correspondence [48.765579605145454]
We propose to explicitly segment target text into fragment units and align them with their data correspondences.
The resulting architecture maintains the same expressive power as neural attention models.
On both E2E and WebNLG benchmarks, we show the proposed model consistently outperforms its neural attention counterparts.
arXiv Detail & Related papers (2020-05-03T14:28:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.