A Linguistic Investigation of Machine Learning based Contradiction
Detection Models: An Empirical Analysis and Future Perspectives
- URL: http://arxiv.org/abs/2210.10434v1
- Date: Wed, 19 Oct 2022 10:06:03 GMT
- Title: A Linguistic Investigation of Machine Learning based Contradiction
Detection Models: An Empirical Analysis and Future Perspectives
- Authors: Maren Pielka, Felix Rode, Lisa Pucknat, Tobias Deu{\ss}er, Rafet Sifa
- Abstract summary: We analyze two Natural Language Inference data sets with respect to their linguistic features.
The goal is to identify those syntactic and semantic properties that are particularly hard to comprehend for a machine learning model.
- Score: 0.34998703934432673
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We analyze two Natural Language Inference data sets with respect to their
linguistic features. The goal is to identify those syntactic and semantic
properties that are particularly hard to comprehend for a machine learning
model. To this end, we also investigate the differences between a
crowd-sourced, machine-translated data set (SNLI) and a collection of text
pairs from internet sources. Our main findings are, that the model has
difficulty recognizing the semantic importance of prepositions and verbs,
emphasizing the importance of linguistically aware pre-training tasks.
Furthermore, it often does not comprehend antonyms and homonyms, especially if
those are depending on the context. Incomplete sentences are another problem,
as well as longer paragraphs and rare words or phrases. The study shows that
automated language understanding requires a more informed approach, utilizing
as much external knowledge as possible throughout the training process.
Related papers
- Exploring syntactic information in sentence embeddings through multilingual subject-verb agreement [1.4335183427838039]
We take the approach of developing curated synthetic data on a large scale, with specific properties.
We use a new multiple-choice task and datasets, Blackbird Language Matrices, to focus on a specific grammatical structural phenomenon.
We show that despite having been trained on multilingual texts in a consistent manner, multilingual pretrained language models have language-specific differences.
arXiv Detail & Related papers (2024-09-10T14:58:55Z) - Reframing linguistic bootstrapping as joint inference using visually-grounded grammar induction models [31.006803764376475]
Semantic and syntactic bootstrapping posit that children use their prior knowledge of one linguistic domain, say syntactic relations, to help later acquire another, such as the meanings of new words.
Here, we argue that they are instead both contingent on a more general learning strategy for language acquisition: joint learning.
Using a series of neural visually-grounded grammar induction models, we demonstrate that both syntactic and semantic bootstrapping effects are strongest when syntax and semantics are learnt simultaneously.
arXiv Detail & Related papers (2024-06-17T18:01:06Z) - Subspace Chronicles: How Linguistic Information Emerges, Shifts and
Interacts during Language Model Training [56.74440457571821]
We analyze tasks covering syntax, semantics and reasoning, across 2M pre-training steps and five seeds.
We identify critical learning phases across tasks and time, during which subspaces emerge, share information, and later disentangle to specialize.
Our findings have implications for model interpretability, multi-task learning, and learning from limited data.
arXiv Detail & Related papers (2023-10-25T09:09:55Z) - An Empirical Revisiting of Linguistic Knowledge Fusion in Language
Understanding Tasks [33.765874588342285]
Infusing language models with syntactic or semantic knowledge from structural linguistic priors has shown improvements on many language understanding tasks.
We conduct empirical study of replacing parsed graphs or trees with trivial ones for tasks in the GLUE benchmark.
It reveals that the gains might not be significantly attributed to explicit linguistic priors but rather to more feature interactions brought by fusion layers.
arXiv Detail & Related papers (2022-10-24T07:47:32Z) - Transparency Helps Reveal When Language Models Learn Meaning [71.96920839263457]
Our systematic experiments with synthetic data reveal that, with languages where all expressions have context-independent denotations, both autoregressive and masked language models learn to emulate semantic relations between expressions.
Turning to natural language, our experiments with a specific phenomenon -- referential opacity -- add to the growing body of evidence that current language models do not well-represent natural language semantics.
arXiv Detail & Related papers (2022-10-14T02:35:19Z) - Unravelling Interlanguage Facts via Explainable Machine Learning [10.71581852108984]
We focus on the internals of an NLI classifier trained by an emphexplainable machine learning algorithm.
We use this perspective in order to tackle both NLI and a companion task, guessing whether a text has been written by a native or a non-native speaker.
We investigate which kind of linguistic traits are most effective for solving our two tasks, namely, are most indicative of a speaker's L1.
arXiv Detail & Related papers (2022-08-02T14:05:15Z) - A Latent-Variable Model for Intrinsic Probing [93.62808331764072]
We propose a novel latent-variable formulation for constructing intrinsic probes.
We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
arXiv Detail & Related papers (2022-01-20T15:01:12Z) - Understanding Synonymous Referring Expressions via Contrastive Features [105.36814858748285]
We develop an end-to-end trainable framework to learn contrastive features on the image and object instance levels.
We conduct extensive experiments to evaluate the proposed algorithm on several benchmark datasets.
arXiv Detail & Related papers (2021-04-20T17:56:24Z) - Understanding and Enhancing the Use of Context for Machine Translation [2.367786892039871]
This thesis focuses on understanding certain potentials of contexts in neural models and design augmentation models to benefit from them.
To translate from a source language to a target language, a neural model has to understand the meaning of constituents in the provided context.
Looking more in-depth into the role of context and the impact of data on learning models is essential to advance the NLP field.
arXiv Detail & Related papers (2021-02-20T20:19:27Z) - Bridging Linguistic Typology and Multilingual Machine Translation with
Multi-View Language Representations [83.27475281544868]
We use singular vector canonical correlation analysis to study what kind of information is induced from each source.
We observe that our representations embed typology and strengthen correlations with language relationships.
We then take advantage of our multi-view language vector space for multilingual machine translation, where we achieve competitive overall translation accuracy.
arXiv Detail & Related papers (2020-04-30T16:25:39Z) - Information-Theoretic Probing for Linguistic Structure [74.04862204427944]
We propose an information-theoretic operationalization of probing as estimating mutual information.
We evaluate on a set of ten typologically diverse languages often underrepresented in NLP research.
arXiv Detail & Related papers (2020-04-07T01:06:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.