On the proper role of linguistically-oriented deep net analysis in
linguistic theorizing
- URL: http://arxiv.org/abs/2106.08694v1
- Date: Wed, 16 Jun 2021 10:57:24 GMT
- Title: On the proper role of linguistically-oriented deep net analysis in
linguistic theorizing
- Authors: Marco Baroni
- Abstract summary: I suggest that deep networks should be treated as theories making explicit predictions about the acceptability of linguistic utterances.
I argue that, if we overcome some obstacles standing in the way of seriously pursuing this idea, we will gain a powerful new theoretical tool.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A lively research field has recently emerged that uses experimental methods
to probe the linguistic behavior of modern deep networks. While work in this
tradition often reports intriguing results about the grammatical skills of deep
nets, it is not clear what their implications for linguistic theorizing should
be. As a consequence, linguistically-oriented deep net analysis has had very
little impact on linguistics at large. In this chapter, I suggest that deep
networks should be treated as theories making explicit predictions about the
acceptability of linguistic utterances. I argue that, if we overcome some
obstacles standing in the way of seriously pursuing this idea, we will gain a
powerful new theoretical tool, complementary to mainstream algebraic
approaches.
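The chapter's central proposal, reading a trained network as a theory that issues explicit acceptability predictions, is commonly operationalized by comparing the probabilities a language model assigns to minimal pairs of sentences. A minimal sketch of that idea, using a toy add-one-smoothed bigram model as a stand-in for a real deep net (the corpus, smoothing scheme, and example sentences are all illustrative assumptions, not from the chapter):

```python
import math
from collections import Counter

# Toy corpus standing in for a language model's training data (assumption).
corpus = ["<s> the dog barks </s>",
          "<s> the cat sleeps </s>",
          "<s> a dog sleeps </s>"]

tokens = [t for line in corpus for t in line.split()]
bigrams = Counter(zip(tokens, tokens[1:]))
unigrams = Counter(tokens)
vocab = len(set(tokens))

def log_prob(sentence):
    """Add-one-smoothed bigram log-probability: the model's acceptability score."""
    words = ["<s>"] + sentence.split() + ["</s>"]
    return sum(math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab))
               for a, b in zip(words, words[1:]))

# Read as a theory, the model predicts the attested word order is more
# acceptable than a scrambled one:
print(log_prob("the dog barks") > log_prob("dog the barks"))  # True
```

With a real deep net, `log_prob` would be the network's summed token log-probabilities, but the logic of the comparison, two candidate utterances ranked by model score, is the same.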
Related papers
- Mechanistic Understanding and Mitigation of Language Confusion in English-Centric Large Language Models [49.09746599881631]
We present the first mechanistic interpretability study of language confusion. We show that confusion points (CPs) are central to this phenomenon. We show that editing a small set of critical neurons, identified via comparative analysis with multilingual-tuned models, substantially mitigates confusion.
arXiv Detail & Related papers (2025-05-22T11:29:17Z)
- Developmental Predictive Coding Model for Early Infancy Mono and Bilingual Vocal Continual Learning [69.8008228833895]
We propose a small-sized generative neural network equipped with a continual learning mechanism.
Our model prioritizes interpretability and demonstrates the advantages of online learning.
arXiv Detail & Related papers (2024-12-23T10:23:47Z)
- Predictive Simultaneous Interpretation: Harnessing Large Language Models for Democratizing Real-Time Multilingual Communication [0.0]
We present a novel algorithm that generates real-time translations by predicting speaker utterances and expanding multiple possibilities in a tree-like structure.
Our theoretical analysis, supported by illustrative examples, suggests that this approach could lead to more natural and fluent translations with minimal latency.
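The abstract describes predicting speaker utterances and expanding multiple possibilities in a tree-like structure to cut latency. One way to sketch that mechanism: propose several completions of the partial utterance, translate each speculative branch, and immediately emit only the translation prefix on which every branch agrees. The predictor, lexicon, and `commit_now` helper below are hypothetical stand-ins, not the paper's actual models:

```python
# Speculative tree expansion for simultaneous interpretation (sketch).
from os.path import commonprefix

def predict_completions(prefix_words):
    """Toy predictor: propose plausible completions of a partial utterance."""
    table = {
        ("ich", "gehe"): [["nach", "hause"], ["nach", "berlin"]],
    }
    return table.get(tuple(prefix_words), [[]])

def translate(words):
    """Toy word-by-word German-to-English translation (assumption)."""
    lex = {"ich": "i", "gehe": "go", "nach": "to",
           "hause": "home", "berlin": "berlin"}
    return [lex[w] for w in words]

def commit_now(prefix_words):
    """Translate every speculative branch; emit only what all branches share."""
    branches = [translate(prefix_words + completion)
                for completion in predict_completions(prefix_words)]
    # commonprefix compares element-wise, so it also works on token lists.
    return commonprefix(branches)

# Both predicted branches agree on "i go to", so it can be spoken at once,
# before the speaker has finished the sentence.
print(commit_now(["ich", "gehe"]))
```

The design point the sketch illustrates: agreement across predicted branches is what licenses early output, trading a wider speculative tree for lower latency.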
arXiv Detail & Related papers (2024-07-02T13:18:28Z)
- On the Robustness of Language Guidance for Low-Level Vision Tasks: Findings from Depth Estimation [71.72465617754553]
We generate "low-level" sentences that convey object-centric, three-dimensional spatial relationships, incorporate them as additional language priors and evaluate their downstream impact on depth estimation.
Our key finding is that current language-guided depth estimators perform optimally only with scene-level descriptions.
Despite leveraging additional data, these methods are not robust to directed adversarial attacks and decline in performance with an increase in distribution shift.
arXiv Detail & Related papers (2024-04-12T15:35:20Z)
- A Survey on Lexical Ambiguity Detection and Word Sense Disambiguation [0.0]
This paper explores techniques that focus on understanding and resolving ambiguity in language within the field of natural language processing (NLP).
It outlines diverse approaches ranging from deep learning techniques to leveraging lexical resources and knowledge graphs like WordNet.
The research identifies persistent challenges in the field, such as the scarcity of sense annotated corpora and the complexity of informal clinical texts.
arXiv Detail & Related papers (2024-03-24T12:58:48Z)
- Is it possible not to cheat on the Turing Test: Exploring the potential and challenges for true natural language 'understanding' by computers [0.0]
The area of natural language understanding in artificial intelligence claims to have been making great strides.
A comprehensive, interdisciplinary overview of current approaches and remaining challenges is yet to be carried out.
I bring these interdisciplinary perspectives together to unpack the challenges involved in reaching true (human-like) language understanding.
arXiv Detail & Related papers (2022-06-29T14:19:48Z)
- A Latent-Variable Model for Intrinsic Probing [93.62808331764072]
We propose a novel latent-variable formulation for constructing intrinsic probes.
We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
arXiv Detail & Related papers (2022-01-20T15:01:12Z)
- Annotation Uncertainty in the Context of Grammatical Change [0.05249805590164901]
This paper elaborates on the notion of uncertainty in the context of annotation in large text corpora.
By examining annotation uncertainty in more detail, we identify the sources and deepen our understanding of the nature and different types of uncertainty encountered in daily annotation practice.
This article can be seen as an attempt to reconcile the perspectives of the main scientific disciplines involved in corpus projects, linguistics and computer science.
arXiv Detail & Related papers (2021-05-15T17:45:29Z)
- Interpretable Deep Learning: Interpretations, Interpretability, Trustworthiness, and Beyond [49.93153180169685]
We introduce and clarify two basic concepts-interpretations and interpretability-that people usually get confused.
We elaborate on the design of several recent interpretation algorithms from different perspectives by proposing a new taxonomy.
We summarize the existing work in evaluating models' interpretability using "trustworthy" interpretation algorithms.
arXiv Detail & Related papers (2021-03-19T08:40:30Z)
- Optimism in the Face of Adversity: Understanding and Improving Deep Learning through Adversarial Robustness [63.627760598441796]
We provide an in-depth review of the field of adversarial robustness in deep learning.
We highlight the intuitive connection between adversarial examples and the geometry of deep neural networks.
We provide an overview of the main emerging applications of adversarial robustness beyond security.
arXiv Detail & Related papers (2020-10-19T16:03:46Z)
- A Chain Graph Interpretation of Real-World Neural Networks [58.78692706974121]
We propose an alternative interpretation that identifies NNs as chain graphs (CGs) and feed-forward as an approximate inference procedure.
The CG interpretation specifies the nature of each NN component within the rich theoretical framework of probabilistic graphical models.
We demonstrate with concrete examples that the CG interpretation can provide novel theoretical support and insights for various NN techniques.
arXiv Detail & Related papers (2020-06-30T14:46:08Z)
- Syntactic Structure from Deep Learning [40.84740599926241]
Modern deep neural networks achieve impressive performance in engineering applications that require extensive linguistic skills, such as machine translation.
This success has sparked interest in probing whether these models are inducing human-like grammatical knowledge from the raw data they are exposed to.
arXiv Detail & Related papers (2020-04-22T20:02:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.