Related papers: Dealing with Semantic Underspecification in Multimodal NLP

Dealing with Semantic Underspecification in Multimodal NLP

URL: http://arxiv.org/abs/2306.05240v1
Date: Thu, 8 Jun 2023 14:39:24 GMT
Title: Dealing with Semantic Underspecification in Multimodal NLP
Authors: Sandro Pezzelle
Abstract summary: Intelligent systems that aim at mastering language as humans do must deal with its semantic underspecification. Standard NLP models have, in principle, no or limited access to such extra information. multimodal systems grounding language into other modalities, such as vision, are naturally equipped to account for this phenomenon.
Score: 3.5846770619764423
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Intelligent systems that aim at mastering language as humans do must deal with its semantic underspecification, namely, the possibility for a linguistic signal to convey only part of the information needed for communication to succeed. Consider the usages of the pronoun they, which can leave the gender and number of its referent(s) underspecified. Semantic underspecification is not a bug but a crucial language feature that boosts its storage and processing efficiency. Indeed, human speakers can quickly and effortlessly integrate semantically-underspecified linguistic signals with a wide range of non-linguistic information, e.g., the multimodal context, social or cultural conventions, and shared knowledge. Standard NLP models have, in principle, no or limited access to such extra information, while multimodal systems grounding language into other modalities, such as vision, are naturally equipped to account for this phenomenon. However, we show that they struggle with it, which could negatively affect their performance and lead to harmful consequences when used for applications. In this position paper, we argue that our community should be aware of semantic underspecification if it aims to develop language technology that can successfully interact with human users. We discuss some applications where mastering it is crucial and outline a few directions toward achieving this goal.

Related papers

Real-Time Multilingual Sign Language Processing [4.626189039960495]
Sign Language Processing (SLP) is an interdisciplinary field comprised of Natural Language Processing (NLP) and Computer Vision. Traditional approaches have often been constrained by the use of gloss-based systems that are both language-specific and inadequate for capturing the multidimensional nature of sign language. We propose the use of SignWiring, a universal sign language transcription notation system, to serve as an intermediary link between the visual-gestural modality of signed languages and text-based linguistic representations.
arXiv Detail & Related papers (2024-12-02T21:51:41Z)
Lens: Rethinking Multilingual Enhancement for Large Language Models [70.85065197789639]
Lens is a novel approach to enhance multilingual capabilities of large language models (LLMs) It operates by manipulating the hidden representations within the language-agnostic and language-specific subspaces from top layers of LLMs. It achieves superior results with much fewer computational resources compared to existing post-training approaches.
arXiv Detail & Related papers (2024-10-06T08:51:30Z)
Natural Language Processing RELIES on Linguistics [13.142686158720021]
We argue the acronym RELIES that encapsulates six major facets where linguistics contributes to NLP. This list is not exhaustive, nor is linguistics the main point of reference for every effort under these themes.
arXiv Detail & Related papers (2024-05-09T17:59:32Z)
A Taxonomy of Ambiguity Types for NLP [53.10379645698917]
We propose a taxonomy of ambiguity types as seen in English to facilitate NLP analysis. Our taxonomy can help make meaningful splits in language ambiguity data, allowing for more fine-grained assessments of both datasets and model performance.
arXiv Detail & Related papers (2024-03-21T01:47:22Z)
Can Linguistic Knowledge Improve Multimodal Alignment in Vision-Language Pretraining? [34.609984453754656]
We aim to elucidate the impact of comprehensive linguistic knowledge, including semantic expression and syntactic structure, on multimodal alignment. Specifically, we design and release the SNARE, the first large-scale multimodal alignment probing benchmark.
arXiv Detail & Related papers (2023-08-24T16:17:40Z)
Uncertainty in Natural Language Generation: From Theory to Applications [42.55924708592451]
We argue that a principled treatment of uncertainty can assist in creating systems and evaluation protocols better aligned with these goals. We first present the fundamental theory, frameworks and vocabulary required to represent uncertainty. We then propose a two-dimensional taxonomy that is more informative and faithful than the popular aleatoric/epistemic dichotomy.
arXiv Detail & Related papers (2023-07-28T17:51:21Z)
Reasoning over the Air: A Reasoning-based Implicit Semantic-Aware Communication Framework [124.6509194665514]
A novel implicit semantic-aware communication (iSAC) architecture is proposed for representing, communicating, and interpreting the implicit semantic meaning between source and destination users. A projection-based semantic encoder is proposed to convert the high-dimensional graphical representation of explicit semantics into a low-dimensional semantic constellation space for efficient physical channel transmission. A generative adversarial imitation learning-based solution, called G-RML, is proposed to enable the destination user to learn and imitate the implicit semantic reasoning process of source user.
arXiv Detail & Related papers (2023-06-20T01:32:27Z)
Imitation Learning-based Implicit Semantic-aware Communication Networks: Multi-layer Representation and Collaborative Reasoning [68.63380306259742]
Despite its promising potential, semantic communications and semantic-aware networking are still at their infancy. We propose a novel reasoning-based implicit semantic-aware communication network architecture that allows multiple tiers of CDC and edge servers to collaborate. We introduce a new multi-layer representation of semantic information taking into consideration both the hierarchical structure of implicit semantics as well as the personalized inference preference of individual users.
arXiv Detail & Related papers (2022-10-28T13:26:08Z)
On the cross-lingual transferability of multilingual prototypical models across NLU tasks [2.44288434255221]
Supervised deep learning-based approaches have been applied to task-oriented dialog and have proven to be effective for limited domain and language applications. In practice, these approaches suffer from the drawbacks of domain-driven design and under-resourced languages. This article proposes to investigate the cross-lingual transferability of using synergistically few-shot learning with prototypical neural networks and multilingual Transformers-based models.
arXiv Detail & Related papers (2022-07-19T09:55:04Z)
Towards Zero-shot Language Modeling [90.80124496312274]
We construct a neural model that is inductively biased towards learning human languages. We infer this distribution from a sample of typologically diverse training languages. We harness additional language-specific side information as distant supervision for held-out languages.
arXiv Detail & Related papers (2021-08-06T23:49:18Z)
AM2iCo: Evaluating Word Meaning in Context across Low-ResourceLanguages with Adversarial Examples [51.048234591165155]
We present AM2iCo, Adversarial and Multilingual Meaning in Context. It aims to faithfully assess the ability of state-of-the-art (SotA) representation models to understand the identity of word meaning in cross-lingual contexts. Results reveal that current SotA pretrained encoders substantially lag behind human performance.
arXiv Detail & Related papers (2021-04-17T20:23:45Z)

This list is automatically generated from the titles and abstracts of the papers in this site.