Spoken Language Understanding for Conversational AI: Recent Advances and
Future Direction
- URL: http://arxiv.org/abs/2212.10728v1
- Date: Wed, 21 Dec 2022 02:47:52 GMT
- Title: Spoken Language Understanding for Conversational AI: Recent Advances and
Future Direction
- Authors: Soyeon Caren Han, Siqu Long, Henry Weld, Josiah Poon
- Abstract summary: This tutorial will discuss how the joint task is set up and introduce Spoken Language Understanding/Natural Language Understanding (SLU/NLU) with Deep Learning techniques.
We will describe how the machine uses the latest NLP and Deep Learning techniques to address the joint task.
- Score: 5.829344935864271
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When a human communicates with a machine using natural language on the web,
how can the machine understand the human's intention and the semantic context of
what they say? This is an important AI task, as it enables the machine to construct
a sensible answer or perform a useful action for the human. Meaning is
represented at the sentence level, identification of which is known as intent
detection, and at the word level, a labelling task called slot filling. This
dual-level joint task requires innovative thinking about natural language and
deep learning network design, and as a result, many approaches and models have
been proposed and applied.
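To make the dual-level setup concrete, here is a small, hypothetical annotation example in Python: one sentence-level intent label plus one BIO-style slot tag per word. The utterance and label names are illustrative only (in the spirit of common benchmarks such as ATIS and SNIPS), not taken from the paper.

    # Hypothetical dual-level annotation: one intent for the whole utterance,
    # one BIO slot tag per word. Label names are illustrative placeholders.
    utterance = ["book", "a", "table", "for", "two", "in", "rome"]
    intent = "BookRestaurant"                                    # intent detection (sentence level)
    slots = ["O", "O", "O", "O", "B-party_size", "O", "B-city"]  # slot filling (word level)
    assert len(slots) == len(utterance)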
This tutorial will discuss how the joint task is set up and introduce Spoken
Language Understanding/Natural Language Understanding (SLU/NLU) with Deep
Learning techniques. We will cover the datasets, experiments and metrics used
in the field. We will describe how the machine uses the latest NLP and Deep
Learning techniques to address the joint task, including recurrent and
attention-based Transformer networks and pre-trained models (e.g. BERT). We
will then look in detail at a network that allows the two levels of the task,
intent classification and slot filling, to interact explicitly to boost
performance. We will give a code demonstration of this model in a Python
notebook, and attendees will have an opportunity to follow the coding demo on
this joint NLU task to further their understanding.
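As a rough illustration of the pre-trained-model approach mentioned above (not the tutorial's notebook), the sketch below attaches two heads to a shared BERT encoder: a sentence-level head for intent classification and a token-level head for slot filling. It assumes PyTorch and the Hugging Face transformers library; the model name and label counts are placeholders, and it omits any explicit intent-slot interaction mechanism.

    # Minimal sketch of a joint intent-detection / slot-filling model on BERT.
    # Hypothetical example with placeholder label counts; assumes PyTorch and
    # the Hugging Face `transformers` library.
    import torch.nn as nn
    from transformers import BertModel, BertTokenizerFast

    class JointBert(nn.Module):
        def __init__(self, num_intents, num_slots, model_name="bert-base-uncased"):
            super().__init__()
            self.bert = BertModel.from_pretrained(model_name)
            hidden = self.bert.config.hidden_size
            self.intent_head = nn.Linear(hidden, num_intents)  # sentence-level head
            self.slot_head = nn.Linear(hidden, num_slots)      # token-level head

        def forward(self, input_ids, attention_mask):
            out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
            intent_logits = self.intent_head(out.pooler_output)   # [CLS]-based sentence summary
            slot_logits = self.slot_head(out.last_hidden_state)   # one prediction per token
            return intent_logits, slot_logits

    tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
    batch = tokenizer(["book a table for two in rome"], return_tensors="pt")
    model = JointBert(num_intents=7, num_slots=21)  # placeholder label counts
    intent_logits, slot_logits = model(batch["input_ids"], batch["attention_mask"])

In practice the two heads would be trained jointly with a summed cross-entropy loss, and many of the models covered in the tutorial add an explicit interaction between the intent and slot representations rather than sharing only the encoder.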
Related papers
- ClawMachine: Fetching Visual Tokens as An Entity for Referring and Grounding [67.63933036920012]
Existing methods, including proxy encoding and geometry encoding, incorporate additional syntax to encode the object's location.
This study presents ClawMachine, offering a new methodology that notates an entity directly using the visual tokens.
ClawMachine unifies visual referring and grounding into an auto-regressive format and learns with a decoder-only architecture.
arXiv Detail & Related papers (2024-06-17T08:39:16Z)
- Object-Centric Instruction Augmentation for Robotic Manipulation [29.491990994901666]
We introduce the Object-Centric Instruction Augmentation (OCI) framework to augment highly semantic and information-dense language instructions with position cues.
We utilize a Multi-modal Large Language Model (MLLM) to weave knowledge of object locations into natural language instruction.
We demonstrate that robotic manipulator imitation policies trained with our enhanced instructions outperform those relying solely on traditional language instructions.
arXiv Detail & Related papers (2024-01-05T13:54:45Z)
- Robotic Skill Acquisition via Instruction Augmentation with Vision-Language Models [70.82705830137708]
We introduce Data-driven Instruction Augmentation for Language-conditioned control (DIAL)
We utilize semi-language labels, leveraging the semantic understanding of CLIP, to propagate knowledge onto large datasets of unlabelled demonstration data.
DIAL enables imitation learning policies to acquire new capabilities and generalize to 60 novel instructions unseen in the original dataset.
arXiv Detail & Related papers (2022-11-21T18:56:00Z)
- Do As I Can, Not As I Say: Grounding Language in Robotic Affordances [119.29555551279155]
Large language models can encode a wealth of semantic knowledge about the world.
Such knowledge could be extremely useful to robots aiming to act upon high-level, temporally extended instructions expressed in natural language.
We show how low-level skills can be combined with large language models so that the language model provides high-level knowledge about the procedures for performing complex and temporally-extended instructions.
arXiv Detail & Related papers (2022-04-04T17:57:11Z)
- Unified Multimodal Pre-training and Prompt-based Tuning for Vision-Language Understanding and Generation [86.26522210882699]
We propose Unified multimodal pre-training for both Vision-Language understanding and generation.
The proposed UniVL is capable of handling both understanding tasks and generative tasks.
Our experiments show that there is a trade-off between understanding tasks and generation tasks while using the same model.
arXiv Detail & Related papers (2021-12-10T14:59:06Z)
- Few-Shot Bot: Prompt-Based Learning for Dialogue Systems [58.27337673451943]
Learning to converse using only a few examples is a great challenge in conversational AI.
The current best conversational models are either good chit-chatters (e.g., BlenderBot) or goal-oriented systems (e.g., MinTL).
We propose prompt-based few-shot learning which does not require gradient-based fine-tuning but instead uses a few examples as the only source of learning.
arXiv Detail & Related papers (2021-10-15T14:36:45Z)
- Learning Language-Conditioned Robot Behavior from Offline Data and Crowd-Sourced Annotation [80.29069988090912]
We study the problem of learning a range of vision-based manipulation tasks from a large offline dataset of robot interaction.
We propose to leverage offline robot datasets with crowd-sourced natural language labels.
We find that our approach outperforms both goal-image specifications and language conditioned imitation techniques by more than 25%.
arXiv Detail & Related papers (2021-09-02T17:42:13Z)
- AttViz: Online exploration of self-attention for transparent neural language modeling [7.574392147428978]
We propose AttViz, an online toolkit for the exploration of self-attention, that is, the real values associated with individual text tokens.
We show how existing deep learning pipelines can produce outputs suitable for AttViz, offering novel visualizations of the attention heads and their aggregations with minimal effort, online.
arXiv Detail & Related papers (2020-05-12T12:21:40Z)
- From text saliency to linguistic objects: learning linguistic interpretable markers with a multi-channels convolutional architecture [2.064612766965483]
We propose a novel approach to inspect the hidden layers of a fitted CNN in order to extract interpretable linguistic objects from texts by exploiting the classification process.
We empirically demonstrate the efficiency of our approach on corpora from two different languages: English and French.
arXiv Detail & Related papers (2020-04-07T10:46:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.