Curriculum Learning for Goal-Oriented Semantic Communications with a
Common Language
- URL: http://arxiv.org/abs/2204.10429v1
- Date: Thu, 21 Apr 2022 22:36:06 GMT
- Title: Curriculum Learning for Goal-Oriented Semantic Communications with a
Common Language
- Authors: Mohammad Karimzadeh Farshbafan, Walid Saad, and Merouane Debbah
- Abstract summary: A holistic goal-oriented semantic communication framework is proposed to enable a speaker and a listener to cooperatively execute a set of sequential tasks.
A common language based on a hierarchical belief set is proposed to enable semantic communications between speaker and listener.
An optimization problem is defined to determine the perfect and abstract description of the events.
- Score: 60.85719227557608
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Goal-oriented semantic communication will be a pillar of next-generation
wireless networks. Despite significant recent efforts in this area, most prior
works are focused on specific data types (e.g., image or audio), and they
ignore the goal and effectiveness aspects of semantic transmissions. In
contrast, in this paper, a holistic goal-oriented semantic communication
framework is proposed to enable a speaker and a listener to cooperatively
execute a set of sequential tasks in a dynamic environment. A common language
based on a hierarchical belief set is proposed to enable semantic
communications between speaker and listener. The speaker, acting as an observer
of the environment, utilizes the beliefs to transmit an initial description of
its observation (called event) to the listener. The listener is then able to
infer on the transmitted description and complete it by adding related beliefs
to the transmitted beliefs of the speaker. As such, the listener reconstructs
the observed event based on the completed description, and it then takes
appropriate action in the environment based on the reconstructed event. An
optimization problem is defined to determine the perfect and abstract
description of the events while minimizing the transmission and inference costs
with constraints on the task execution time and belief efficiency. Then, a
novel bottom-up curriculum learning (CL) framework based on reinforcement
learning is proposed to solve the optimization problem and enable the speaker
and listener to gradually identify the structure of the belief set and the
perfect and abstract description of the events. Simulation results show that
the proposed CL method outperforms traditional RL in terms of convergence time,
task execution cost and time, reliability, and belief efficiency.
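The speaker/listener exchange described in the abstract can be illustrated with a toy sketch. This is not the paper's method: the paper learns the belief structure with curriculum learning and RL, while the code below assumes the hierarchy is already known and uses a simple greedy pruning instead. The belief names, the `IMPLIES` relation, and both function names are hypothetical and chosen only for illustration.

```python
# Toy illustration only: belief names and the IMPLIES hierarchy are
# hypothetical, and greedy pruning stands in for the paper's learned policy.
IMPLIES = {
    "fire_in_kitchen": {"fire", "in_kitchen"},
    "fire": {"hazard"},
    "in_kitchen": {"indoors"},
}


def complete(description):
    """Listener side: add every belief reachable from the transmitted ones."""
    completed = set(description)
    stack = list(description)
    while stack:
        belief = stack.pop()
        for implied in IMPLIES.get(belief, ()):
            if implied not in completed:
                completed.add(implied)
                stack.append(implied)
    return completed


def abstract_description(event):
    """Speaker side: greedily drop beliefs the listener can infer from the rest."""
    description = set(event)
    for belief in sorted(event):
        rest = description - {belief}
        if complete(rest) == complete(description):
            description = rest
    return description


# The observed event is the full set of beliefs that hold in the environment.
event = {"fire_in_kitchen", "fire", "in_kitchen", "hazard", "indoors"}
sent = abstract_description(event)   # what the speaker transmits
reconstructed = complete(sent)       # what the listener recovers
print(sent, reconstructed == event)
```

In this toy case the speaker only needs to transmit the single most specific belief, and the listener recovers the full event by inference, which is the trade-off between transmission cost and inference cost that the abstract refers to.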
Related papers
- Predictive Speech Recognition and End-of-Utterance Detection Towards Spoken Dialog Systems [55.99999020778169]
We study a function that can predict the forthcoming words and estimate the time remaining until the end of an utterance.
We develop a cross-attention-based algorithm that incorporates both acoustic and linguistic information.
Results demonstrate the proposed model's ability to predict upcoming words and estimate future EOU events up to 300ms prior to the actual EOU.
arXiv Detail & Related papers (2024-09-30T06:29:58Z)
- Integrating Self-supervised Speech Model with Pseudo Word-level Targets
from Visually-grounded Speech Model [57.78191634042409]
We propose Pseudo-Word HuBERT (PW-HuBERT), a framework that integrates pseudo word-level targets into the training process.
Our experimental results on four spoken language understanding (SLU) benchmarks suggest the superiority of our model in capturing semantic information.
arXiv Detail & Related papers (2024-02-08T16:55:21Z)
- Improving Speaker Diarization using Semantic Information: Joint Pairwise
Constraints Propagation [53.01238689626378]
We propose a novel approach to leverage semantic information in speaker diarization systems.
We introduce spoken language understanding modules to extract speaker-related semantic information.
We present a novel framework to integrate these constraints into the speaker diarization pipeline.
arXiv Detail & Related papers (2023-09-19T09:13:30Z)
- Beyond Transmitting Bits: Context, Semantics, and Task-Oriented
Communications [88.68461721069433]
Next generation systems can be potentially enriched by folding message semantics and goals of communication into their design.
This tutorial summarizes the efforts to date, starting from its early adaptations, semantic-aware and task-oriented communications.
The focus is on approaches that utilize information theory to provide the foundations, as well as the significant role of learning in semantics and task-aware communications.
arXiv Detail & Related papers (2022-07-19T16:00:57Z)
- Direction-Aware Joint Adaptation of Neural Speech Enhancement and
Recognition in Real Multiparty Conversational Environments [21.493664174262737]
This paper describes noisy speech recognition for an augmented reality headset that helps verbal communication within real multiparty conversational environments.
We propose a semi-supervised adaptation method that jointly updates the mask estimator and the ASR model at run-time using clean speech signals with ground-truth transcriptions and noisy speech signals with highly-confident estimated transcriptions.
arXiv Detail & Related papers (2022-07-15T03:43:35Z)
- Learning to Mediate Disparities Towards Pragmatic Communication [9.321336642983875]
We propose Pragmatic Rational Speaker (PRS) as a framework for building AI agents with similar abilities in language communication.
The PRS attempts to learn the speaker-listener disparity and adjust the speech accordingly, by adding a light-weighted disparity adjustment layer into working memory.
By fixing the long-term memory, the PRS only needs to update its working memory to learn and adapt to different types of listeners.
arXiv Detail & Related papers (2022-03-25T14:46:43Z)
- Common Language for Goal-Oriented Semantic Communications: A Curriculum
Learning Framework [66.81698651016444]
A comprehensive semantic communications framework is proposed for enabling goal-oriented task execution.
A novel top-down framework that combines curriculum learning (CL) and reinforcement learning (RL) is proposed to solve this problem.
Simulation results show that the proposed CL method outperforms traditional RL in terms of convergence time, task execution time, and transmission cost during training.
arXiv Detail & Related papers (2021-11-15T19:13:55Z)
- Pre-training for Spoken Language Understanding with Joint Textual and
Phonetic Representation Learning [4.327558819000435]
We propose a novel joint textual-phonetic pre-training approach for learning spoken language representations.
Experimental results on spoken language understanding benchmarks, Fluent Speech Commands and SNIPS, show that the proposed approach significantly outperforms strong baseline models.
arXiv Detail & Related papers (2021-04-21T05:19:13Z)