CrossTalk: Enhancing Communication and Collaboration in
Videoconferencing with Intent Recognition from Conversational Speech
- URL: http://arxiv.org/abs/2308.03311v1
- Date: Mon, 7 Aug 2023 05:40:01 GMT
- Authors: Haijun Xia, Tony Wang, Aditya Gunturu, Peiling Jiang, William Duan,
Xiaoshuo Yao
- Abstract summary: We envision digital communication media as proactive facilitators that can provide unobtrusive assistance to enhance communication and collaboration.
We propose three key design concepts to explore the systematic integration of intelligence into communication and collaboration.
We developed CrossTalk, a videoconferencing system that instantiates these concepts, which was found to enable a more fluid and flexible communication and collaboration experience.
- Score: 3.333406057333272
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the advances and ubiquity of digital communication media such as
videoconferencing and virtual reality, they remain oblivious to the rich
intentions expressed by users. Beyond transmitting audio, videos, and messages,
we envision digital communication media as proactive facilitators that can
provide unobtrusive assistance to enhance communication and collaboration.
Informed by the results of a formative study, we propose three key design
concepts to explore the systematic integration of intelligence into
communication and collaboration, including the panel substrate, language-based
intent recognition, and lightweight interaction techniques. We developed
CrossTalk, a videoconferencing system that instantiates these concepts, which
was found to enable a more fluid and flexible communication and collaboration
experience.
Related papers
- Semantic Communication Enabled Holographic Video Processing and Transmission [80.02919983620494]
This article provides an overview of holographic video communication and outlines the requirements of a holographic video communication system. Key technologies, including semantic sampling, joint semantic-channel coding, and semantic-aware transmission, are designed based on the proposed architecture.
arXiv Detail & Related papers (2025-10-15T11:06:48Z) - Seamless Interaction: Dyadic Audiovisual Motion Modeling and Large-Scale Dataset [113.25650486482762]
We introduce the Seamless Interaction dataset, a large-scale collection of over 4,000 hours of face-to-face interaction footage. This dataset enables the development of AI technologies that understand dyadic embodied dynamics. We develop a suite of models that utilize the dataset to generate dyadic motion gestures and facial expressions aligned with human speech.
arXiv Detail & Related papers (2025-06-27T18:09:49Z) - Towards Developmentally Plausible Rewards: Communicative Success as a Learning Signal for Interactive Language Models [49.22720751953838]
We propose a method for training language models in an interactive setting inspired by child language acquisition. In our setting, a speaker attempts to communicate some information to a listener in a single-turn dialogue and receives a reward if communicative success is achieved.
arXiv Detail & Related papers (2025-05-09T11:48:36Z) - Your voice is your voice: Supporting Self-expression through Speech Generation and LLMs in Augmented and Alternative Communication [9.812902134556971]
Speak Ease is an augmentative and alternative communication system to support users' expressivity.
The system integrates multimodal input, including text, voice, and contextual cues, with large language models.
arXiv Detail & Related papers (2025-03-21T18:50:05Z) - VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction [105.88658935310605]
We propose a multi-stage training methodology that progressively trains an LLM to understand both visual and speech information.
Our approach not only preserves strong vision-language capacity, but also enables efficient speech-to-speech dialogue capabilities.
By comparing our method against state-of-the-art counterparts across benchmarks for image, video, and speech tasks, we demonstrate that our model is equipped with both strong visual and speech capabilities.
arXiv Detail & Related papers (2025-01-03T18:59:52Z) - Pragmatic Communication in Multi-Agent Collaborative Perception [80.14322755297788]
Collaborative perception results in a trade-off between perception ability and communication costs.
We propose PragComm, a multi-agent collaborative perception system with two key components.
PragComm consistently outperforms previous methods with more than 32.7K times lower communication volume.
arXiv Detail & Related papers (2024-01-23T11:58:08Z) - Will 6G be Semantic Communications? Opportunities and Challenges from
Task Oriented and Secure Communications to Integrated Sensing [49.83882366499547]
This paper explores opportunities and challenges of task (goal)-oriented and semantic communications for next-generation (NextG) networks through the integration of multi-task learning.
We employ deep neural networks representing a dedicated encoder at the transmitter and multiple task-specific decoders at the receiver.
We scrutinize potential vulnerabilities stemming from adversarial attacks during both training and testing phases.
arXiv Detail & Related papers (2024-01-03T04:01:20Z) - Learning Multi-Agent Communication with Contrastive Learning [3.816854668079928]
We introduce an alternative perspective where communicative messages are considered as different incomplete views of the environment state.
By examining the relationship between messages sent and received, we propose to learn to communicate using contrastive learning.
In communication-essential environments, our method outperforms previous work in both performance and learning speed.
arXiv Detail & Related papers (2023-07-03T23:51:05Z) - CAMEL: Communicative Agents for "Mind" Exploration of Large Language
Model Society [58.04479313658851]
This paper explores the potential of building scalable techniques to facilitate autonomous cooperation among communicative agents.
We propose a novel communicative agent framework named role-playing.
Our contributions include introducing a novel communicative agent framework, offering a scalable approach for studying the cooperative behaviors and capabilities of multi-agent systems.
arXiv Detail & Related papers (2023-03-31T01:09:00Z) - Cognitive Semantic Communication Systems Driven by Knowledge Graph:
Principle, Implementation, and Performance Evaluation [74.38561925376996]
Two cognitive semantic communication frameworks are proposed for the single-user and multiple-user communication scenarios.
An effective semantic correction algorithm is proposed by mining the inference rule from the knowledge graph.
For the multi-user cognitive semantic communication system, a message recovery algorithm is proposed to distinguish messages of different users.
arXiv Detail & Related papers (2023-03-15T12:01:43Z) - On the Role of Emergent Communication for Social Learning in Multi-Agent
Reinforcement Learning [0.0]
Social learning uses cues from experts to align heterogeneous policies, reduce sample complexity, and solve partially observable tasks.
This paper proposes an unsupervised method based on the information bottleneck to capture both referential complexity and task-specific utility.
arXiv Detail & Related papers (2023-02-28T03:23:27Z) - Over-communicate no more: Situated RL agents learn concise communication
protocols [78.28898217947467]
It is unclear how to design artificial agents that can learn to effectively and efficiently communicate with each other.
Much research on communication emergence uses reinforcement learning (RL).
We explore situated communication in a multi-step task, where the acting agent has to forgo an environmental action to communicate.
We find that while all tested pressures can disincentivise over-communication, situated communication does it most effectively and, unlike the cost on effort, does not negatively impact emergence.
arXiv Detail & Related papers (2022-11-02T21:08:14Z) - Beyond Transmitting Bits: Context, Semantics, and Task-Oriented
Communications [88.68461721069433]
Next generation systems can be potentially enriched by folding message semantics and goals of communication into their design.
This tutorial summarizes the efforts to date, starting from its early adaptations, semantic-aware and task-oriented communications.
The focus is on approaches that utilize information theory to provide the foundations, as well as the significant role of learning in semantics and task-aware communications.
arXiv Detail & Related papers (2022-07-19T16:00:57Z) - Learning Emergent Discrete Message Communication for Cooperative
Reinforcement Learning [36.468498804251574]
We show that discrete message communication has performance comparable to continuous message communication.
We propose an approach that allows humans to interactively send discrete messages to agents.
arXiv Detail & Related papers (2021-02-24T20:44:14Z) - Effective Communications: A Joint Learning and Communication Framework
for Multi-Agent Reinforcement Learning over Noisy Channels [0.0]
We propose a novel formulation of the "effectiveness problem" in communications.
We consider multiple agents communicating over a noisy channel in order to achieve better coordination and cooperation.
We show via examples that the joint policy learned using the proposed framework is superior to that where the communication is considered separately.
arXiv Detail & Related papers (2021-01-02T10:43:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.