Towards Relation Extraction From Speech
- URL: http://arxiv.org/abs/2210.08759v1
- Date: Mon, 17 Oct 2022 05:53:49 GMT
- Title: Towards Relation Extraction From Speech
- Authors: Tongtong Wu, Guitao Wang, Jinming Zhao, Zhaoran Liu, Guilin Qi,
Yuan-Fang Li, Gholamreza Haffari
- Abstract summary: We propose a new listening information extraction task, i.e., speech relation extraction.
We construct the training dataset for speech relation extraction via text-to-speech systems, and we construct the testing dataset via crowd-sourcing with native English speakers.
We conduct comprehensive experiments to identify the challenges in speech relation extraction, which may shed light on future explorations.
- Score: 56.36416922396724
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Relation extraction typically aims to extract semantic relationships between
entities from unstructured text. One of the most essential data sources for
relation extraction is spoken language, such as interviews and dialogues.
However, the error propagation introduced by automatic speech recognition (ASR)
has largely been ignored in relation extraction, and end-to-end speech-based
relation extraction methods have rarely been explored. In this paper, we propose
a new listening information extraction task, i.e., speech relation extraction.
We construct the training dataset for speech relation extraction via
text-to-speech systems, and we construct the testing dataset via crowd-sourcing
with native English speakers. We explore speech relation extraction via two
approaches: a pipeline approach that performs text-based relation extraction on
the output of a pretrained ASR module, and an end-to-end approach based on a
newly proposed encoder-decoder model, which we call SpeechRE. We conduct comprehensive
experiments to identify the challenges in speech relation extraction, which
may shed light on future explorations. We share the code and data on
https://github.com/wutong8023/SpeechRE.
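As a rough illustration of the two settings described above (a sketch only, not the authors' implementation; see the linked repository for the actual code), the pipeline approach transcribes speech with a pretrained ASR module and feeds the transcript to a text-based relation extractor, so ASR errors propagate into extraction, while the end-to-end approach maps speech directly to a linearized triplet sequence. The helper names and the "head | relation | tail" output format below are assumptions for illustration.

```python
# Minimal sketch contrasting the pipeline and end-to-end SpeechRE settings.
# `asr_transcribe`, `text_relation_extractor`, and `speech_encoder_decoder`
# are hypothetical stand-ins, not the SpeechRE repository's API.
from typing import Callable, List, Tuple

Triplet = Tuple[str, str, str]  # (head entity, relation label, tail entity)

def pipeline_speech_re(audio,
                       asr_transcribe: Callable[[object], str],
                       text_relation_extractor: Callable[[str], List[Triplet]]) -> List[Triplet]:
    """Pipeline setting: ASR first, then text-based relation extraction.
    A misrecognized entity name in the transcript propagates to the extractor."""
    transcript = asr_transcribe(audio)            # pretrained ASR module
    return text_relation_extractor(transcript)    # pretrained text RE model

def end2end_speech_re(audio,
                      speech_encoder_decoder: Callable[[object], str]) -> List[Triplet]:
    """End-to-end setting: one encoder-decoder generates triplets from speech,
    with no explicit transcription step in between."""
    generated = speech_encoder_decoder(audio)     # assumed format: "head | relation | tail ; ..."
    triplets = []
    for chunk in generated.split(";"):
        parts = [p.strip() for p in chunk.split("|")]
        if len(parts) == 3:
            triplets.append((parts[0], parts[1], parts[2]))
    return triplets
```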
Related papers
- Learning Speech Representation From Contrastive Token-Acoustic
Pretraining [57.08426714676043]
We propose "Contrastive Token-Acoustic Pretraining (CTAP)", which uses two encoders to bring phoneme and speech into a joint multimodal space.
The proposed CTAP model is trained on 210k speech-phoneme pairs, achieving minimally supervised text-to-speech (TTS), voice conversion (VC), and ASR.
arXiv Detail & Related papers (2023-09-01T12:35:43Z)
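The CTAP entry above describes a contrastive objective that pulls paired phoneme and speech representations into a joint space. A minimal sketch of such a symmetric contrastive loss is shown below; the pooled two-encoder setup and the temperature value are assumptions, and CTAP's actual frame-level objective may differ.

```python
import torch
import torch.nn.functional as F

def contrastive_token_acoustic_loss(speech_emb: torch.Tensor,
                                    phoneme_emb: torch.Tensor,
                                    temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE-style loss over a batch of paired embeddings.
    speech_emb, phoneme_emb: (batch, dim) outputs of two separate encoders.
    Matched pairs sit on the diagonal of the similarity matrix."""
    speech_emb = F.normalize(speech_emb, dim=-1)
    phoneme_emb = F.normalize(phoneme_emb, dim=-1)
    logits = speech_emb @ phoneme_emb.t() / temperature          # (batch, batch)
    targets = torch.arange(speech_emb.size(0), device=speech_emb.device)
    loss_s2p = F.cross_entropy(logits, targets)                  # speech -> phoneme
    loss_p2s = F.cross_entropy(logits.t(), targets)              # phoneme -> speech
    return (loss_s2p + loss_p2s) / 2
```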
- ESSumm: Extractive Speech Summarization from Untranscribed Meeting [7.309214379395552]
We propose a novel architecture for direct extractive speech-to-speech summarization, ESSumm.
We leverage the off-the-shelf self-supervised convolutional neural network to extract the deep speech features from raw audio.
Our approach automatically predicts the optimal sequence of speech segments that capture the key information with a target summary length.
arXiv Detail & Related papers (2022-09-14T20:13:15Z)
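ESSumm is summarized above as selecting speech segments that fit a target summary length from self-supervised audio features. The sketch below illustrates one generic way to do extractive segment selection (centroid-similarity scoring under a duration budget); the scoring rule is an assumption and not necessarily ESSumm's objective.

```python
import numpy as np

def select_summary_segments(segment_features, segment_durations, target_length):
    """Greedy extractive selection over speech segments.
    segment_features: (n_segments, dim) array of per-segment embeddings.
    segment_durations: per-segment lengths in seconds.
    target_length: summary duration budget in seconds."""
    feats = np.asarray(segment_features, dtype=np.float32)
    feats /= np.linalg.norm(feats, axis=1, keepdims=True) + 1e-8
    centroid = feats.mean(axis=0)
    centroid /= np.linalg.norm(centroid) + 1e-8
    scores = feats @ centroid                       # similarity to the recording's centroid
    chosen, total = [], 0.0
    for idx in np.argsort(-scores):                 # most representative first
        if total + segment_durations[idx] <= target_length:
            chosen.append(int(idx))
            total += segment_durations[idx]
    return sorted(chosen)                           # restore original temporal order
```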
- Leveraging Acoustic Contextual Representation by Audio-textual Cross-modal Learning for Conversational ASR [25.75615870266786]
We propose an audio-textual cross-modal representation extractor to learn contextual representations directly from preceding speech.
The effectiveness of the proposed approach is validated on several Mandarin conversation corpora.
arXiv Detail & Related papers (2022-07-03T13:32:24Z)
- End-to-End Active Speaker Detection [58.7097258722291]
We propose an end-to-end training network where feature learning and contextual predictions are jointly learned.
We also introduce intertemporal graph neural network (iGNN) blocks, which split the message passing according to the main sources of context in the ASD problem.
Experiments show that the aggregated features from the iGNN blocks are more suitable for ASD, resulting in state-of-the-art performance.
arXiv Detail & Related papers (2022-03-27T08:55:28Z)
- RelationPrompt: Leveraging Prompts to Generate Synthetic Data for Zero-Shot Relation Triplet Extraction [65.4337085607711]
We introduce the task setting of Zero-Shot Relation Triplet Extraction (ZeroRTE).
Given an input sentence, each extracted triplet consists of a head entity, relation label, and tail entity, where the relation label is not seen at the training stage.
We propose to synthesize relation examples by prompting language models to generate structured texts.
arXiv Detail & Related papers (2022-03-17T05:55:14Z)
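RelationPrompt is summarized above as prompting a language model to generate structured text that is then parsed into synthetic triplet examples for unseen relation labels. A rough sketch of that idea follows; the prompt wording, output format, and `generate` callable are illustrative assumptions rather than the paper's templates.

```python
import re
from typing import Callable, Dict, List

def synthesize_relation_examples(relation_label: str,
                                 generate: Callable[[str], str],
                                 num_samples: int = 5) -> List[Dict]:
    """Prompt a generative LM for sentences expressing an unseen relation and
    parse them into (head, relation, tail) training triplets."""
    prompt = (
        f"Relation: {relation_label}\n"
        "Write one sentence expressing this relation.\n"
        "Format: Sentence: <text> Head: <entity> Tail: <entity>\n"
    )
    pattern = re.compile(r"Sentence:\s*(.+?)\s*Head:\s*(.+?)\s*Tail:\s*(.+)", re.S)
    examples = []
    for _ in range(num_samples):
        match = pattern.search(generate(prompt))
        if match is None:
            continue                                  # skip malformed generations
        sentence, head, tail = (g.strip() for g in match.groups())
        examples.append({"sentence": sentence,
                         "triplet": (head, relation_label, tail)})
    return examples
```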
- Extracting and filtering paraphrases by bridging natural language inference and paraphrasing [0.0]
We propose a novel methodology for the extraction of paraphrasing datasets from NLI datasets and cleaning existing paraphrasing datasets.
The results show the high quality of the extracted paraphrasing datasets and surprisingly high noise levels in two existing paraphrasing datasets.
arXiv Detail & Related papers (2021-11-13T14:06:37Z)
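The entry above describes extracting paraphrase pairs from NLI data. A common way to operationalize this is sketched below, under the assumption that bidirectional entailment is used as the paraphrase criterion; the paper's exact filtering rules may differ.

```python
from typing import Callable, Iterable, List, Tuple

def extract_paraphrase_pairs(nli_pairs: Iterable[Tuple[str, str]],
                             entailment_prob: Callable[[str, str], float],
                             threshold: float = 0.9) -> List[Tuple[str, str]]:
    """Keep sentence pairs that entail each other in both directions.
    entailment_prob is any callable returning P(entailment | premise, hypothesis),
    e.g. a pretrained NLI classifier (a stand-in here, not a specific API)."""
    paraphrases = []
    for premise, hypothesis in nli_pairs:
        if (entailment_prob(premise, hypothesis) >= threshold
                and entailment_prob(hypothesis, premise) >= threshold):
            paraphrases.append((premise, hypothesis))
    return paraphrases
```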
- Keyword Extraction for Improved Document Retrieval in Conversational Search [10.798537120200006]
Mixed-initiative conversational search provides enormous advantages.
However, incorporating the additional information provided by the user during the conversation poses some challenges.
We have collected two conversational keyword extraction datasets and propose an end-to-end document retrieval pipeline incorporating them.
arXiv Detail & Related papers (2021-09-13T13:55:37Z)
- D-REX: Dialogue Relation Extraction with Explanations [65.3862263565638]
This work focuses on extracting explanations that indicate that a relation exists while using only partially labeled data.
We propose our model-agnostic framework, D-REX, a policy-guided semi-supervised algorithm that explains and ranks relations.
We find that about 90% of the time, human annotators prefer D-REX's explanations over a strong BERT-based joint relation extraction and explanation model.
arXiv Detail & Related papers (2021-09-10T22:30:48Z)
- Improving Sentence-Level Relation Extraction through Curriculum Learning [7.117139527865022]
We propose a curriculum learning-based relation extraction model that splits the data by difficulty and utilizes it for learning.
In the experiments with the representative sentence-level relation extraction datasets, TACRED and Re-TACRED, the proposed method showed strong performance.
arXiv Detail & Related papers (2021-07-20T08:44:40Z)
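The curriculum learning entry above amounts to ordering relation-extraction examples from easy to hard and feeding them to the model in stages. The sketch below uses an arbitrary difficulty callable (e.g. a baseline model's loss on each example) as an assumed stand-in for the paper's actual difficulty criterion.

```python
from typing import Callable, Iterator, List, Sequence

def build_curriculum(examples: Sequence, difficulty: Callable[[object], float],
                     num_stages: int = 3) -> List[List]:
    """Split a training set into stages of increasing difficulty."""
    ranked = sorted(examples, key=difficulty)
    stage_size = max(1, len(ranked) // num_stages)
    stages = [list(ranked[i * stage_size:(i + 1) * stage_size])
              for i in range(num_stages - 1)]
    stages.append(list(ranked[(num_stages - 1) * stage_size:]))   # remainder in last stage
    return stages

def curriculum_schedule(stages: List[List]) -> Iterator[List]:
    """Yield cumulative training pools: easy examples first, harder ones added later."""
    pool = []
    for stage in stages:
        pool = pool + stage
        yield pool
```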
- Continuous speech separation: dataset and analysis [52.10378896407332]
In natural conversations, a speech signal is continuous, containing both overlapped and overlap-free components.
This paper describes a dataset and protocols for evaluating continuous speech separation algorithms.
arXiv Detail & Related papers (2020-01-30T18:01:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.