Generic Dependency Modeling for Multi-Party Conversation
- URL: http://arxiv.org/abs/2302.10680v1
- Date: Tue, 21 Feb 2023 13:58:19 GMT
- Title: Generic Dependency Modeling for Multi-Party Conversation
- Authors: Weizhou Shen, Xiaojun Quan, Ke Yang
- Abstract summary: We present an approach to encoding the dependencies in the form of relative dependency encoding (ReDE) and show how to implement it in Transformers by modifying the computation of self-attention.
- Score: 32.25605889407403
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: To model the dependencies between utterances in multi-party conversations, we
propose a simple and generic framework based on the dependency parsing results
of utterances. Particularly, we present an approach to encoding the
dependencies in the form of relative dependency encoding (ReDE) and illustrate
how to implement it in Transformers by modifying the computation of
self-attention. Experimental results on four multi-party conversation
benchmarks show that this framework successfully boosts the general performance
of two Transformer-based language models and leads to comparable or even
superior performance compared to the state-of-the-art methods. The code is available at https://github.com/shenwzh3/ReDE.
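
To make the mechanism concrete, below is a minimal PyTorch-style sketch of dependency-aware self-attention. It assumes ReDE behaves like relative position encodings, i.e. a learned per-head bias indexed by the clipped dependency distance between the utterances that two tokens belong to; the class name, tensor shapes, and the exact way the bias enters the attention logits are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch: dependency-aware self-attention with a relative dependency bias.
# Assumption: ReDE adds a learned per-head bias to the attention logits, indexed
# by the clipped dependency distance between the utterances of the two tokens.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DependencyAwareSelfAttention(nn.Module):
    def __init__(self, d_model: int, n_heads: int, max_dep_dist: int = 4):
        super().__init__()
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)
        self.max_dep_dist = max_dep_dist
        # one learnable bias per head for every clipped signed dependency distance
        self.dep_bias = nn.Embedding(2 * max_dep_dist + 1, n_heads)

    def forward(self, x: torch.Tensor, dep_dist: torch.LongTensor) -> torch.Tensor:
        # x: (batch, seq, d_model)
        # dep_dist: (batch, seq, seq) signed dependency distance between the
        # utterances containing tokens i and j, taken from a dependency parse
        b, t, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q = q.view(b, t, self.n_heads, self.d_head).transpose(1, 2)
        k = k.view(b, t, self.n_heads, self.d_head).transpose(1, 2)
        v = v.view(b, t, self.n_heads, self.d_head).transpose(1, 2)

        scores = q @ k.transpose(-2, -1) / self.d_head ** 0.5           # (b, h, t, t)
        clipped = dep_dist.clamp(-self.max_dep_dist, self.max_dep_dist)
        bias = self.dep_bias(clipped + self.max_dep_dist)               # (b, t, t, h)
        scores = scores + bias.permute(0, 3, 1, 2)                      # add per-head bias

        attn = F.softmax(scores, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, t, -1)
        return self.out(out)
```

Clipping the distance keeps the bias table small, mirroring how relative position encodings are usually handled; the dependency distances themselves would come from the utterance-level dependency parsing step described in the abstract.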
Related papers
- Dependency Transformer Grammars: Integrating Dependency Structures into Transformer Language Models [42.46104516313823]
Dependency Transformer Grammars (DTGs) are a new class of Transformer language model with explicit dependency-based inductive bias.
DTGs simulate dependency transition systems with constrained attention patterns.
They achieve better generalization while maintaining comparable perplexity with Transformer language model baselines.
arXiv Detail & Related papers (2024-07-24T16:38:38Z)
- Multi-Convformer: Extending Conformer with Multiple Convolution Kernels [64.4442240213399]
We introduce Multi-Convformer that uses multiple convolution kernels within the convolution module of the Conformer in conjunction with gating.
Our model rivals existing Conformer variants such as CgMLP and E-Branchformer in performance, while being more parameter efficient.
We empirically compare our approach with Conformer and its variants across four different datasets and three different modelling paradigms and show up to 8% relative word error rate (WER) improvements.
arXiv Detail & Related papers (2024-07-04T08:08:12Z)
- Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding [41.928263518867816]
Conformer has proven to be effective in many speech processing tasks.
Inspired by this, we propose a more flexible, interpretable and customizable encoder alternative, Branchformer.
arXiv Detail & Related papers (2022-07-06T21:08:10Z) - BenchCLAMP: A Benchmark for Evaluating Language Models on Syntactic and
Semantic Parsing [55.058258437125524]
We introduce BenchCLAMP, a Benchmark to evaluate Constrained LAnguage Model Parsing.
We benchmark eight language models, including two GPT-3 variants available only through an API.
Our experiments show that encoder-decoder pretrained language models can achieve similar performance or surpass state-of-the-art methods for syntactic and semantic parsing when the model output is constrained to be valid.
arXiv Detail & Related papers (2022-06-21T18:34:11Z) - ReSTR: Convolution-free Referring Image Segmentation Using Transformers [80.9672131755143]
We present the first convolution-free model for referring image segmentation using transformers, dubbed ReSTR.
Since it extracts features of both modalities through transformer encoders, ReSTR can capture long-range dependencies between entities within each modality.
Also, ReSTR fuses features of the two modalities by a self-attention encoder, which enables flexible and adaptive interactions between the two modalities in the fusion process.
arXiv Detail & Related papers (2022-03-31T02:55:39Z) - Unifying Discourse Resources with Dependency Framework [18.498060350460463]
We unify Chinese discourse corpora annotated under different schemes with a discourse dependency framework.
We implement several benchmark dependency parsers and study how they can leverage the unified data to improve performance.
arXiv Detail & Related papers (2021-01-01T05:23:29Z) - StructFormer: Joint Unsupervised Induction of Dependency and
Constituency Structure from Masked Language Modeling [45.96663013609177]
We introduce a novel model, StructFormer, that can induce dependency and constituency structure at the same time.
We integrate the induced dependency relations into the transformer, in a differentiable manner, through a novel dependency-constrained self-attention mechanism.
Experimental results show that our model can achieve strong results on unsupervised constituency parsing, unsupervised dependency parsing, and masked language modeling.
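
As a rough, hedged illustration of what a dependency-constrained self-attention step can look like, the sketch below gates ordinary attention weights with a soft dependency matrix and renormalizes them; the function name and the way the dependency probabilities are produced are assumptions, and StructFormer's actual parameterization of the induced dependency distributions is more involved.

```python
# Hedged sketch of dependency-constrained self-attention: standard attention
# weights are gated by a soft dependency matrix (probability that token j is
# related to token i) and then renormalized. Illustrative only.
import torch
import torch.nn.functional as F

def dependency_constrained_attention(q, k, v, dep_prob, eps=1e-9):
    # q, k, v: (batch, heads, seq, d_head); dep_prob: (batch, seq, seq) in [0, 1]
    scores = q @ k.transpose(-2, -1) / q.size(-1) ** 0.5
    attn = F.softmax(scores, dim=-1) * dep_prob.unsqueeze(1)  # gate by dependencies
    attn = attn / (attn.sum(dim=-1, keepdim=True) + eps)      # renormalize each row
    return attn @ v
```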
arXiv Detail & Related papers (2020-12-01T21:54:51Z) - Multi-turn Response Selection using Dialogue Dependency Relations [39.99448321736736]
Multi-turn response selection is a task designed for developing dialogue agents.
We propose a dialogue extraction algorithm to transform a dialogue history into threads based on their dependency relations.
Our model outperforms the state-of-the-art baselines on both DSTC7 and DSTC8*, with competitive results on Ubuntu.
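
The extraction step can be pictured with a small, generic sketch: given each utterance's dependency link (the utterance it replies to, or none for a thread starter), follow the links to group the history into threads. This is a reconstruction of the general idea, not the paper's exact algorithm.

```python
# Generic sketch: split a multi-party dialogue history into threads by following
# utterance-level dependency (reply-to) links. Not the paper's exact algorithm.
def extract_threads(utterances, parent):
    # utterances: list of strings
    # parent: list where parent[i] is the index of the utterance that utterance i
    #         depends on, or None if it starts a new thread
    threads = []
    for i, p in enumerate(parent):
        if p is None:                      # a new thread starts here
            threads.append([i])
        else:                              # attach to the thread containing the parent
            for thread in threads:
                if p in thread:
                    thread.append(i)
                    break
    return [[utterances[i] for i in thread] for thread in threads]

# Example: utterance 2 replies to 0, utterance 3 replies to 1.
print(extract_threads(["hi", "hello", "how are you?", "any update?"],
                      [None, None, 0, 1]))
# -> [['hi', 'how are you?'], ['hello', 'any update?']]
```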
arXiv Detail & Related papers (2020-10-04T08:00:19Z) - GRIT: Generative Role-filler Transformers for Document-level Event
Entity Extraction [134.5580003327839]
We introduce a generative transformer-based encoder-decoder framework (GRIT) to model context at the document level.
We evaluate our approach on the MUC-4 dataset, and show that our model performs substantially better than prior work.
arXiv Detail & Related papers (2020-08-21T01:07:36Z) - Coreferential Reasoning Learning for Language Representation [88.14248323659267]
We present CorefBERT, a novel language representation model that can capture the coreferential relations in context.
The experimental results show that, compared with existing baseline models, CorefBERT can achieve significant improvements consistently on various downstream NLP tasks.
arXiv Detail & Related papers (2020-04-15T03:57:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.