Improving Context Modeling in Neural Topic Segmentation
- URL: http://arxiv.org/abs/2010.03138v1
- Date: Wed, 7 Oct 2020 03:40:49 GMT
- Title: Improving Context Modeling in Neural Topic Segmentation
- Authors: Linzi Xing, Brad Hackinen, Giuseppe Carenini, Francesco Trebbi
- Abstract summary: We enhance a segmenter based on a hierarchical attention BiLSTM network to better model context.
Our optimized segmenter outperforms SOTA approaches when trained and tested on three datasets.
- Score: 18.92944038749279
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Topic segmentation is critical in key NLP tasks and recent works favor highly
effective neural supervised approaches. However, current neural solutions are
arguably limited in how they model context. In this paper, we enhance a
segmenter based on a hierarchical attention BiLSTM network to better model
context, by adding a coherence-related auxiliary task and restricted
self-attention. Our optimized segmenter outperforms SOTA approaches when
trained and tested on three datasets. We also demonstrate the robustness of our
proposed model in a domain transfer setting by training it on a large-scale dataset
and testing it on four challenging real-world benchmarks. Furthermore, we apply
our proposed strategy to two other languages (German and Chinese), and show its
effectiveness in multilingual scenarios.
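The restricted self-attention mentioned in the abstract limits each sentence to attending only to nearby sentences. A minimal numpy sketch of that idea is below; the window size, dot-product scoring, and function name are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def restricted_self_attention(X, window=2):
    """Self-attention over sentence vectors X (n x d), where each sentence
    may only attend to sentences within `window` positions of itself.
    Illustrative sketch: the paper's scoring function and parameters may differ."""
    n = X.shape[0]
    scores = X @ X.T                                  # raw dot-product scores
    idx = np.arange(n)
    allowed = np.abs(idx[:, None] - idx[None, :]) <= window
    scores = np.where(allowed, scores, -np.inf)       # mask distant sentences
    scores = scores - scores.max(axis=1, keepdims=True)  # stable softmax
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ X, weights
```

Restricting the window keeps the attention focused on local context, which matches the intuition that topic boundaries are signaled by nearby sentences rather than the whole document.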
Related papers
- A Large-Scale Evaluation of Speech Foundation Models [110.95827399522204]
We establish the Speech processing Universal PERformance Benchmark (SUPERB) to study the effectiveness of the foundation model paradigm for speech.
We propose a unified multi-tasking framework to address speech processing tasks in SUPERB using a frozen foundation model followed by task-specialized, lightweight prediction heads.
arXiv Detail & Related papers (2024-04-15T00:03:16Z) - On the Analysis of Cross-Lingual Prompt Tuning for Decoder-based
Multilingual Model [49.81429697921861]
We study the interaction between parameter-efficient fine-tuning (PEFT) and cross-lingual tasks in multilingual autoregressive models.
We show that prompt tuning is more effective in enhancing the performance of low-resource languages than fine-tuning.
arXiv Detail & Related papers (2023-11-14T00:43:33Z) - Exploiting Multilingualism in Low-resource Neural Machine Translation
via Adversarial Learning [3.2258463207097017]
Generative Adversarial Networks (GAN) offer a promising approach for Neural Machine Translation (NMT)
As in bilingual models, GAN-based multilingual NMT considers only one reference translation for each sentence during model training.
This article proposes a Denoising Adversarial Auto-encoder-based Sentence Interpolation (DAASI) approach to perform sentence interpolation.
arXiv Detail & Related papers (2023-03-31T12:34:14Z) - An Empirical Study on Multi-Domain Robust Semantic Segmentation [42.79166534691889]
We train a unified model that is expected to perform well across domains on several popular segmentation datasets.
Our solution ranks 2nd on the RVC 2022 semantic segmentation task, using a dataset only 1/3 the size of that used by the 1st-place model.
arXiv Detail & Related papers (2022-12-08T12:04:01Z) - Improving Topic Segmentation by Injecting Discourse Dependencies [29.353285741379334]
We present a discourse-aware neural topic segmentation model with the injection of above-sentence discourse dependency structures.
Our empirical study on English evaluation datasets shows that injecting above-sentence discourse structures into a neural topic segmenter can substantially improve its performance.
arXiv Detail & Related papers (2022-09-18T18:22:25Z) - Improving Pre-trained Language Model Fine-tuning with Noise Stability
Regularization [94.4409074435894]
We propose a novel and effective fine-tuning framework, named Layerwise Noise Stability Regularization (LNSR)
Specifically, we propose to inject standard Gaussian noise and regularize the hidden representations of the fine-tuned model.
We demonstrate the advantages of the proposed method over other state-of-the-art algorithms including L2-SP, Mixout and SMART.
arXiv Detail & Related papers (2022-06-12T04:42:49Z) - Towards Trustworthy Deception Detection: Benchmarking Model Robustness
across Domains, Modalities, and Languages [10.131671217810581]
We evaluate model robustness to out-of-domain data, modality-specific features, and languages other than English.
We find that with additional image content as input, ELMo embeddings yield significantly fewer errors compared to BERT or GloVe.
arXiv Detail & Related papers (2021-04-23T18:05:52Z) - Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z) - Dynamic Data Selection and Weighting for Iterative Back-Translation [116.14378571769045]
We propose a curriculum learning strategy for iterative back-translation models.
We evaluate our models on domain adaptation, low-resource, and high-resource MT settings.
Experimental results demonstrate that our methods achieve improvements of up to 1.8 BLEU points over competitive baselines.
arXiv Detail & Related papers (2020-04-07T19:49:58Z) - A Simple Baseline to Semi-Supervised Domain Adaptation for Machine
Translation [73.3550140511458]
State-of-the-art neural machine translation (NMT) systems are data-hungry and perform poorly on new domains with no supervised data.
We propose a simple but effective approach to the semi-supervised domain adaptation scenario of NMT.
This approach iteratively trains a Transformer-based NMT model via three training objectives: language modeling, back-translation, and supervised translation.
arXiv Detail & Related papers (2020-01-22T16:42:06Z)
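The noise stability regularization summarized in the LNSR entry above penalizes changes in hidden representations under injected Gaussian noise. A minimal sketch of that penalty follows; `hidden_fn`, `sigma`, and the mean-squared-difference form are illustrative assumptions rather than the paper's exact objective.

```python
import numpy as np

def noise_stability_penalty(hidden_fn, x, sigma=0.01, seed=0):
    """Sketch of a noise-stability penalty: inject standard Gaussian noise
    into the input and measure how much the hidden representation moves.
    `hidden_fn` is a hypothetical callable mapping inputs to hidden vectors."""
    rng = np.random.default_rng(seed)
    h_clean = hidden_fn(x)                                   # representation without noise
    h_noisy = hidden_fn(x + sigma * rng.standard_normal(x.shape))  # with injected noise
    return float(np.mean((h_noisy - h_clean) ** 2))          # stability penalty
```

Added to the fine-tuning loss, such a term discourages representations that are overly sensitive to small perturbations, which is the stability intuition the abstract describes.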
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.