FF2: A Feature Fusion Two-Stream Framework for Punctuation Restoration
- URL: http://arxiv.org/abs/2211.04699v1
- Date: Wed, 9 Nov 2022 06:18:17 GMT
- Title: FF2: A Feature Fusion Two-Stream Framework for Punctuation Restoration
- Authors: Yangjun Wu, Kebin Fang, Yao Zhao, Hao Zhang, Lifeng Shi, Mengqi Zhang
- Abstract summary: We propose a Feature Fusion two-stream framework (FF2) for punctuation restoration.
Specifically, one stream leverages a pre-trained language model to capture the semantic feature, while another auxiliary module captures the feature at hand.
Without additional data, the experimental results on the popular benchmark IWSLT demonstrate that FF2 achieves new SOTA performance.
- Score: 27.14686854704104
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: To accomplish punctuation restoration, most existing methods focus on
introducing extra information (e.g., part-of-speech) or addressing the class
imbalance problem. Recently, large-scale transformer-based pre-trained language
models (PLMs) have been widely utilized and have obtained remarkable success.
However, PLMs are pre-trained on large corpora that contain punctuation marks,
which may not fit well with small downstream datasets without marks, causing
convergence to be suboptimal. In this study, we propose a Feature Fusion two-stream framework
(FF2) to bridge the gap. Specifically, one stream leverages a pre-trained
language model to capture the semantic feature, while another auxiliary module
captures the feature at hand. We also modify the computation of multi-head
attention to encourage communication among heads. Then, two features with
different perspectives are aggregated to fuse information and enhance context
awareness. Without additional data, the experimental results on the popular
benchmark IWSLT demonstrate that FF2 achieves new SOTA performance, which
verifies that our approach is effective.
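The architecture described above lends itself to a compact sketch. Below is a minimal, hypothetical PyTorch rendering of such a two-stream fusion layer: one stream consumes token-level hidden states from any pre-trained encoder, the auxiliary stream is learned from scratch with a multi-head attention variant that mixes attention scores across heads (a talking-heads-style stand-in for the paper's head-communication modification), and the two features are concatenated and projected before token-level punctuation classification. All class and parameter names (FeatureFusionTwoStream, HeadMixingAttention, aux_dim, etc.) are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a two-stream feature-fusion layer for punctuation
# restoration, in the spirit of FF2. The PLM stream is any pre-trained encoder
# producing token-level hidden states; the auxiliary stream and the head-mixing
# attention below are illustrative stand-ins, not the authors' exact design.
import torch
import torch.nn as nn


class HeadMixingAttention(nn.Module):
    """Self-attention with a learned mixing of attention scores across heads
    (talking-heads-style), standing in for FF2's head-communication change."""

    def __init__(self, d_model: int, num_heads: int):
        super().__init__()
        assert d_model % num_heads == 0
        self.h = num_heads
        self.d_k = d_model // num_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.mix = nn.Linear(num_heads, num_heads, bias=False)  # head-to-head mixing
        self.out = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # reshape to (batch, heads, time, d_k)
        q, k, v = (z.view(b, t, self.h, self.d_k).transpose(1, 2) for z in (q, k, v))
        scores = q @ k.transpose(-2, -1) / self.d_k ** 0.5       # (b, h, t, t)
        # let heads exchange information by mixing their score maps
        scores = self.mix(scores.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)
        attn = scores.softmax(dim=-1)
        ctx = (attn @ v).transpose(1, 2).reshape(b, t, -1)
        return self.out(ctx)


class FeatureFusionTwoStream(nn.Module):
    """Fuses PLM hidden states with an auxiliary stream learned from scratch,
    then classifies each token into punctuation labels (e.g. O , . ?)."""

    def __init__(self, plm_dim: int, aux_dim: int, vocab_size: int,
                 num_labels: int = 4, num_heads: int = 8):
        super().__init__()
        self.aux_embed = nn.Embedding(vocab_size, aux_dim)
        self.aux_attn = HeadMixingAttention(aux_dim, num_heads)
        self.fuse = nn.Linear(plm_dim + aux_dim, plm_dim)
        self.classifier = nn.Linear(plm_dim, num_labels)

    def forward(self, plm_hidden: torch.Tensor, token_ids: torch.Tensor) -> torch.Tensor:
        aux = self.aux_attn(self.aux_embed(token_ids))            # auxiliary stream
        fused = torch.tanh(self.fuse(torch.cat([plm_hidden, aux], dim=-1)))
        return self.classifier(fused)                             # (b, t, num_labels)


if __name__ == "__main__":
    model = FeatureFusionTwoStream(plm_dim=768, aux_dim=256, vocab_size=30522)
    plm_hidden = torch.randn(2, 16, 768)           # stand-in for PLM outputs
    token_ids = torch.randint(0, 30522, (2, 16))
    print(model(plm_hidden, token_ids).shape)      # torch.Size([2, 16, 4])
```

The head-mixing linear layer is one plausible way to encourage communication among heads; the paper's exact attention modification and fusion operator may differ.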
Related papers
- GMFL-Net: A Global Multi-geometric Feature Learning Network for Repetitive Action Counting [4.117416395116726]
We propose a simple but efficient Global Multi-geometric Feature Learning Network (GMFL-Net)
Specifically, we design a MIA-Module that aims to improve information representation by fusing multi-geometric features.
We also design a GBFL-Module that enhances the inter-dependencies between point-wise and channel-wise elements.
arXiv Detail & Related papers (2024-08-31T02:18:26Z)
- A Framework for Fine-Tuning LLMs using Heterogeneous Feedback [69.51729152929413]
We present a framework for fine-tuning large language models (LLMs) using heterogeneous feedback.
First, we combine the heterogeneous feedback data into a single supervision format, compatible with methods like SFT and RLHF.
Next, given this unified feedback dataset, we extract a high-quality and diverse subset to obtain performance increases.
arXiv Detail & Related papers (2024-08-05T23:20:32Z)
- Multi-scale Quaternion CNN and BiGRU with Cross Self-attention Feature Fusion for Fault Diagnosis of Bearing [5.3598912592106345]
Deep learning has led to significant advances in bearing fault diagnosis (FD)
We propose a novel FD model by integrating a multiscale quaternion convolutional neural network (MQCNN), a bidirectional gated recurrent unit (BiGRU), and cross self-attention feature fusion (CSAFF).
arXiv Detail & Related papers (2024-05-25T07:55:02Z)
- Class-Imbalanced Semi-Supervised Learning for Large-Scale Point Cloud Semantic Segmentation via Decoupling Optimization [64.36097398869774]
Semi-supervised learning (SSL) has been an active research topic for large-scale 3D scene understanding.
The existing SSL-based methods suffer from severe training bias due to class imbalance and long-tail distributions of the point cloud data.
We introduce a new decoupling optimization framework, which disentangles feature representation learning and classifier learning in an alternating optimization manner to shift the biased decision boundary effectively.
arXiv Detail & Related papers (2024-01-13T04:16:40Z)
- Towards A Unified View of Sparse Feed-Forward Network in Pretraining Large Language Model [58.9100867327305]
Large and sparse feed-forward layers (S-FFN) have proven effective in scaling up Transformer model size for pretraining large language models.
We analyzed two major design choices of S-FFN: the memory block (a.k.a. expert) size and the memory block selection method.
We found a simpler selection method -- Avg-K -- that selects blocks through their mean aggregated hidden states, achieving lower perplexity in language model pretraining.
arXiv Detail & Related papers (2023-05-23T12:28:37Z)
- Magic ELF: Image Deraining Meets Association Learning and Transformer [63.761812092934576]
This paper aims to unify CNN and Transformer to take advantage of their learning merits for image deraining.
A novel multi-input attention module (MAM) is proposed to associate rain removal and background recovery.
Our proposed method (dubbed ELF) outperforms the state-of-the-art approach (MPRNet) by 0.25 dB on average.
arXiv Detail & Related papers (2022-07-21T12:50:54Z)
- A Context-Aware Feature Fusion Framework for Punctuation Restoration [28.38472792385083]
We propose a novel Feature Fusion framework based on two-type Attentions (FFA) to alleviate the shortage of attention.
Experiments on the popular benchmark dataset IWSLT demonstrate that our approach is effective.
arXiv Detail & Related papers (2022-03-23T15:29:28Z)
- MHFC: Multi-Head Feature Collaboration for Few-Shot Learning [17.699793591135904]
Few-shot learning aims to address the data-scarce problem.
We propose Multi-Head Feature Collaboration (MHFC) algorithm, which attempts to project the multi-head features to a unified space.
We evaluate the proposed method on five benchmark datasets and achieve significant improvements of 2.1%-7.8% compared with state-of-the-art methods.
arXiv Detail & Related papers (2021-09-16T08:09:35Z)
- Deep F-measure Maximization for End-to-End Speech Understanding [52.36496114728355]
We propose a differentiable approximation to the F-measure and train the network with this objective using standard backpropagation.
We perform experiments on two standard fairness datasets (Adult, and Communities and Crime), and also on speech-to-intent detection on the ATIS dataset and speech-to-image concept classification on the Speech-COCO dataset.
In all four of these tasks, the F-measure objective yields improved micro-F1 scores, with absolute improvements of up to 8%, compared to models trained with the cross-entropy loss function; a minimal sketch of a differentiable F-measure loss appears after this list.
arXiv Detail & Related papers (2020-08-08T03:02:27Z)
- Prior Guided Feature Enrichment Network for Few-Shot Segmentation [64.91560451900125]
State-of-the-art semantic segmentation methods require sufficient labeled data to achieve good results.
Few-shot segmentation is proposed to tackle this problem by learning a model that quickly adapts to new classes with a few labeled support samples.
These frameworks still face the challenge of reduced generalization ability on unseen classes due to inappropriate use of high-level semantic information.
arXiv Detail & Related papers (2020-08-04T10:41:32Z)
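For the "Deep F-measure Maximization" entry above, a generic differentiable (soft) F1 surrogate illustrates the idea of training directly against the F-measure with standard backpropagation. This is a common soft-F1 formulation and only a sketch; the paper's exact approximation may differ, and the function name soft_f1_loss is hypothetical.

```python
# Generic sketch of a differentiable (soft) F-measure loss; a common soft-F1
# surrogate, not necessarily the paper's exact approximation.
import torch


def soft_f1_loss(logits: torch.Tensor, targets: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Soft micro-F1 loss for multi-class classification.

    logits:  (batch, num_classes) unnormalized scores
    targets: (batch,) integer class labels
    """
    probs = logits.softmax(dim=-1)                                    # soft predictions
    onehot = torch.nn.functional.one_hot(targets, logits.size(-1)).float()
    tp = (probs * onehot).sum()                                       # soft true positives
    fp = (probs * (1.0 - onehot)).sum()                               # soft false positives
    fn = ((1.0 - probs) * onehot).sum()                               # soft false negatives
    f1 = 2 * tp / (2 * tp + fp + fn + eps)
    return 1.0 - f1                                                   # minimize (1 - soft F1)


if __name__ == "__main__":
    logits = torch.randn(8, 5, requires_grad=True)
    targets = torch.randint(0, 5, (8,))
    loss = soft_f1_loss(logits, targets)
    loss.backward()                                                   # differentiable end to end
    print(float(loss))
```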