A Context-Aware Feature Fusion Framework for Punctuation Restoration
- URL: http://arxiv.org/abs/2203.12487v1
- Date: Wed, 23 Mar 2022 15:29:28 GMT
- Title: A Context-Aware Feature Fusion Framework for Punctuation Restoration
- Authors: Yangjun Wu, Kebin Fang, Yao Zhao
- Abstract summary: We propose a novel Feature Fusion framework based on two-type Attentions (FFA) to alleviate the shortage of attention.
Experiments on the popular benchmark dataset IWSLT demonstrate that our approach is effective.
- Score: 28.38472792385083
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: To accomplish the punctuation restoration task, most existing approaches
have focused on leveraging extra information (e.g., part-of-speech tags) or
addressing the class imbalance problem. Recent works have widely applied
transformer-based language models and significantly improved their
effectiveness. To the best of our knowledge, an inherent issue has remained
neglected: the attention of individual heads in the transformer becomes diluted
or powerless when the model is fed long unpunctuated utterances. Since the
preceding context, rather than the following context, is comparatively more
valuable to the current position, it is hard to achieve a good balance with
independent attention.
In this paper, we propose a novel Feature Fusion framework based on two-type
Attentions (FFA) to alleviate this shortcoming. It introduces a two-stream
architecture. One module involves interaction between attention heads to
encourage communication among them, and another masked attention module captures
the dependent feature representation. The framework then aggregates the two
feature embeddings to fuse information and enhance context-awareness. Experiments on the popular
benchmark dataset IWSLT demonstrate that our approach is effective. Without
additional data, it obtains comparable performance to the current
state-of-the-art models.
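The abstract does not give implementation details, so the following is only a minimal pure-Python sketch of the two-stream idea under stated assumptions. Head interaction is modeled as a linear mixing of per-head attention scores before the softmax, the masked stream is modeled as causal self-attention over preceding tokens, and fusion is plain concatenation. None of these choices, and none of the names (`interaction_stream`, `masked_stream`, `fuse`, `mix`), come from the paper; learned projections are omitted for brevity.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def scores(x):
    """Scaled dot-product scores of a sequence against itself."""
    scale = math.sqrt(len(x[0]))
    transposed = [list(col) for col in zip(*x)]
    return [[s / scale for s in row] for row in matmul(x, transposed)]


def masked_stream(x):
    """Assumed masked stream: position i attends only to positions 0..i."""
    weights = []
    for i, row in enumerate(scores(x)):
        weights.append(softmax(row[: i + 1]) + [0.0] * (len(row) - i - 1))
    return matmul(weights, x), weights


def interaction_stream(heads, mix):
    """Assumed interaction stream: mix per-head scores before the softmax."""
    per_head = [scores(h) for h in heads]
    n, seq = len(heads), len(heads[0])
    outputs = []
    for j in range(n):
        mixed = [[sum(mix[j][k] * per_head[k][r][c] for k in range(n))
                  for c in range(seq)] for r in range(seq)]
        outputs.append(matmul([softmax(row) for row in mixed], heads[j]))
    # Concatenate the head outputs along the feature axis.
    return [[v for out in outputs for v in out[t]] for t in range(seq)]


def fuse(a, b):
    """Assumed fusion: concatenate the per-token features of both streams."""
    return [ra + rb for ra, rb in zip(a, b)]


# Toy input: 5 tokens with 4 features each, split into 2 heads.
random.seed(0)
seq_len, d_model, n_heads = 5, 4, 2
x = [[random.uniform(-1, 1) for _ in range(d_model)] for _ in range(seq_len)]
d_head = d_model // n_heads
heads = [[row[h * d_head:(h + 1) * d_head] for row in x] for h in range(n_heads)]
mix = [[0.7, 0.3], [0.3, 0.7]]  # hypothetical head-mixing weights

inter = interaction_stream(heads, mix)
masked, causal_weights = masked_stream(x)
fused = fuse(inter, masked)
print(len(fused), len(fused[0]))  # 5 8
```

In this sketch each token ends up with eight features: four from the head-interaction stream and four from the masked stream. A token classifier for punctuation labels would sit on top of `fused`; that part is left out because the abstract does not describe it.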
Related papers
- A Plug-and-Play Method for Rare Human-Object Interactions Detection by Bridging Domain Gap [50.079224604394]
We present a novel model-agnostic framework called Context-Enhanced Feature Alignment (CEFA).
CEFA consists of a feature alignment module and a context enhancement module.
Our method can serve as a plug-and-play module to improve the detection performance of HOI models on rare categories.
arXiv Detail & Related papers (2024-07-31T08:42:48Z)
- Separate-and-Enhance: Compositional Finetuning for Text2Image Diffusion Models [58.46926334842161]
This work illuminates the fundamental reasons for such misalignment, pinpointing issues related to low attention activation scores and mask overlaps.
We propose two novel objectives, the Separate loss and the Enhance loss, that reduce object mask overlaps and maximize attention scores.
Our method diverges from conventional test-time-adaptation techniques, focusing on finetuning critical parameters, which enhances scalability and generalizability.
arXiv Detail & Related papers (2023-12-10T22:07:42Z)
- Fovea Transformer: Efficient Long-Context Modeling with Structured Fine-to-Coarse Attention [17.48544285026157]
We introduce Fovea Transformer, a long-context focused transformer.
We use representations of context tokens with a progressively coarser granularity in the tree, as their distance to the query token increases.
We evaluate our model on three long-context summarization tasks.
arXiv Detail & Related papers (2023-11-13T06:24:27Z)
- Focus the Discrepancy: Intra- and Inter-Correlation Learning for Image Anomaly Detection [13.801572236048601]
We propose a novel anomaly detection (AD) framework, FOcus-the-Discrepancy (FOD), which can simultaneously spot the patch-wise, intra- and inter-discrepancies of anomalies.
arXiv Detail & Related papers (2023-08-06T01:30:26Z)
- Co-Occurrence Matters: Learning Action Relation for Temporal Action Localization [41.44022912961265]
We propose a novel Co-Occurrence Relation Module (CORM) that explicitly models the co-occurrence relationship between actions.
Besides the visual information, it further utilizes the semantic embeddings of class labels to model the co-occurrence relationship.
Our method achieves high multi-label relationship modeling capacity.
arXiv Detail & Related papers (2023-03-15T09:07:04Z)
- FECANet: Boosting Few-Shot Semantic Segmentation with Feature-Enhanced Context-Aware Network [48.912196729711624]
Few-shot semantic segmentation is the task of learning to locate each pixel of a novel class in a query image with only a few annotated support images.
We propose a Feature-Enhanced Context-Aware Network (FECANet) to suppress the matching noise caused by inter-class local similarity.
In addition, we propose a novel correlation reconstruction module that encodes extra correspondence relations between foreground and background and multi-scale context semantic features.
arXiv Detail & Related papers (2023-01-19T16:31:13Z)
- FF2: A Feature Fusion Two-Stream Framework for Punctuation Restoration [27.14686854704104]
We propose a Feature Fusion two-stream framework (FF2) for punctuation restoration.
Specifically, one stream leverages a pre-trained language model to capture the semantic feature, while another auxiliary module captures the feature at hand.
Without additional data, the experimental results on the popular benchmark IWSLT demonstrate that FF2 achieves new SOTA performance.
arXiv Detail & Related papers (2022-11-09T06:18:17Z)
- Light Field Saliency Detection with Dual Local Graph Learning and Reciprocative Guidance [148.9832328803202]
We model the information fusion within focal stack via graph networks.
We build a novel dual graph model to guide the focal stack fusion process using all-focus patterns.
arXiv Detail & Related papers (2021-10-02T00:54:39Z)
- Online Multiple Object Tracking with Cross-Task Synergy [120.70085565030628]
We propose a novel unified model with synergy between position prediction and embedding association.
The two tasks are linked by temporal-aware target attention and distractor attention, as well as an identity-aware memory aggregation model.
arXiv Detail & Related papers (2021-04-01T10:19:40Z)
- Robust Person Re-Identification through Contextual Mutual Boosting [77.1976737965566]
We propose the Contextual Mutual Boosting Network (CMBN) to localize pedestrians.
It localizes pedestrians and recalibrates features by effectively exploiting contextual information and statistical inference.
Experiments on the benchmarks demonstrate the superiority of the architecture compared with the state of the art.
arXiv Detail & Related papers (2020-09-16T06:33:35Z)