The Devil is in the Details: Evaluating Limitations of Transformer-based
Methods for Granular Tasks
- URL: http://arxiv.org/abs/2011.01196v1
- Date: Mon, 2 Nov 2020 18:41:32 GMT
- Title: The Devil is in the Details: Evaluating Limitations of Transformer-based
Methods for Granular Tasks
- Authors: Brihi Joshi, Neil Shah, Francesco Barbieri, Leonardo Neves
- Abstract summary: Contextual embeddings derived from transformer-based neural language models have shown state-of-the-art performance for various tasks.
We focus on the problem of textual similarity from two perspectives: matching documents on a granular level, and an abstract level.
We empirically demonstrate, across two datasets from different domains, that despite high performance in abstract document matching as expected, contextual embeddings are consistently (and at times, vastly) outperformed by simple baselines like TF-IDF for more granular tasks.
- Score: 19.099852869845495
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Contextual embeddings derived from transformer-based neural language models
have shown state-of-the-art performance for various tasks such as question
answering, sentiment analysis, and textual similarity in recent years.
Extensive work shows how accurately such models can represent abstract,
semantic information present in text. In this expository work, we explore a
tangential direction and analyze such models' performance on tasks that require a
more granular level of representation. We focus on the problem of textual
similarity from two perspectives: matching documents on a granular level
(requiring embeddings to capture fine-grained attributes in the text), and an
abstract level (requiring embeddings to capture overall textual semantics). We
empirically demonstrate, across two datasets from different domains, that
despite high performance in abstract document matching as expected, contextual
embeddings are consistently (and at times, vastly) outperformed by simple
baselines like TF-IDF for more granular tasks. We then propose a simple but
effective method to incorporate TF-IDF into models that use contextual
embeddings, achieving relative improvements of up to 36% on granular tasks.
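The TF-IDF baseline mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenizer, the smoothed IDF variant, and the function names `tokenize`, `tfidf_vectors`, and `cosine` are all assumptions chosen for illustration. It only shows how exact term overlap drives similarity, which is the property that helps on granular matching.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; real pipelines use a proper tokenizer.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Assumption: smoothed IDF, log((1 + N) / (1 + df)) + 1, one common variant.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Term frequency is normalized by document length.
        vectors.append({t: (n / len(tokens)) * idf[t] for t, n in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)
```

For example, with the hypothetical documents "red cotton shirt size M", "red cotton shirt size L", and "blue denim jeans size M", the first two score higher with each other than the first and third do, because they share more exact terms.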
Related papers
- Hypergraph based Understanding for Document Semantic Entity Recognition [65.84258776834524]
We build a novel hypergraph attention document semantic entity recognition framework, HGA, which uses hypergraph attention to focus on entity boundaries and entity categories at the same time.
Our results on FUNSD, CORD, XFUNDIE show that our method can effectively improve the performance of semantic entity recognition tasks.
arXiv Detail & Related papers (2024-07-09T14:35:49Z)
- Deep Content Understanding Toward Entity and Aspect Target Sentiment Analysis on Foundation Models [0.8602553195689513]
Entity-Aspect Sentiment Triplet Extraction (EASTE) is a novel Aspect-Based Sentiment Analysis task.
Our research aims to achieve high performance on the EASTE task and investigates the impact of model size, type, and adaptation techniques on task performance.
Ultimately, we provide detailed insights and achieve state-of-the-art results in complex sentiment analysis.
arXiv Detail & Related papers (2024-07-04T16:48:14Z)
- Contextualized Diffusion Models for Text-Guided Image and Video Generation [67.69171154637172]
Conditional diffusion models have exhibited superior performance in high-fidelity text-guided visual generation and editing.
We propose a novel and general contextualized diffusion model (ContextDiff) by incorporating the cross-modal context encompassing interactions and alignments between text condition and visual sample.
We generalize our model to both DDPMs and DDIMs with theoretical derivations, and demonstrate the effectiveness of our model in evaluations with two challenging tasks: text-to-image generation, and text-to-video editing.
arXiv Detail & Related papers (2024-02-26T15:01:16Z)
- Modeling Entities as Semantic Points for Visual Information Extraction in the Wild [55.91783742370978]
We propose an alternative approach to precisely and robustly extract key information from document images.
We explicitly model entities as semantic points, i.e., center points of entities are enriched with semantic information describing the attributes and relationships of different entities.
The proposed method can achieve significantly enhanced performance on entity labeling and linking, compared with previous state-of-the-art models.
arXiv Detail & Related papers (2023-03-23T08:21:16Z)
- Revisiting text decomposition methods for NLI-based factuality scoring of summaries [9.044665059626958]
We show that fine-grained decomposition is not always a winning strategy for factuality scoring.
We also show that small changes to previously proposed entailment-based scoring methods can result in better performance.
arXiv Detail & Related papers (2022-11-30T09:54:37Z)
- TRIE++: Towards End-to-End Information Extraction from Visually Rich Documents [51.744527199305445]
This paper proposes a unified end-to-end information extraction framework from visually rich documents.
Text reading and information extraction can reinforce each other via a well-designed multi-modal context block.
The framework can be trained end-to-end, achieving global optimization.
arXiv Detail & Related papers (2022-07-14T08:52:07Z)
- Transformer Models for Text Coherence Assessment [14.132559978971377]
Coherence is an important aspect of text quality and is crucial for ensuring its readability.
Previous work has leveraged entity-based methods, syntactic patterns, discourse relations, and more recently traditional deep learning architectures for text coherence assessment.
We propose four different Transformer-based architectures for the task: vanilla Transformer, hierarchical Transformer, multi-task learning-based model, and a model with fact-based input representation.
arXiv Detail & Related papers (2021-09-05T22:27:17Z)
- Neural Deepfake Detection with Factual Structure of Text [78.30080218908849]
We propose a graph-based model for deepfake detection of text.
Our approach represents the factual structure of a given document as an entity graph.
Our model can distinguish the difference in the factual structure between machine-generated text and human-written text.
arXiv Detail & Related papers (2020-10-15T02:35:31Z)
- Document-Level Event Role Filler Extraction using Multi-Granularity Contextualized Encoding [40.13163091122463]
Event extraction is a difficult task since it requires a view of a larger context to determine which spans of text correspond to event role fillers.
We first investigate how end-to-end neural sequence models perform on document-level role filler extraction.
We show that our best system performs substantially better than prior work.
arXiv Detail & Related papers (2020-05-13T20:42:17Z)
- Learning to Select Bi-Aspect Information for Document-Scale Text Content Manipulation [50.01708049531156]
We focus on a new practical task, document-scale text content manipulation, which is the opposite of text style transfer.
In detail, the input is a set of structured records and a reference text for describing another recordset.
The output is a summary that accurately describes the partial content in the source recordset in the same writing style as the reference.
arXiv Detail & Related papers (2020-02-24T12:52:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.