Efficient Joint Learning for Clinical Named Entity Recognition and
Relation Extraction Using Fourier Networks: A Use Case in Adverse Drug Events
- URL: http://arxiv.org/abs/2302.04185v1
- Date: Wed, 8 Feb 2023 16:44:27 GMT
- Authors: Anthony Yazdani, Dimitrios Proios, Hossein Rouhizadeh, Douglas Teodoro
- Abstract summary: Current approaches for clinical information extraction are inefficient in terms of computational costs and memory consumption.
We propose an efficient end-to-end model, the Joint-NER-RE-Fourier (JNRF), to jointly learn the tasks of named entity recognition and relation extraction for documents of variable length.
Results show that the proposed approach trains 22 times faster and reduces GPU memory consumption 1.75-fold, with a reasonable performance tradeoff of 90%.
- Score: 0.11470070927586018
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Current approaches for clinical information extraction are inefficient in
terms of computational costs and memory consumption, hindering their
application to process large-scale electronic health records (EHRs). We propose
an efficient end-to-end model, the Joint-NER-RE-Fourier (JNRF), to jointly
learn the tasks of named entity recognition and relation extraction for
documents of variable length. The architecture uses positional encoding and
unitary batch sizes to process variable length documents and uses a
weight-shared Fourier network layer for low-complexity token mixing. Finally,
we reach the theoretical computational complexity lower bound for relation
extraction using a selective pooling strategy and distance-aware attention
weights with trainable polynomial distance functions. We evaluated the JNRF
architecture using the 2018 N2C2 ADE benchmark to jointly extract
medication-related entities and relations in variable-length EHR summaries.
JNRF outperforms rolling window BERT with selective pooling by 0.42%, while
being twice as fast to train. Compared to state-of-the-art BiLSTM-CRF
architectures on the N2C2 ADE benchmark, results show that the proposed
approach trains 22 times faster and reduces GPU memory consumption 1.75-fold,
with a reasonable performance tradeoff of 90%, without the use of
external tools, hand-crafted rules or post-processing. Given the significant
carbon footprint of deep learning models and the current energy crises, these
methods could support efficient and cleaner information extraction in EHRs and
other types of large-scale document databases.
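The abstract's "Fourier network layer for low-complexity token mixing" can be illustrated with a minimal sketch. It assumes an FNet-style mixing step (the real part of a 2D discrete Fourier transform over the token and hidden dimensions), which the abstract does not spell out; the function name `fourier_token_mixing` and the toy dimensions are hypothetical, not taken from the JNRF implementation.

```python
import numpy as np


def fourier_token_mixing(x: np.ndarray) -> np.ndarray:
    """Parameter-free token mixing (assumed FNet-style, not confirmed by the source).

    x has shape (seq_len, hidden_dim) and holds one document's token embeddings.
    The 2D DFT mixes information across tokens and across hidden features;
    only the real part is kept so the output stays real-valued.
    """
    return np.fft.fft2(x).real


# Documents of different lengths are processed one at a time (batch size 1),
# so no padding or truncation is needed. The lengths below are arbitrary.
rng = np.random.default_rng(0)
for seq_len in (7, 128, 1000):
    doc = rng.standard_normal((seq_len, 16))  # hypothetical hidden_dim = 16
    mixed = fourier_token_mixing(doc)
    print(seq_len, mixed.shape)
```

Because the transform has no trainable weights, applying the same step at several depths adds no parameters, and the FFT costs O(n log n) in the sequence length rather than the O(n^2) of self-attention.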
Related papers
- LeRF: Learning Resampling Function for Adaptive and Efficient Image Interpolation [64.34935748707673]
Recent deep neural networks (DNNs) have made impressive progress in performance by introducing learned data priors.
We propose a novel method of Learning Resampling (termed LeRF) which takes advantage of both the structural priors learned by DNNs and the locally continuous assumption.
LeRF assigns spatially varying resampling functions to input image pixels and learns to predict the shapes of these resampling functions with a neural network.
arXiv Detail & Related papers (2024-07-13T16:09:45Z)
- REXEL: An End-to-end Model for Document-Level Relation Extraction and Entity Linking [11.374031643273941]
REXEL is a highly efficient and accurate model for the joint task of document-level closed information extraction (DocIE).
It is on average 11 times faster than competitive existing approaches in a similar setting.
The combination of speed and accuracy makes REXEL an accurate cost-efficient system for extracting structured information at web-scale.
arXiv Detail & Related papers (2024-04-19T11:04:27Z)
- Comparison of edge computing methods in Internet of Things architectures for efficient estimation of indoor environmental parameters with Machine Learning [0.0]
Two methods are proposed to implement lightweight Machine Learning models that estimate indoor environmental quality (IEQ) parameters.
Their implementation is based on centralised and distributed parallel IoT architectures, connected via wireless.
The training and testing of ML models is accomplished with experiments focused on small temperature and illuminance datasets.
arXiv Detail & Related papers (2024-02-07T21:15:18Z)
- Heterogenous Memory Augmented Neural Networks [84.29338268789684]
We introduce a novel heterogeneous memory augmentation approach for neural networks.
By introducing learnable memory tokens with attention mechanism, we can effectively boost performance without huge computational overhead.
We show our approach on various image and graph-based tasks under both in-distribution (ID) and out-of-distribution (OOD) conditions.
arXiv Detail & Related papers (2023-10-17T01:05:28Z)
- Clinical Concept and Relation Extraction Using Prompt-based Machine Reading Comprehension [38.79665143111312]
We formulate both clinical concept extraction and relation extraction using a unified prompt-based machine reading comprehension architecture.
We compare our MRC models with existing deep learning models for concept extraction and end-to-end relation extraction.
We evaluate the transfer learning ability of the proposed MRC models in a cross-institution setting.
arXiv Detail & Related papers (2023-03-14T22:37:31Z)
- UNETR++: Delving into Efficient and Accurate 3D Medical Image Segmentation [93.88170217725805]
We propose a 3D medical image segmentation approach, named UNETR++, that offers both high-quality segmentation masks as well as efficiency in terms of parameters, compute cost, and inference speed.
The core of our design is the introduction of a novel efficient paired attention (EPA) block that efficiently learns spatial and channel-wise discriminative features.
Our evaluations on five benchmarks, Synapse, BTCV, ACDC, BRaTs, and Decathlon-Lung, reveal the effectiveness of our contributions in terms of both efficiency and accuracy.
arXiv Detail & Related papers (2022-12-08T18:59:57Z)
- Federated Learning for Energy-limited Wireless Networks: A Partial Model Aggregation Approach [79.59560136273917]
Limited communication resources (bandwidth and energy) and data heterogeneity across devices are the main bottlenecks for federated learning (FL).
We first devise a novel FL framework with partial model aggregation (PMA).
The proposed PMA-FL improves accuracy by 2.72% and 11.6% on two typical heterogeneous datasets.
arXiv Detail & Related papers (2022-04-20T19:09:52Z)
- Time-Correlated Sparsification for Efficient Over-the-Air Model Aggregation in Wireless Federated Learning [23.05003652536773]
Federated edge learning (FEEL) is a promising distributed machine learning (ML) framework to drive edge intelligence applications.
We propose time-correlated sparsification with hybrid aggregation (TCS-H) for communication-efficient FEEL.
arXiv Detail & Related papers (2022-02-17T02:48:07Z)
- Deep Neural Networks Based Weight Approximation and Computation Reuse for 2-D Image Classification [0.9507070656654631]
Deep Neural Networks (DNNs) are computationally and memory intensive.
This paper introduces a new method to improve DNN performance by fusing approximate computing with data reuse techniques.
It is suitable for IoT edge devices as it reduces the memory size requirement as well as the number of needed memory accesses.
arXiv Detail & Related papers (2021-04-28T10:16:53Z)
- Deep Cellular Recurrent Network for Efficient Analysis of Time-Series Data with Spatial Information [52.635997570873194]
This work proposes a novel deep cellular recurrent neural network (DCRNN) architecture to process complex multi-dimensional time series data with spatial information.
The proposed architecture achieves state-of-the-art performance while using substantially fewer trainable parameters than comparable methods in the literature.
arXiv Detail & Related papers (2021-01-12T20:08:18Z)
- Temporal Attention-Augmented Graph Convolutional Network for Efficient Skeleton-Based Human Action Recognition [97.14064057840089]
Graph convolutional networks (GCNs) have been very successful in modeling non-Euclidean data structures.
Most GCN-based action recognition methods use deep feed-forward networks with high computational complexity to process all skeletons in an action.
We propose a temporal attention module (TAM) for increasing the efficiency in skeleton-based action recognition.
arXiv Detail & Related papers (2020-10-23T08:01:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.