An Automated Knowledge Mining and Document Classification System with
Multi-model Transfer Learning
- URL: http://arxiv.org/abs/2106.12744v1
- Date: Thu, 24 Jun 2021 03:03:46 GMT
- Title: An Automated Knowledge Mining and Document Classification System with
Multi-model Transfer Learning
- Authors: Jia Wei Chong, Zhiyuan Chen and Mei Shin Oh
- Abstract summary: Service manual documents are crucial to engineering companies as they provide guidelines and knowledge to service engineers.
We propose an automated knowledge mining and document classification system with novel multi-model transfer learning approaches.
- Score: 1.1852751647387592
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Service manual documents are crucial to engineering companies as they provide guidelines and knowledge to service engineers. However, retrieving specific knowledge from these documents has become inconvenient and inefficient for service engineers because of the complexity of the resources. In this research, we propose an automated knowledge mining and document classification system with novel multi-model transfer learning approaches. In particular, the classification performance of the system is improved with three effective techniques: fine-tuning, pruning, and a multi-model method. The fine-tuning technique optimizes a pre-trained BERT model by adding a feed-forward neural network layer, and the pruning technique is used to retrain the BERT model with new data. The multi-model method initializes and trains multiple BERT models to overcome the randomness of data ordering during the fine-tuning process. In the first iteration of the training process, multiple BERT models are trained simultaneously. The best model is then selected for another two training iterations, and training of the remaining BERT models is terminated. The performance of the proposed system has been evaluated against two strong baseline methods, BERT and BERT-CNN. Experimental results on the widely used Corpus of Linguistic Acceptability (CoLA) dataset show that the proposed techniques outperform these baselines in terms of accuracy and MCC score.
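The selection strategy described in the abstract (train several independently initialized BERT models for one iteration, keep only the best, then fine-tune it for two more iterations) can be illustrated with a short sketch. The snippet below is a minimal illustration under stated assumptions, not the authors' code: the bert-base-uncased checkpoint, treating one "iteration" as one training epoch, and the train_one_epoch / evaluate_mcc helpers are all assumptions made for the example.

```python
# Minimal sketch of the multi-model fine-tuning strategy described in the abstract.
# Assumptions (not from the paper): bert-base-uncased, one "iteration" == one epoch,
# and user-supplied train_one_epoch / evaluate_mcc helpers.
import torch
from torch import nn
from transformers import BertForSequenceClassification

NUM_CANDIDATES = 3   # number of independently initialised candidate models
NUM_LABELS = 2       # CoLA is a binary (acceptable / unacceptable) task

def build_candidate(seed: int) -> nn.Module:
    # Different seeds give each candidate a different classification-head
    # initialisation and a different shuffling of the training data.
    torch.manual_seed(seed)
    # BertForSequenceClassification places a feed-forward classification head
    # on top of the pooled [CLS] representation of the pre-trained BERT encoder.
    return BertForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=NUM_LABELS
    )

def multi_model_finetune(train_loader, dev_loader, train_one_epoch, evaluate_mcc):
    # Iteration 1: train every candidate for one epoch and score it on dev data.
    candidates = [build_candidate(seed) for seed in range(NUM_CANDIDATES)]
    scores = []
    for model in candidates:
        train_one_epoch(model, train_loader)            # hypothetical training helper
        scores.append(evaluate_mcc(model, dev_loader))  # Matthews correlation on dev set

    # Keep only the best-scoring candidate; the other models are discarded.
    best = candidates[scores.index(max(scores))]

    # Iterations 2-3: continue fine-tuning the selected model alone.
    for _ in range(2):
        train_one_epoch(best, train_loader)
    return best
```

Using a distinct seed per candidate gives each model a different head initialization and data ordering, which is exactly the source of variance the multi-model method is meant to average out before committing to a single model.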
Related papers
- BERT-Based Approach for Automating Course Articulation Matrix Construction with Explainable AI [1.4214002697449326]
Course Outcome (CO) and Program Outcome (PO)/Program-Specific Outcome (PSO) alignment is a crucial task for ensuring curriculum coherence and assessing educational effectiveness.
This work demonstrates the potential of utilizing transfer learning with BERT-based models for the automated generation of the Course Articulation Matrix (CAM).
Our system achieves accuracy, precision, recall, and F1-score values of 98.66%, 98.67%, 98.66%, and 98.66%, respectively.
arXiv Detail & Related papers (2024-11-21T16:02:39Z) - PILOT: A Pre-Trained Model-Based Continual Learning Toolbox [71.63186089279218]
This paper introduces a pre-trained model-based continual learning toolbox known as PILOT.
On the one hand, PILOT implements some state-of-the-art class-incremental learning algorithms based on pre-trained models, such as L2P, DualPrompt, and CODA-Prompt.
On the other hand, PILOT fits typical class-incremental learning algorithms within the context of pre-trained models to evaluate their effectiveness.
arXiv Detail & Related papers (2023-09-13T17:55:11Z) - Adapted Multimodal BERT with Layer-wise Fusion for Sentiment Analysis [84.12658971655253]
We propose Adapted Multimodal BERT, a BERT-based architecture for multimodal tasks.
The adapter adjusts the pretrained language model for the task at hand, while the fusion layers perform task-specific, layer-wise fusion of audio-visual information with textual BERT representations.
In our ablations, we see that this approach leads to efficient models that can outperform their fine-tuned counterparts and are robust to input noise.
arXiv Detail & Related papers (2022-12-01T17:31:42Z) - MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided
Adaptation [68.30497162547768]
We propose MoEBERT, which uses a Mixture-of-Experts structure to increase model capacity and inference speed.
We validate the efficiency and effectiveness of MoEBERT on natural language understanding and question answering tasks.
arXiv Detail & Related papers (2022-04-15T23:19:37Z) - Training ELECTRA Augmented with Multi-word Selection [53.77046731238381]
We present a new text encoder pre-training method that improves ELECTRA based on multi-task learning.
Specifically, we train the discriminator to simultaneously detect replaced tokens and select original tokens from candidate sets.
arXiv Detail & Related papers (2021-05-31T23:19:00Z) - Learning to Augment for Data-Scarce Domain BERT Knowledge Distillation [55.34995029082051]
We propose a method to learn to augment for data-scarce domain BERT knowledge distillation.
We show that the proposed method significantly outperforms state-of-the-art baselines on four different tasks.
arXiv Detail & Related papers (2021-01-20T13:07:39Z) - Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z) - A Practical Incremental Method to Train Deep CTR Models [37.54660958085938]
We introduce a practical incremental method to train deep CTR models, which consists of three decoupled modules.
Our method can achieve comparable performance to the conventional batch mode training with much better training efficiency.
arXiv Detail & Related papers (2020-09-04T12:35:42Z) - Reinforced Curriculum Learning on Pre-trained Neural Machine Translation
Models [20.976165305749777]
We learn a curriculum for improving a pre-trained NMT model by re-selecting influential data samples from the original training set.
We propose a data selection framework based on Deterministic Actor-Critic, in which a critic network predicts the expected change of model performance.
arXiv Detail & Related papers (2020-04-13T03:40:44Z) - TwinBERT: Distilling Knowledge to Twin-Structured BERT Models for
Efficient Retrieval [11.923682816611716]
We present the TwinBERT model for effective and efficient retrieval.
It has twin-structured BERT-like encoders to represent the query and the document respectively.
It allows document embeddings to be pre-computed offline and cached in memory; a minimal sketch of this dual-encoder pattern appears after this list.
arXiv Detail & Related papers (2020-02-14T22:44:36Z)
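The TwinBERT entry above describes a dual-encoder retrieval pattern in which document embeddings are computed once offline and cached, so only the query is encoded at request time. The following sketch illustrates that general pattern only; the bert-base-uncased checkpoint, mean pooling, dot-product scoring, and the in-memory cache are assumptions, not the paper's actual implementation.

```python
# Minimal sketch of the twin-encoder retrieval pattern described for TwinBERT.
# Assumptions (not from the paper): bert-base-uncased for both towers, mean
# pooling, dot-product relevance scores, and a plain in-memory cache.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
query_encoder = AutoModel.from_pretrained("bert-base-uncased")
doc_encoder = AutoModel.from_pretrained("bert-base-uncased")

@torch.no_grad()
def embed(encoder, texts):
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state      # (batch, seq_len, dim)
    mask = batch["attention_mask"].unsqueeze(-1)     # ignore padding tokens
    return (hidden * mask).sum(1) / mask.sum(1)      # mean-pooled embeddings

# Offline phase: pre-compute document embeddings once and keep them cached.
documents = ["Replace the filter every 500 hours.", "Torque the bolts to 40 Nm."]
doc_cache = embed(doc_encoder, documents)

# Online phase: only the query is encoded at request time.
query_vec = embed(query_encoder, ["how often to replace the filter"])
scores = query_vec @ doc_cache.T                     # dot-product relevance
best_doc = documents[int(scores.argmax())]
```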
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.