Transfer Learning for Information Extraction with Limited Data
- URL: http://arxiv.org/abs/2003.03064v2
- Date: Mon, 8 Jun 2020 13:56:57 GMT
- Title: Transfer Learning for Information Extraction with Limited Data
- Authors: Minh-Tien Nguyen, Viet-Anh Phan, Le Thai Linh, Nguyen Hong Son, Le
Tien Dung, Miku Hirano and Hajime Hotta
- Abstract summary: This paper presents a practical approach to fine-grained information extraction.
We first exploit BERT to deal with the limited training data available in real scenarios.
We then stack BERT with Convolutional Neural Networks to learn hidden representations for classification.
- Score: 2.201264358342234
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a practical approach to fine-grained information
extraction. Drawing on the authors' extensive experience in applying
information extraction to business process automation, two fundamental
technical challenges stand out: (i) the availability of labeled data is
usually limited and (ii) highly detailed classification is required. The main
idea of our proposal is to leverage transfer learning, i.e., to reuse a
pre-trained deep neural network in combination with common statistical
classifiers to determine the class of each extracted term. To do that, we
first exploit BERT to deal with the limited training data available in real
scenarios, then stack BERT with Convolutional Neural Networks to learn hidden
representations for classification. To validate our approach, we applied our
model to an actual case of document processing: competitive bids for
government projects in Japan. We used 100 documents for training and testing
and confirmed that the model can extract fine-grained named entities with the
level of precision required by the targeted business process, such as the
name of the department that receives applications.
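The paper does not ship code, but the described architecture maps naturally onto a short sketch: a pre-trained BERT encoder whose token-level outputs feed 1-D convolutions, with max-pooled feature maps classified by a linear layer. The checkpoint name, kernel sizes, filter counts, and class count below are illustrative assumptions, not the authors' reported configuration.

```python
# Minimal sketch of the BERT + CNN stack described in the abstract.
# Checkpoint, kernel sizes, filter counts, and num_classes are assumptions.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class BertCnnClassifier(nn.Module):
    def __init__(self, num_classes, bert_name="bert-base-multilingual-cased"):
        super().__init__()
        self.bert = AutoModel.from_pretrained(bert_name)  # transfer learning: reuse a pre-trained encoder
        hidden = self.bert.config.hidden_size
        # 1-D convolutions over the token axis learn n-gram features
        # from BERT's contextual embeddings.
        self.convs = nn.ModuleList(
            [nn.Conv1d(hidden, 128, kernel_size=k) for k in (2, 3, 4)]
        )
        self.classifier = nn.Linear(128 * 3, num_classes)

    def forward(self, input_ids, attention_mask):
        x = self.bert(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        x = x.transpose(1, 2)  # (batch, hidden, seq_len) for Conv1d
        # Max-pool each feature map over time, then concatenate.
        feats = [conv(x).relu().max(dim=2).values for conv in self.convs]
        return self.classifier(torch.cat(feats, dim=1))

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = BertCnnClassifier(num_classes=10)
batch = tokenizer(["応札者の所属部署名を抽出する"], return_tensors="pt", padding=True)
logits = model(batch["input_ids"], batch["attention_mask"])  # one score per entity class
```

A multilingual checkpoint is used here only because the target documents are Japanese; the paper does not state which pre-trained model was actually used.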
Related papers
- Leveraging Expert Models for Training Deep Neural Networks in Scarce Data Domains: Application to Offline Handwritten Signature Verification [15.88604823470663]
The presented scheme is applied to offline handwritten signature verification (OffSV).
The proposed Student-Teacher (S-T) configuration utilizes feature-based knowledge distillation (FKD).
Remarkably, models trained with this technique exhibit comparable, if not superior, performance to the teacher model across three popular signature datasets.
arXiv Detail & Related papers (2023-08-02T13:28:12Z)
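As a rough illustration of the feature-based knowledge distillation used in the S-T configuration above, the student (plus a small adapter) can be trained to match the teacher's intermediate feature maps. The tiny networks, the 1x1 adapter, and the plain MSE objective are assumptions for illustration, not the paper's OffSV architecture.

```python
# Feature-based knowledge distillation (FKD) sketch: the student mimics
# the frozen teacher's intermediate features. All shapes are stand-ins.
import torch
import torch.nn as nn

teacher = nn.Sequential(nn.Conv2d(1, 64, 3, padding=1), nn.ReLU())  # frozen expert
student = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU())  # compact model
adapter = nn.Conv2d(16, 64, 1)  # projects student features to teacher width
teacher.requires_grad_(False)

opt = torch.optim.Adam(list(student.parameters()) + list(adapter.parameters()), lr=1e-3)
signatures = torch.randn(8, 1, 64, 64)  # stand-in for signature images

with torch.no_grad():
    t_feat = teacher(signatures)
s_feat = adapter(student(signatures))
loss = nn.functional.mse_loss(s_feat, t_feat)  # feature-matching objective
opt.zero_grad()
loss.backward()
opt.step()
```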
- Process-BERT: A Framework for Representation Learning on Educational Process Data [68.8204255655161]
We propose a framework for learning representations of educational process data.
Our framework consists of a pre-training step that uses BERT-type objectives to learn representations from sequential process data.
We apply our framework to the 2019 Nation's Report Card data mining competition dataset.
arXiv Detail & Related papers (2022-04-28T16:07:28Z)
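The BERT-type pretraining objective mentioned above can be pictured as masked prediction over event-ID sequences. The vocabulary size, model dimensions, and 15% masking rate below are assumptions (positional embeddings are omitted for brevity); the paper defines the actual objectives.

```python
# Toy masked-prediction objective on sequential process data.
import torch
import torch.nn as nn

vocab_size, d_model, mask_id = 100, 64, 0  # ID 0 reserved as the [MASK] token
embed = nn.Embedding(vocab_size, d_model)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True), num_layers=2
)
head = nn.Linear(d_model, vocab_size)

actions = torch.randint(1, vocab_size, (8, 20))  # batch of event sequences
mask = torch.rand(actions.shape) < 0.15          # hide ~15% of positions
inputs = actions.masked_fill(mask, mask_id)

logits = head(encoder(embed(inputs)))            # predict every position
loss = nn.functional.cross_entropy(logits[mask], actions[mask])  # masked slots only
loss.backward()
```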
- Towards General and Efficient Active Learning [20.888364610175987]
Active learning aims to select the most informative samples to exploit limited annotation budgets.
We propose a novel general and efficient active learning (GEAL) method in this paper.
Our method can conduct data selection processes on different datasets with a single-pass inference of the same model.
arXiv Detail & Related papers (2021-12-15T08:35:28Z)
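The single-pass idea above can be illustrated generically: score the whole unlabeled pool with one forward pass and keep the highest-entropy samples. Predictive entropy is a stand-in criterion here; GEAL's actual selection procedure is defined in the paper.

```python
# Generic single-pass active-learning selection by predictive entropy.
import torch

def select_by_entropy(model, unlabeled, budget):
    model.eval()
    with torch.no_grad():
        probs = torch.softmax(model(unlabeled), dim=1)  # one inference pass
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)
    return entropy.topk(budget).indices  # most uncertain samples first

model = torch.nn.Linear(16, 4)   # stand-in classifier
pool = torch.randn(1000, 16)     # unlabeled pool features
picked = select_by_entropy(model, pool, budget=32)  # indices to annotate
```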
- Unified Instance and Knowledge Alignment Pretraining for Aspect-based Sentiment Analysis [96.53859361560505]
Aspect-based Sentiment Analysis (ABSA) aims to determine the sentiment polarity towards an aspect.
There always exists severe domain shift between the pretraining and downstream ABSA datasets.
We introduce a unified alignment pretraining framework into the vanilla pretrain-finetune pipeline.
arXiv Detail & Related papers (2021-10-26T04:03:45Z)
- Knowledge-driven Active Learning [70.37119719069499]
Active learning strategies aim at minimizing the amount of labelled data required to train a Deep Learning model.
Most active learning strategies are based on uncertain sample selection, and are often restricted to samples lying close to the decision boundary.
Here we propose to take common domain knowledge into consideration, enabling non-expert users to train a model with fewer samples.
arXiv Detail & Related papers (2021-10-15T06:11:53Z)
- ProcK: Machine Learning for Knowledge-Intensive Processes [30.371382331613532]
ProcK (Process & Knowledge) is a novel pipeline to build business process prediction models.
The pipeline includes components that extract inter-linked event logs and knowledge bases from relational databases.
We demonstrate the power of ProcK by training it for prediction tasks on the OULAD e-learning dataset.
arXiv Detail & Related papers (2021-09-10T13:51:59Z)
- Learning Purified Feature Representations from Task-irrelevant Labels [18.967445416679624]
We propose a novel learning framework called PurifiedLearning to exploit task-irrelevant features extracted from task-irrelevant labels.
Our work is built on solid theoretical analysis and extensive experiments, which demonstrate the effectiveness of PurifiedLearning.
arXiv Detail & Related papers (2021-02-22T12:50:49Z)
- Data-free Knowledge Distillation for Segmentation using Data-Enriching GAN [0.0]
We propose a new training framework for performing knowledge distillation in a data-free setting.
We get an improvement of 6.93% in Mean IoU over previous approaches.
arXiv Detail & Related papers (2020-11-02T08:16:42Z)
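A minimal sketch of the data-free setting above: a generator synthesizes inputs from noise and the student is trained to match the frozen teacher on them, so no real data is needed. The stand-in networks and plain KL objective are assumptions; the paper's Data-Enriching GAN and segmentation models are more involved, and the generator itself would also be trained rather than fixed as here.

```python
# Data-free distillation sketch: distill on generator-synthesized inputs.
import torch
import torch.nn as nn

generator = nn.Sequential(nn.Linear(64, 3 * 32 * 32), nn.Tanh())
teacher = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # frozen
student = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
teacher.requires_grad_(False)

opt = torch.optim.Adam(student.parameters(), lr=1e-3)
noise = torch.randn(16, 64)
fake = generator(noise).view(16, 3, 32, 32)  # synthetic "images"
loss = nn.functional.kl_div(                 # match the teacher's outputs
    torch.log_softmax(student(fake), dim=1),
    torch.softmax(teacher(fake), dim=1),
    reduction="batchmean",
)
opt.zero_grad()
loss.backward()
opt.step()
```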
- Predicting Themes within Complex Unstructured Texts: A Case Study on Safeguarding Reports [66.39150945184683]
We focus on the problem of automatically identifying the main themes in a safeguarding report using supervised classification approaches.
Our results show the potential of deep learning models to simulate subject-expert behaviour even for complex tasks with limited labelled data.
arXiv Detail & Related papers (2020-10-27T19:48:23Z)
- Adversarial Knowledge Transfer from Unlabeled Data [62.97253639100014]
We present a novel Adversarial Knowledge Transfer framework for transferring knowledge from internet-scale unlabeled data to improve the performance of a classifier.
An important novel aspect of our method is that the unlabeled source data can be of different classes from those of the labeled target data, and there is no need to define a separate pretext task.
arXiv Detail & Related papers (2020-08-13T08:04:27Z)
- Learning to Count in the Crowd from Limited Labeled Data [109.2954525909007]
We focus on reducing the annotation effort by learning to count in the crowd from a limited number of labeled samples.
Specifically, we propose a Gaussian Process-based iterative learning mechanism that involves estimation of pseudo-ground truth for the unlabeled data.
arXiv Detail & Related papers (2020-07-07T04:17:01Z)
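The Gaussian Process-based pseudo-ground-truth idea above can be sketched with an off-the-shelf GP regressor: fit on the few labeled samples, take the posterior mean as pseudo-labels for the unlabeled pool, and use the posterior variance to keep only confident ones. The features, counts, and variance threshold below are illustrative assumptions, not the paper's setup.

```python
# GP pseudo-labeling sketch: posterior mean as pseudo-ground truth,
# posterior std as a confidence filter.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
labeled_x = rng.normal(size=(20, 8))      # features of the few labeled images
labeled_y = rng.uniform(0, 100, size=20)  # their ground-truth crowd counts
unlabeled_x = rng.normal(size=(200, 8))   # features of the unlabeled pool

gp = GaussianProcessRegressor(kernel=RBF()).fit(labeled_x, labeled_y)
pseudo_counts, std = gp.predict(unlabeled_x, return_std=True)
keep = std < std.mean()                   # retain low-variance pseudo-labels
confident_counts = pseudo_counts[keep]
```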
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.