Research on Dual Channel News Headline Classification Based on ERNIE Pre-training Model
- URL: http://arxiv.org/abs/2202.06600v1
- Date: Mon, 14 Feb 2022 10:44:12 GMT
- Title: Research on Dual Channel News Headline Classification Based on ERNIE Pre-training Model
- Authors: Junjie Li and Hui Cao
- Abstract summary: The proposed model improves the accuracy, precision and F1-score of news headline classification compared with traditional neural network models.
It performs well in multi-class classification of news headline text at large data volumes.
- Score: 13.222137788045416
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The classification of news headlines is an important direction in the field
of NLP, and headline data is compact, distinctive and varied in form. To address
the problem that traditional neural network models cannot adequately capture the
underlying feature information of the data and cannot jointly extract key global
features and deep local features, a dual-channel network model, DC-EBAD, based on
the ERNIE pre-training model is proposed. ERNIE is used to extract the lexical,
semantic and contextual feature information at the bottom layer of the text and
to generate dynamic word vector representations fused with context. The BiLSTM-AT
network channel then performs a secondary extraction of the global features of
the data, with the attention mechanism assigning higher weights to key parts,
while the DPCNN channel is used to overcome the long-distance text dependence
problem and obtain deep local features. The local and global feature vectors are
concatenated and passed to the fully connected layer, and the final
classification result is output through Softmax. The experimental results show
that the proposed model improves the accuracy, precision and F1-score of news
headline classification compared with traditional neural network models and
single-channel models under the same conditions, indicating that it performs
well in multi-class classification of news headline text at large data volumes.
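As a rough illustration of the dual-channel design described in the abstract, the PyTorch sketch below wires a BiLSTM-with-attention channel and a simplified DPCNN-style channel over pre-computed ERNIE token embeddings, then splices the two feature vectors for classification. This is a minimal sketch under stated assumptions: the dimensions, layer counts and the name DCEBADSketch are illustrative, not the paper's exact configuration, and the DPCNN channel is reduced to a single residual conv block with stride-2 pooling.

```python
# Minimal sketch of the dual-channel (global + local) idea, assuming
# inputs are ERNIE token embeddings of shape (batch, seq_len, embed_dim).
import torch
import torch.nn as nn
import torch.nn.functional as F

class DCEBADSketch(nn.Module):
    def __init__(self, embed_dim=768, hidden=256, n_filters=250, n_classes=10):
        super().__init__()
        # Channel 1 (global): BiLSTM with additive attention over timesteps.
        self.bilstm = nn.LSTM(embed_dim, hidden, batch_first=True,
                              bidirectional=True)
        self.att = nn.Linear(2 * hidden, 1)
        # Channel 2 (local): DPCNN-style convs with residual connections
        # and repeated downsampling to capture deep local features.
        self.region = nn.Conv1d(embed_dim, n_filters, kernel_size=3, padding=1)
        self.conv = nn.Conv1d(n_filters, n_filters, kernel_size=3, padding=1)
        self.fc = nn.Linear(2 * hidden + n_filters, n_classes)

    def forward(self, x):                    # x: (batch, seq_len, embed_dim)
        # Global channel: attention-weighted sum of BiLSTM hidden states,
        # so key positions receive higher weight.
        h, _ = self.bilstm(x)                # (batch, seq_len, 2*hidden)
        w = torch.softmax(self.att(h), dim=1)
        g = (w * h).sum(dim=1)               # (batch, 2*hidden)
        # Local channel: residual conv blocks with stride-2 pooling until
        # the sequence collapses to a single deep local feature vector.
        c = self.region(x.transpose(1, 2))   # (batch, n_filters, seq_len)
        while c.size(2) > 1:
            c = c + self.conv(F.relu(c))     # residual conv block
            c = F.max_pool1d(c, kernel_size=2, ceil_mode=True)
        l = c.squeeze(2)                     # (batch, n_filters)
        # Splice global and local features, classify with Softmax.
        return torch.log_softmax(self.fc(torch.cat([g, l], dim=1)), dim=1)

if __name__ == "__main__":
    out = DCEBADSketch()(torch.randn(4, 32, 768))
    print(out.shape)  # torch.Size([4, 10])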
Related papers
- CMTNet: Convolutional Meets Transformer Network for Hyperspectral Images Classification [3.821081081400729]
Current convolutional neural networks (CNNs) focus on local features in hyperspectral data.
The Transformer framework excels at extracting global features from hyperspectral imagery.
This research introduces the Convolutional Meets Transformer Network (CMTNet).
arXiv Detail & Related papers (2024-06-20T07:56:51Z)
- DeepDC: Deep Distance Correlation as a Perceptual Image Quality Evaluator [53.57431705309919]
ImageNet pre-trained deep neural networks (DNNs) show notable transferability for building effective image quality assessment (IQA) models.
We develop a novel full-reference IQA (FR-IQA) model based exclusively on pre-trained DNN features.
We conduct comprehensive experiments to demonstrate the superiority of the proposed quality model on five standard IQA datasets.
arXiv Detail & Related papers (2022-11-09T14:57:27Z)
- Towards Open-World Feature Extrapolation: An Inductive Graph Learning Approach [80.8446673089281]
We propose a new learning paradigm with graph representation and learning.
Our framework contains two modules: 1) a backbone network (e.g., feedforward neural nets) as a lower model takes features as input and outputs predicted labels; 2) a graph neural network as an upper model learns to extrapolate embeddings for new features via message passing over a feature-data graph built from observed data.
arXiv Detail & Related papers (2021-10-09T09:02:45Z)
- Bidirectional LSTM-CRF Attention-based Model for Chinese Word Segmentation [2.3991565023534087]
We propose a Bidirectional LSTM-CRF Attention-based Model for Chinese word segmentation.
Our model performs better than baseline methods built on other neural networks.
arXiv Detail & Related papers (2021-05-20T11:46:53Z)
- Rank-R FNN: A Tensor-Based Learning Model for High-Order Data Classification [69.26747803963907]
Rank-R Feedforward Neural Network (FNN) is a tensor-based nonlinear learning model that imposes Canonical/Polyadic decomposition on its parameters.
It handles inputs as multilinear arrays, bypassing the need for vectorization, and can thus fully exploit the structural information along every data dimension.
We establish the universal approximation and learnability properties of Rank-R FNN, and we validate its performance on real-world hyperspectral datasets.
arXiv Detail & Related papers (2021-04-11T16:37:32Z)
- Train your classifier first: Cascade Neural Networks Training from upper layers to lower layers [54.47911829539919]
We develop a novel top-down training method which can be viewed as an algorithm for searching for high-quality classifiers.
We tested this method on automatic speech recognition (ASR) tasks and language modelling tasks.
The proposed method consistently improves recurrent neural network ASR models on Wall Street Journal, self-attention ASR models on Switchboard, and AWD-LSTM language models on WikiText-2.
arXiv Detail & Related papers (2021-02-09T08:19:49Z)
- InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective [84.78604733927887]
Large-scale language models such as BERT have achieved state-of-the-art performance across a wide range of NLP tasks.
Recent studies show that such BERT-based models are vulnerable to textual adversarial attacks.
We propose InfoBERT, a novel learning framework for robust fine-tuning of pre-trained language models.
arXiv Detail & Related papers (2020-10-05T20:49:26Z)
- Pre-Trained Models for Heterogeneous Information Networks [57.78194356302626]
We propose a self-supervised pre-training and fine-tuning framework, PF-HIN, to capture the features of a heterogeneous information network.
PF-HIN consistently and significantly outperforms state-of-the-art alternatives on each of the benchmark tasks, on four datasets.
arXiv Detail & Related papers (2020-07-07T03:36:28Z)
- Feature Interaction based Neural Network for Click-Through Rate Prediction [5.095988654970358]
We propose a Feature Interaction based Neural Network (FINN) which is able to model feature interaction via a 3-dimensional relation tensor.
We show that our deep FINN model outperforms other state-of-the-art deep models such as PNN and DeepFM.
It also indicates that our models can effectively learn the feature interactions and achieve better performance on real-world datasets.
arXiv Detail & Related papers (2020-06-07T03:53:24Z)
- Abstractive Text Summarization based on Language Model Conditioning and Locality Modeling [4.525267347429154]
We train a Transformer-based neural model conditioned on the BERT language model.
In addition, we propose a new method of BERT-windowing, which allows chunk-wise processing of texts longer than the BERT window size.
The results of our models are compared to a baseline and the state-of-the-art models on the CNN/Daily Mail dataset.
arXiv Detail & Related papers (2020-03-29T14:00:17Z)
- Cross-scale Attention Model for Acoustic Event Classification [45.15898265162008]
We propose a cross-scale attention (CSA) model, which explicitly integrates features from different scales to form the final representation.
We show that the proposed CSA model can effectively improve the performance of current state-of-the-art deep learning algorithms.
arXiv Detail & Related papers (2019-12-27T07:28:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.