Related papers: BERT-based Chinese Text Classification for Emergency Domain with a Novel Loss Function

URL: http://arxiv.org/abs/2104.04197v1
Date: Fri, 9 Apr 2021 05:25:00 GMT
Title: BERT-based Chinese Text Classification for Emergency Domain with a Novel Loss Function
Authors: Zhongju Wang, Long Wang, Chao Huang, Xiong Luo
Abstract summary: This paper proposes an automatic Chinese text categorization method for solving the emergency event report classification problem. To overcome the data imbalance problem in the distribution of emergency event categories, a novel loss function is proposed to improve the performance of the BERT-based model. The proposed method has achieved the best performance in terms of accuracy, weighted-precision, weighted-recall, and weighted-F1 values.
Score: 9.028459232146474
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This paper proposes an automatic Chinese text categorization method for solving the emergency event report classification problem. Since bidirectional encoder representations from transformers (BERT) has achieved great success in the natural language processing domain, it is employed to derive emergency text features in this study. To overcome the data imbalance problem in the distribution of emergency event categories, a novel loss function is proposed to improve the performance of the BERT-based model. Meanwhile, to avoid the impact of extreme learning rates, the AdaBound optimization algorithm, which achieves a gradual, smooth transition from Adam to SGD, is employed to learn the parameters of the model. To verify the feasibility and effectiveness of the proposed method, a Chinese emergency text dataset collected from the Internet is employed. Compared with benchmarking methods, the proposed method has achieved the best performance in terms of accuracy, weighted-precision, weighted-recall, and weighted-F1 values. Therefore, it is promising to employ the proposed method for real applications in smart emergency management systems.
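
The abstract reports accuracy together with weighted precision, recall, and F1. As a minimal sketch of how such support-weighted metrics are commonly computed (each class's score is weighted by its share of the true labels), the following may help. The function name `weighted_metrics` and the example labels are hypothetical, and this is not taken from the paper's own evaluation code.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1.

    Illustrative helper (hypothetical name); each class contributes in
    proportion to how often it appears in y_true.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    n = len(y_true)
    support = Counter(y_true)      # number of true samples per class
    predicted = Counter(y_pred)    # number of predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    precision = recall = f1 = 0.0
    for label, count in support.items():
        # Per-class scores; a class that is never predicted gets precision 0.
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / count
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = count / n
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    return {
        "accuracy": sum(correct.values()) / n,
        "weighted_precision": precision,
        "weighted_recall": recall,
        "weighted_f1": f1,
    }


# Hypothetical imbalanced example: three "fire" reports, one "flood" report.
scores = weighted_metrics(
    ["fire", "fire", "fire", "flood"],
    ["fire", "fire", "flood", "flood"],
)
print(scores)
```

With this weighting, weighted recall always equals accuracy, so on imbalanced data the weighted precision and weighted F1 carry the additional information.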

Related papers

Optimizing Active Learning in Vision-Language Models via Parameter-Efficient Uncertainty Calibration [6.7181844004432385]
We introduce a novel parameter-efficient learning methodology that incorporates uncertainty calibration loss within the Active Learning framework. We demonstrate that our solution can match and exceed the performance of complex feature-based sampling techniques.
arXiv Detail & Related papers (2025-07-29T06:08:28Z)
Taming Polysemanticity in LLMs: Provable Feature Recovery via Sparse Autoencoders [50.52694757593443]
Existing SAE training algorithms often lack rigorous mathematical guarantees and suffer from practical limitations. We first propose a novel statistical framework for the feature recovery problem, which includes a new notion of feature identifiability. We introduce a new SAE training algorithm based on "bias adaptation", a technique that adaptively adjusts neural network bias parameters to ensure appropriate activation sparsity.
arXiv Detail & Related papers (2025-06-16T20:58:05Z)
Preference Optimization for Combinatorial Optimization Problems [54.87466279363487]
Reinforcement Learning (RL) has emerged as a powerful tool for neural optimization, enabling models to learn to solve complex problems without requiring expert knowledge. Despite significant progress, existing RL approaches face challenges such as diminishing reward signals and inefficient exploration in vast action spaces. We propose Preference Optimization, a novel method that transforms quantitative reward signals into qualitative preference signals via statistical comparison modeling.
arXiv Detail & Related papers (2025-05-13T16:47:00Z)
Representation-based Reward Modeling for Efficient Safety Alignment of Large Language Model [84.00480999255628]
Reinforcement Learning algorithms for safety alignment of Large Language Models (LLMs) encounter the challenge of distribution shift. Current approaches typically address this issue through online sampling from the target policy. We propose a new framework that leverages the model's intrinsic safety judgment capability to extract reward signals.
arXiv Detail & Related papers (2025-03-13T06:40:34Z)
Learning Task Representations from In-Context Learning [73.72066284711462]
Large language models (LLMs) have demonstrated remarkable proficiency in in-context learning. We introduce an automated formulation for encoding task information in ICL prompts as a function of attention heads. We show that our method's effectiveness stems from aligning the distribution of the last hidden state with that of an optimally performing in-context-learned model.
arXiv Detail & Related papers (2025-02-08T00:16:44Z)
Tackling Distribution Shifts in Task-Oriented Communication with Information Bottleneck [28.661084093544684]
We propose a novel approach based on the information bottleneck (IB) principle and invariant risk minimization (IRM) framework. The proposed method aims to extract compact and informative features that possess high capability for effective domain-shift generalization. We show that the proposed scheme outperforms state-of-the-art approaches and achieves a better rate-distortion tradeoff.
arXiv Detail & Related papers (2024-05-15T17:07:55Z)
Selective Forgetting: Advancing Machine Unlearning Techniques and Evaluation in Language Models [24.784439330058095]
This study investigates concerns related to neural models inadvertently retaining personal or sensitive data. A novel approach is introduced to achieve precise and selective forgetting within language models. Two innovative evaluation metrics are proposed: Sensitive Information Extraction Likelihood (S-EL) and Sensitive Information Memory Accuracy (S-MA).
arXiv Detail & Related papers (2024-02-08T16:50:01Z)
DPBERT: Efficient Inference for BERT based on Dynamic Planning [11.680840266488884]
Existing input-adaptive inference methods fail to take full advantage of the structure of BERT. We propose Dynamic Planning in BERT, a novel fine-tuning strategy that can accelerate the inference process of BERT. Our method reduces latency to 75% while maintaining 98% accuracy, yielding a better accuracy-speed trade-off compared to state-of-the-art input-adaptive methods.
arXiv Detail & Related papers (2023-07-26T07:18:50Z)
Boosting Event Extraction with Denoised Structure-to-Text Augmentation [52.21703002404442]
Event extraction aims to recognize pre-defined event triggers and arguments from texts. Recent data augmentation methods often neglect the problem of grammatical incorrectness. We propose a denoised structure-to-text augmentation framework for event extraction (DAEE).
arXiv Detail & Related papers (2023-05-16T16:52:07Z)
A Novel Plagiarism Detection Approach Combining BERT-based Word Embedding, Attention-based LSTMs and an Improved Differential Evolution Algorithm [11.142354615369273]
We propose a novel method for detecting plagiarism based on attention mechanism-based long short-term memory (LSTM) and bidirectional encoder representations from transformers (BERT) word embedding. BERT could be included in a downstream task and fine-tuned as a task-specific structure, while the trained BERT model is capable of detecting various linguistic characteristics.
arXiv Detail & Related papers (2023-05-03T18:26:47Z)
Cluster-level pseudo-labelling for source-free cross-domain facial expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER). Our method exploits self-supervised pretraining to learn good feature representations from the target data. We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z)
Improving Pre-trained Language Model Fine-tuning with Noise Stability Regularization [94.4409074435894]
We propose a novel and effective fine-tuning framework, named Layerwise Noise Stability Regularization (LNSR). Specifically, we propose to inject standard Gaussian noise and regularize hidden representations of the fine-tuned model. We demonstrate the advantages of the proposed method over other state-of-the-art algorithms including L2-SP, Mixout and SMART.
arXiv Detail & Related papers (2022-06-12T04:42:49Z)
Reinforcement Learning in the Wild: Scalable RL Dispatching Algorithm Deployed in Ridehailing Marketplace [12.298997392937876]
This study proposes a real-time dispatching algorithm based on reinforcement learning. It is deployed online in multiple cities under DiDi's operation for A/B testing and is launched in one of the major international markets. The deployed algorithm shows over 1.3% improvement in total driver income from A/B testing.
arXiv Detail & Related papers (2022-02-10T16:07:17Z)
Efficient falsification approach for autonomous vehicle validation using a parameter optimisation technique based on reinforcement learning [6.198523595657983]
The widescale deployment of Autonomous Vehicles (AV) appears to be imminent despite many safety challenges that are yet to be resolved. The uncertainties in the behaviour of the traffic participants and the dynamic world cause reactions in advanced autonomous systems. This paper presents an efficient falsification method to evaluate the System Under Test.
arXiv Detail & Related papers (2020-11-16T02:56:13Z)
On Learning Text Style Transfer with Direct Rewards [101.97136885111037]
Lack of parallel corpora makes it impossible to directly train supervised models for the text style transfer task. We leverage semantic similarity metrics originally used for fine-tuning neural machine translation models. Our model provides significant gains in both automatic and human evaluation over strong baselines.
arXiv Detail & Related papers (2020-10-24T04:30:02Z)
Logistic Q-Learning [87.00813469969167]
We propose a new reinforcement learning algorithm derived from a regularized linear-programming formulation of optimal control in MDPs. The main feature of our algorithm is a convex loss function for policy evaluation that serves as a theoretically sound alternative to the widely used squared Bellman error.
arXiv Detail & Related papers (2020-10-21T17:14:31Z)
Rectified Meta-Learning from Noisy Labels for Robust Image-based Plant Disease Diagnosis [64.82680813427054]
Plant diseases are one of the main threats to food security and crop production. One popular approach is to transform this problem into a leaf image classification task, which can be addressed by powerful convolutional neural networks (CNNs). We propose a novel framework that incorporates a rectified meta-learning module into the common CNN paradigm to train a noise-robust deep network without using extra supervision information.
arXiv Detail & Related papers (2020-03-17T09:51:30Z)

This list is automatically generated from the titles and abstracts of the papers on this site.