HREB-CRF: Hierarchical Reduced-bias EMA for Chinese Named Entity Recognition
- URL: http://arxiv.org/abs/2503.01217v1
- Date: Mon, 03 Mar 2025 06:31:52 GMT
- Title: HREB-CRF: Hierarchical Reduced-bias EMA for Chinese Named Entity Recognition
- Authors: Sijin Sun, Ming Deng, Xinrui Yu, Liangbin Zhao,
- Abstract summary: This paper proposes HREB-CRF framework: Hierarchical Reduced-bias EMA with CRF.<n>The proposed method amplifies word boundaries and pools long text gradients through exponentially fixed-bias weighted average of local and global hierarchical attention.
- Score: 3.562368079040469
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Incorrect boundary division, complex semantic representation, and differences in pronunciation and meaning often lead to errors in Chinese Named Entity Recognition(CNER). To address these issues, this paper proposes HREB-CRF framework: Hierarchical Reduced-bias EMA with CRF. The proposed method amplifies word boundaries and pools long text gradients through exponentially fixed-bias weighted average of local and global hierarchical attention. Experimental results on the MSRA, Resume, and Weibo datasets show excellent in F1, outperforming the baseline model by 1.1\%, 1.6\%, and 9.8\%. The significant improvement in F1 shows evidences of strong effectiveness and robustness of approach in CNER tasks.
Related papers
- Where Relevance Emerges: A Layer-Wise Study of Internal Attention for Zero-Shot Re-Ranking [40.652380579951206]
In-Context Re-ranking (ICR) has recently been proposed as an $O(1)$ alternative method.<n>ICR extracts internal attention signals directly, avoiding the overhead of text generation.<n>No unified study has compared internal attention with traditional generative and likelihood-based mechanisms.
arXiv Detail & Related papers (2026-02-26T03:51:31Z) - ConsistentRFT: Reducing Visual Hallucinations in Flow-based Reinforcement Fine-Tuning [85.20505958752928]
Reinforcement Fine-Tuning (RFT) on flow-based models is crucial for preference alignment.<n>RFT often introduce visual hallucinations like over-optimized details and semantic misalignment.<n>This work preliminarily explores why visual hallucinations arise and how to reduce them.
arXiv Detail & Related papers (2026-02-03T11:49:46Z) - Retrieval-Augmented Self-Taught Reasoning Model with Adaptive Chain-of-Thought for ASR Named Entity Correction [12.483998165719981]
We propose a retrieval-augmented generation framework for correcting named entity errors in automatic speech recognition (ASR)<n>Our approach consists of two key components: (1) a rephrasing language model (RLM) for named entity recognition, followed by candidate retrieval using a phonetic-level edit distance; and (2) a novel self-taught reasoning model with adaptive chain-of-thought (A-STAR) that dynamically adjusts the depth of its reasoning based on task difficulty.
arXiv Detail & Related papers (2026-01-21T15:05:39Z) - ETR: Outcome-Guided Elastic Trust Regions for Policy Optimization [6.716883192613149]
We propose textbfElastic textbfTrust textbfETR, a dynamic mechanism that aligns optimization constraints with signal quality.<n>ETR consistently outperforms GRPO, achieving superior accuracy while effectively mitigating policy entropy degradation.
arXiv Detail & Related papers (2026-01-07T09:19:53Z) - FedLoDrop: Federated LoRA with Dropout for Generalized LLM Fine-tuning [65.26899091946417]
Fine-tuning large language models (LLMs) is crucial for adapting general-purpose models to specific tasks.<n>This paper proposes Federated LoRA with Dropout (FedLoDrop), a new framework that applies dropout to the rows and columns of the trainable matrix in Federated LoRA.
arXiv Detail & Related papers (2025-10-14T02:40:45Z) - RAG-Zeval: Towards Robust and Interpretable Evaluation on RAG Responses through End-to-End Rule-Guided Reasoning [64.46921169261852]
RAG-Zeval is a novel end-to-end framework that formulates faithfulness and correctness evaluation as a rule-guided reasoning task.<n>Our approach trains evaluators with reinforcement learning, facilitating compact models to generate comprehensive and sound assessments.<n>Experiments demonstrate RAG-Zeval's superior performance, achieving the strongest correlation with human judgments.
arXiv Detail & Related papers (2025-05-28T14:55:33Z) - DeCAF: Decentralized Consensus-And-Factorization for Low-Rank Adaptation of Foundation Models [22.45637113673959]
Low-Rank Adaptation (LoRA) has emerged as one of the most effective, computationally tractable fine-tuning approaches for training Vision-Language Models (VLMs) and Large Language Models (LLMs)<n>This work improves the convergence rate of decentralized LoRA to match the rate of decentralized gradient SGD by ensuring smoothness.<n>We also introduce DeCAF, a novel algorithm integrating DLoRA with truncated singular value decomposition (TSVD)-based matrix factorization to resolve consensus interference.
arXiv Detail & Related papers (2025-05-27T16:10:53Z) - On the Effect of Negative Gradient in Group Relative Deep Reinforcement Optimization [52.76330545825083]
Reinforcement learning (RL) has become popular in enhancing the reasoning capabilities of large language models (LLMs)<n>We identify a previously unrecognized phenomenon we term Lazy Likelihood Displacement (LLD), wherein the likelihood of correct responses marginally increases or even decreases during training.<n>We develop a method called NTHR, which downweights penalties on tokens contributing to the LLD. Unlike prior DPO-based approaches, NTHR takes advantage of GRPO's group-based structure, using correct responses as anchors to identify influential tokens.
arXiv Detail & Related papers (2025-05-24T18:58:51Z) - RAGAT-Mind: A Multi-Granular Modeling Approach for Rumor Detection Based on MindSpore [0.0]
RAGAT-Mind is a multi-granular modeling approach for Chinese rumor detection built upon the MindSpore deep learning framework.
The model integrates TextCNN for local semantic extraction, bidirectional GRU for sequential context learning, Multi-Head Self-Attention for global dependency focusing, and Bidirectional Graph Convolutional Networks (BiGCN) for structural representation of word co-occurrence graphs.
arXiv Detail & Related papers (2025-04-24T14:03:53Z) - myNER: Contextualized Burmese Named Entity Recognition with Bidirectional LSTM and fastText Embeddings via Joint Training with POS Tagging [0.0]
We introduce myNER, a novel word-level NER corpus featuring a 7-tag annotation scheme.
We also conduct a comprehensive evaluation of NER models, including Conditional Random Fields (CRF), Bidirectional LSTM (BiLSTM)-CRF, and their combinations with fastText embeddings.
Experiments reveal the effectiveness of contextualized word embeddings and the impact of joint training with POS tagging.
arXiv Detail & Related papers (2025-04-05T03:13:33Z) - Efficient Redundancy Reduction for Open-Vocabulary Semantic Segmentation [36.46163240168576]
Open-vocabulary semantic segmentation (OVSS) is an open-world task that aims to assign each pixel within an image to a specific class defined by arbitrary text descriptions.<n>Recent advancements in large-scale vision-language models have demonstrated their open-vocabulary understanding capabilities.<n>This study introduces ERR-Seg, a novel framework that effectively reduces redundancy to balance accuracy and efficiency.
arXiv Detail & Related papers (2025-01-29T13:24:53Z) - Chasing Fairness in Graphs: A GNN Architecture Perspective [73.43111851492593]
We propose textsfFair textsfMessage textsfPassing (FMP) designed within a unified optimization framework for graph neural networks (GNNs)
In FMP, the aggregation is first adopted to utilize neighbors' information and then the bias mitigation step explicitly pushes demographic group node presentation centers together.
Experiments on node classification tasks demonstrate that the proposed FMP outperforms several baselines in terms of fairness and accuracy on three real-world datasets.
arXiv Detail & Related papers (2023-12-19T18:00:15Z) - Incorporating Class-based Language Model for Named Entity Recognition in Factorized Neural Transducer [50.572974726351504]
We propose C-FNT, a novel E2E model that incorporates class-based LMs into FNT.
In C-FNT, the LM score of named entities can be associated with the name class instead of its surface form.
The experimental results show that our proposed C-FNT significantly reduces error in named entities without hurting performance in general word recognition.
arXiv Detail & Related papers (2023-09-14T12:14:49Z) - Mutual Reinforcement Effects in Japanese Sentence Classification and
Named Entity Recognition Tasks [0.0]
We develop a Sentence-to-Label Generation (SLG) framework for Sentence Classification (SC) and Named Entity Recognition (NER)
Using a format converter, we unify input formats and employ a generative model to generate SC-labels, NER-labels, and associated text segments.
Results show SC accuracy increased by 1.13 points and NER by 1.06 points in SCNM compared to standalone tasks, with CM raising format accuracy from 63.61 to 100.
arXiv Detail & Related papers (2023-07-18T14:30:36Z) - Magic ELF: Image Deraining Meets Association Learning and Transformer [63.761812092934576]
This paper aims to unify CNN and Transformer to take advantage of their learning merits for image deraining.
A novel multi-input attention module (MAM) is proposed to associate rain removal and background recovery.
Our proposed method (dubbed as ELF) outperforms the state-of-the-art approach (MPRNet) by 0.25 dB on average.
arXiv Detail & Related papers (2022-07-21T12:50:54Z) - Optimizing Two-way Partial AUC with an End-to-end Framework [154.47590401735323]
Area Under the ROC Curve (AUC) is a crucial metric for machine learning.
Recent work shows that the TPAUC is essentially inconsistent with the existing Partial AUC metrics.
We present the first trial in this paper to optimize this new metric.
arXiv Detail & Related papers (2022-06-23T12:21:30Z) - Universal adversarial perturbation for remote sensing images [41.54094422831997]
This paper proposes a novel method combining an encoder-decoder network with an attention mechanism to verify that UAP makes the RSI classification model error classification.
The experimental results show that the UAP can make the RSI misclassify, and the attack success rate (ASR) of our proposed method on the RSI data set is as high as 97.35%.
arXiv Detail & Related papers (2022-02-22T06:43:28Z) - Open-Set Recognition: A Good Closed-Set Classifier is All You Need [146.6814176602689]
We show that the ability of a classifier to make the 'none-of-above' decision is highly correlated with its accuracy on the closed-set classes.
We use this correlation to boost the performance of the cross-entropy OSR 'baseline' by improving its closed-set accuracy.
We also construct new benchmarks which better respect the task of detecting semantic novelty.
arXiv Detail & Related papers (2021-10-12T17:58:59Z) - Boosting the Generalization Capability in Cross-Domain Few-shot Learning
via Noise-enhanced Supervised Autoencoder [23.860842627883187]
We teach the model to capture broader variations of the feature distributions with a novel noise-enhanced supervised autoencoder (NSAE)
NSAE trains the model by jointly reconstructing inputs and predicting the labels of inputs as well as their reconstructed pairs.
We also take advantage of NSAE structure and propose a two-step fine-tuning procedure that achieves better adaption and improves classification performance in the target domain.
arXiv Detail & Related papers (2021-08-11T04:45:56Z) - Random Partitioning Forest for Point-Wise and Collective Anomaly
Detection -- Application to Intrusion Detection [9.74672460306765]
DiFF-RF is an ensemble approach composed of random partitioning binary trees to detect anomalies.
Our experiments show that DiFF-RF almost systematically outperforms the isolation forest (IF) algorithm.
Our experience shows that DiFF-RF can work well in the presence of small-scale learning data.
arXiv Detail & Related papers (2020-06-29T10:44:08Z) - Joint Contextual Modeling for ASR Correction and Language Understanding [60.230013453699975]
We propose multi-task neural approaches to perform contextual language correction on ASR outputs jointly with language understanding (LU)
We show that the error rates of off the shelf ASR and following LU systems can be reduced significantly by 14% relative with joint models trained using small amounts of in-domain data.
arXiv Detail & Related papers (2020-01-28T22:09:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.